Websites use a file called /robots.txt to tell web robots what they can and can't crawl.


If a robot wants to visit a page on a website, for example http://www.my-website.com/helloworld.html, it must first check http://www.my-website.com/robots.txt.
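A robots.txt file is plain text made up of User-agent and Disallow directives. A minimal hypothetical example (the /private/ path is just an illustration) that allows all robots everywhere except one directory might look like this:

```
User-agent: *
Disallow: /private/
```

An empty Disallow line would instead mean nothing is off limits.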


The following site has a very good guide on how to put together a robots.txt file:

http://www.robotstxt.org/robotstxt.html
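In practice you rarely need to parse robots.txt by hand. As a sketch, Python's standard library ships urllib.robotparser for exactly this check; the URL and rules below are hypothetical, and the rules are fed in directly rather than fetched over the network:

```python
import urllib.robotparser

# Parser for a hypothetical site's robots.txt.
rp = urllib.robotparser.RobotFileParser()
rp.set_url("http://www.my-website.com/robots.txt")

# Normally rp.read() would fetch the file; here we parse example
# rules locally so the snippet runs without network access.
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Ask whether a given robot may fetch a given URL.
print(rp.can_fetch("*", "http://www.my-website.com/helloworld.html"))   # True
print(rp.can_fetch("*", "http://www.my-website.com/private/page.html")) # False
```

A well-behaved robot runs a check like this before every fetch and skips any URL the rules disallow.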