Sunday, March 9, 2008

Robots.txt

Robots.txt is a file kept in the main (root) folder of a site, e.g. http://www.yoursite.com/robots.txt
The file is used to block or unblock the access of search engines and other robots to any file or folder. You can find
more info about it at http://www.robotstxt.org
Still, for the sake of users, I will describe it in more detail:
Use any text editor, such as Notepad, and save the file as robots.txt. Then write into it according to the rules,
e.g.

User-agent: *
Allow: /
This allows all user agents to access all files and folders.
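User-agent: * means the rule applies to all robots (all user agents). Instead of * you can use the name of a specific robot's user agent, like Googlebot. Browser names such as Opera, MSIE or Firefox are not useful here, because browsers do not read robots.txt.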

The next step is to choose whether to allow or disallow that user agent's access. Our example above
allows all files.
Here is an example that does not allow access to the search folder:
User-agent: *
Disallow: /search
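
To check how a robot would read these rules, here is a small sketch using Python's standard urllib.robotparser module. It reuses the search-folder rules from the example; the robot name "ExampleBot", the helper is_allowed and the page index.html are made up for illustration only.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the example: block /search for every robot.
RULES = """\
User-agent: *
Disallow: /search
"""

def is_allowed(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)

# "ExampleBot" is a hypothetical robot name used only for this sketch.
blocked = is_allowed(RULES, "ExampleBot", "http://www.yoursite.com/search")
open_page = is_allowed(RULES, "ExampleBot", "http://www.yoursite.com/index.html")
print(blocked, open_page)  # prints: False True
```

Rules are matched by path prefix, so anything under /search is blocked as well. Remember that this only tells you what a well-behaved robot should do; the file itself cannot force a robot to obey.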

Keeping this file is not compulsory, but keep it anyway for the sake of some bots (robots).


The next part of this post is about blocking a link by another method: when you have a
link on your site, you can tell a bot or robot not to follow it,
e.g. <a href="http://www.google.com/webmaster/" rel="nofollow">Test link</a>

The rel="nofollow" attribute tells bots not to follow the link. This way you can steer bots away from a link too.
Using this attribute doesn't mean that a bot can't access the link. Bots can still access it,
but they won't cache it if they choose not to.
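
As a rough sketch of how a polite bot might honor this attribute, the Python snippet below uses the standard html.parser module to collect links and skip the ones marked nofollow. The names LinkCollector and collect_links are hypothetical, and the second link in PAGE is added only for contrast; real bots may handle this differently.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (href, nofollow) pairs for every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, e.g. "nofollow noopener".
        rel_values = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_values))

def collect_links(html: str):
    """Parse `html` and return the list of (href, nofollow) pairs found."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links

PAGE = (
    '<a href="http://www.google.com/webmaster/" rel="nofollow">Test link</a>'
    '<a href="http://www.robotstxt.org">Robots info</a>'
)
links = collect_links(PAGE)
# Keep only the links a polite bot would go on to follow.
to_follow = [href for href, nofollow in links if not nofollow]
print(to_follow)  # prints: ['http://www.robotstxt.org']
```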

If you have any questions, do post them, or you can just post a thanks so that I can
gauge how interested you are.

Thanks



