Introduction to the robots.txt file
The robots.txt file is a plain-text file used in website SEO. It contains directives that tell search engine crawlers which pages of the site they may or may not crawl for indexing. A well-behaved crawler therefore begins its visit to a website by looking for the robots.txt file at the root of the site.
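As a rough illustration of "at the root of the site", the sketch below derives the robots.txt address from any page address. The function name `robots_url` and the example.com addresses are assumptions made for this example, not part of any crawler's actual code.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL at the root of the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host: the file always sits at the root path,
    # whatever the path, query string or fragment of the page itself.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only for illustration.
print(robots_url("https://www.example.com/blog/post.html?id=3"))
# prints https://www.example.com/robots.txt
```

A crawler would request this address first and only then decide which of the site's pages to visit.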
robots.txt file format
The robots.txt file (written in lowercase and in the plural) is a plain-text file located at the root of the site. It may contain the following directives:
- User-Agent: specifies the robot to which the rules that follow apply. The value * means "all robots".
- Disallow: specifies the pages to exclude from crawling. Each page or path to exclude must be on its own Disallow line and must begin with /. The value / on its own means "all pages of the site".
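To make the structure of these two directives concrete, here is a minimal sketch of how a program might read them. It is a simplified reader written for this article, not the parser of any real search engine; the function name `parse_robots` and the sample paths are hypothetical.

```python
def parse_robots(text: str) -> dict:
    """Map each User-agent value to the list of paths it may not visit."""
    rules = {}
    current = []       # robots named by the group currently being read
    in_rules = False   # True once the group has at least one Disallow line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue                          # blank or unrecognized line
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()                 # field names ignore case
        if field == "user-agent":
            if in_rules:                      # a new group of rules starts
                current, in_rules = [], False
            current.append(value)
            rules.setdefault(value, [])
        elif field == "disallow" and current:
            in_rules = True
            if value:                         # an empty Disallow excludes nothing
                for agent in current:
                    rules[agent].append(value)
    return rules


# Hypothetical file content: one group that applies to all robots.
sample = """\
User-agent: *
Disallow: /private/
Disallow: /tmp.html
"""
print(parse_robots(sample))
# prints {'*': ['/private/', '/tmp.html']}
```

Each Disallow line contributes exactly one path, which is why several exclusions need several lines.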
Here are some sample robots.txt files:
- Exclusion of all pages:
- Exclusion of no pages (equivalent to having no robots.txt file; all pages are visited):
- Authorization of a single robot:
- Excluding a robot:
- Excluding a page:
- Exclusion of several pages:
- Exclusion of all pages of a directory and its subdirectories:
Disallow: /directory/
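The cases listed above can be checked with Python's standard `urllib.robotparser` module. The sketch below is only an illustration: the robot name ExampleBot, the helper `allowed` and every path are assumptions chosen for the example, and the file contents are plausible reconstructions of each case rather than the article's original samples.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may visit `path` under the given file content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# All names and paths below are hypothetical.
EXCLUDE_ALL = "User-agent: *\nDisallow: /"
EXCLUDE_NONE = "User-agent: *\nDisallow:"
ALLOW_ONE_ROBOT = "User-agent: ExampleBot\nDisallow:\n\nUser-agent: *\nDisallow: /"
EXCLUDE_ONE_ROBOT = "User-agent: ExampleBot\nDisallow: /"
EXCLUDE_PAGES = "User-agent: *\nDisallow: /private.html\nDisallow: /draft.html"
EXCLUDE_DIRECTORY = "User-agent: *\nDisallow: /directory/"

print(allowed(EXCLUDE_ALL, "ExampleBot", "/index.html"))             # False
print(allowed(EXCLUDE_NONE, "ExampleBot", "/index.html"))            # True
print(allowed(ALLOW_ONE_ROBOT, "ExampleBot", "/index.html"))         # True
print(allowed(ALLOW_ONE_ROBOT, "OtherBot", "/index.html"))           # False
print(allowed(EXCLUDE_ONE_ROBOT, "OtherBot", "/index.html"))         # True
print(allowed(EXCLUDE_PAGES, "ExampleBot", "/draft.html"))           # False
print(allowed(EXCLUDE_DIRECTORY, "ExampleBot", "/directory/a/b.html"))  # False
```

Note that a path such as /directory/ excludes everything beneath it because rules are matched as prefixes of the requested path.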
Examples of User Agents for the most popular search engines: