Robots.txt and meta robots tags

Robots.txt and meta robots tags are used by webmasters and search engine optimisation agencies to give instructions to crawlers traversing and indexing a website. They tell the search spider what to do with a specific web page: for example, not to crawl the page at all, or to crawl the page but not include it in Google’s index. It is often a good idea to use them in conjunction with nofollow tags.

What is robots.txt?

Robots.txt is the text file that implements the Robots Exclusion Protocol. Webmasters create it to instruct robots, or ‘crawlers’, how to crawl and index pages on their website. Search engine optimisation agencies that use robots.txt effectively can tell crawlers what to visit on a website, which gives you control over how your site is searched.

The directives used to instruct crawlers, whether in robots.txt or in meta robots tags, include:

Noindex : This permits the crawling of the page, but not the indexing. It also tells the search engines that, if the page is currently in the index, it should be removed.

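Disallow : Tells crawlers not to crawl the page. Note that a disallowed page can still appear in the index if other pages link to it, so use Noindex when the page must stay out of the index.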

Nofollow : Tells the search engines not to follow links on the page. This is extremely important to search engine optimisation, so we have gone into greater detail at Nofollow tags. The reverse of this directive is ‘follow’.

Nocache : Tells the search engines not to save a copy of the web page in their cache.

You can view your site’s robots.txt file by going to: www.yourdomain.com/robots.txt .
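To see how a crawler might interpret a Disallow rule, the following is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `/private/` path, and the `ExampleBot` user agent are all hypothetical and chosen only for illustration; real crawlers may apply additional rules of their own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /private/ path is invented for this sketch.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory rather than fetching it from a site."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a made-up crawler name; the "*" rule applies to every crawler.
blocked = parser.can_fetch("ExampleBot", "https://www.yourdomain.com/private/report.html")
allowed = parser.can_fetch("ExampleBot", "https://www.yourdomain.com/blog/post.html")

print(blocked)  # False: the path falls under the Disallow rule
print(allowed)  # True: no rule matches this path
```

The same parser can read a live file by calling `set_url()` with the robots.txt address followed by `read()`, although that requires network access.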

What are meta robots tags?

Meta robots tags are used in addition to the robots.txt file to focus on individual pages rather than the website as a whole. They allow you to control the behaviour of search bots at the page level, with a directive placed in the header of the page. This gives users what Google refers to as ‘fine grain control’ over a website.
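As a rough illustration of page-level control, the sketch below uses Python's standard `html.parser` module to read the directives from a meta robots tag, much as a crawler might before deciding whether to index a page. The sample page, the `MetaRobotsParser` class, and the `robots_directives` helper are hypothetical names made up for this example.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values can be None when the attribute has no value.
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.append(part.strip().lower())


def robots_directives(html: str) -> list:
    """Return the meta robots directives declared in an HTML document."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page whose header asks crawlers not to index it or follow its links.
PAGE = """<html><head>
<title>Example page</title>
<meta name="robots" content="noindex, nofollow">
</head><body><p>Hello</p></body></html>"""

print(robots_directives(PAGE))  # ['noindex', 'nofollow']
```

A page with no meta robots tag yields an empty list, which crawlers generally treat as permission to index the page and follow its links.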

In the code it would look like this: