Author: Steven Neiland

Crawl-delay

Several major crawlers support a Crawl-delay parameter, which sets how long a spider must wait after retrieving a page before it can request another page from the same site. This is a useful protective measure to implement: it reduces the likelihood of a spider acting like a denial-of-service attack by preventing it from overloading your site with a large number of requests in quick succession.

The Crawl-delay directive accepts a value representing the number of seconds to wait between successive requests to the same server:

#wait ten seconds between page requests
User-agent: *
Crawl-delay: 10
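To see the directive from the crawler's side, here is a minimal sketch of a polite crawler honouring Crawl-delay, using Python's standard-library robots.txt parser. The robots.txt content is inlined for illustration; a real crawler would fetch it from the target site.

```python
import time
from urllib.robotparser import RobotFileParser

# The rules above, inlined for illustration.
robots_txt = """\
User-agent: *
Crawl-delay: 10
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# crawl_delay() returns the number of seconds, or None if no delay is set.
delay = rp.crawl_delay("*") or 0
print(delay)  # → 10

# A polite crawler sleeps for that long between page requests:
# for url in urls_to_crawl:
#     fetch(url)
#     time.sleep(delay)
```

A well-behaved spider applies the sleep after every request, which caps it at one page every ten seconds for this site.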

Sitemap

Another important function of the robots.txt file, particularly for SEO, is the Sitemap directive, which tells a web spider where the site map(s) are stored. For example:

Sitemap: http://www.mysite.com/sitemaps/profiles-sitemap.xml
Sitemap: http://www.mysite.com/sitemaps/blog-sitemap.xml
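Crawlers can pick these URLs up programmatically. As a sketch, Python's standard-library parser (Python 3.8+) exposes them via `site_maps()`; the robots.txt content is inlined here for illustration.

```python
from urllib.robotparser import RobotFileParser

# The Sitemap directives above, inlined for illustration.
robots_txt = """\
Sitemap: http://www.mysite.com/sitemaps/profiles-sitemap.xml
Sitemap: http://www.mysite.com/sitemaps/blog-sitemap.xml
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# site_maps() returns the list of sitemap URLs, or None if none are declared.
for sitemap_url in rp.site_maps() or []:
    print(sitemap_url)
```

Sitemap lines apply site-wide regardless of which User-agent block they appear in, which is why the parser collects them independently of any agent rules.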
