
SEO. Crawl delay and the Bing crawler

 

Crawl delay and the Bing crawler, MSNBot - 2009.10.08

Setting a crawl delay in the robots.txt file

Bing supports the directives of the Robots Exclusion Protocol (REP) as listed in a site’s robots.txt file, which is stored at the root folder of a website. The robots.txt file is the only valid place to set a crawl-delay directive for MSNBot.

The robots.txt file can be configured to employ directives set for specific bots and/or a generic directive for all REP-compliant bots. Bing recommends that any crawl-delay directive be made in the generic directive section for all bots to minimize the chance of code mistakes that can affect how a site is indexed by a particular search engine.

Note that any crawl-delay directives set, like any REP directive, are applicable only on the web server instance hosting the robots.txt file.
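To make the root-location rule concrete, here is a minimal Python sketch (the host and helper name are illustrative, not part of any Bing tooling) showing which robots.txt file governs a given URL:

from urllib.parse import urlsplit, urlunsplit

def robots_txt_url(page_url: str) -> str:
    # The robots.txt that governs a URL lives at the root of that URL's
    # scheme and host; its directives never carry over to other hosts.
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_txt_url("https://www.example.com/blog/post?id=7"))
# https://www.example.com/robots.txt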

How to set the crawl delay parameter

In the robots.txt file, within the generic user agent section, add the crawl-delay directive as shown in the example below:

User-agent: *
Crawl-delay: 1

Note: If you only want to change the crawl rate of MSNBot, you can create another section in your robots.txt file specifically for MSNBot to set this directive. However, specifying directives for individual user agents in addition to the generic set of directives is not recommended. This is a common source of crawling errors, because sections dedicated to specific user agents are often not kept in sync with the generic section. An example of a section for MSNBot would look like this:

User-agent: msnbot
Crawl-delay: 1

The crawl-delay directive accepts only positive whole numbers as values. Treat the value listed after the colon as a relative amount of throttling to apply to MSNBot, compared with its default crawl rate. The higher the value, the more the crawl rate is throttled down.

Bing recommends using the lowest value possible, if you must use any delay, in order to keep the index as fresh as possible with your latest content. We recommend against using any value higher than 10, as that will severely affect the ability of the bot to effectively crawl your site for index freshness.

Think of the crawl delay settings in these terms:

 

Crawl-delay setting     Index refresh speed
No crawl delay set      Normal
1                       Slow
5                       Very slow
10                      Extremely slow
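As an illustration only, here is a short Python sketch (not Bing tooling; the site URL is a placeholder) of how a generic REP-compliant crawler could read this directive with the standard library's urllib.robotparser and pause between requests. Keep in mind that MSNBot treats the value as a relative throttle rather than a literal number of seconds.

import time
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")  # placeholder host
rp.read()  # fetch and parse the live robots.txt

delay = rp.crawl_delay("msnbot") or 0  # crawl_delay() returns None when no Crawl-delay is set

for url in ("https://www.example.com/", "https://www.example.com/about"):
    if rp.can_fetch("msnbot", url):
        # ... request the page here ...
        time.sleep(delay)  # many crawlers interpret the value as seconds between requests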

Feedback

The Bing team is interested in your feedback on how the bot is working for your site and, if you decide a crawl delay is needed, which setting works best for getting your content indexed without an unreasonable impact on your web server traffic. We want to hear from you so we can improve how the bot works in future development.

If you have any questions, comments, feedback, or suggestions about the MSNBot, feel free to post them in our Crawling/Indexing Discussion forum. There’s another SEM 101 post coming soon. Until then…

-- Rick DeJarnette, Bing Webmaster Center

Precedence of Robots Exclusion Protocol directives

We want webmasters to know that bingbot will still honor robots.txt directives written for msnbot, so no change is required to your robots.txt file(s).

Please note, however, that if we detect separate sets of directives for bingbot and for any of the older Microsoft search bots (such as msnbot), or a set of directives for all crawlers, the directives for bingbot take precedence. For example, in the following case, bingbot will be allowed to crawl everything on the host except the folder /folder1/, despite the more comprehensive blocking directives for other crawlers:

User-agent: bingbot
Disallow: /folder1/

User-agent: msnbot
Disallow: /folder1/
Disallow: /folder2/

User-agent: *
Disallow: /
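The sketch below checks this behavior with Python's urllib.robotparser against the example file above. The parser applies its own name-based group matching with a fallback to the "*" group, but for this file the result mirrors the precedence described here; the agent names and paths are illustrative:

import urllib.robotparser

robots_txt = """\
User-agent: bingbot
Disallow: /folder1/

User-agent: msnbot
Disallow: /folder1/
Disallow: /folder2/

User-agent: *
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("bingbot", "/folder2/page.html"))  # True: only /folder1/ is blocked for bingbot
print(rp.can_fetch("msnbot", "/folder2/page.html"))   # False: the msnbot group also blocks /folder2/
print(rp.can_fetch("somebot", "/folder2/page.html"))  # False: falls back to the "*" group (Disallow: /)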

Disallow some parts

Here is an example robots.txt file that blocks crawling of URLs containing certain query-string parameters:

User-agent: *
Disallow: /*?*count=
Disallow: /*?*last=
Disallow: /*?*page=
Disallow: /*?*sortorder=
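The '*' in these rules is a wildcard extension to the original REP that Bing supports: it matches any sequence of characters, while '?' is just a literal character in the path. As a rough illustration, the hypothetical helper below (not a standard API) shows how such a pattern can be checked by translating it to a regular expression:

import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    # Hypothetical helper: '*' matches any character sequence, a trailing
    # '$' anchors the end of the URL; everything else, including '?', is literal.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

rule = robots_pattern_to_regex("/*?*sortorder=")
print(bool(rule.match("/catalog?page=2&sortorder=price")))  # True: blocked by the rule
print(bool(rule.match("/catalog?page=2")))                  # False: not matched, allowed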
 