Friday, April 16, 2010

Why Does "Restricted by robots.txt" Always Appear for Blogger's Blogspot Blogs?

Please note that Googlebot (Google's crawling robot, also known as a spider) must crawl billions of pages on the web. With the technology and resources Google has, it is quite feasible for Googlebot to carry out this part of Google's mission.

However, billions of pages cannot all be indexed by Google in a short time. Googlebot needs to crawl efficiently and effectively, so it tries not to crawl the same content again under a different URL. For example,

http://yourblogname.blogspot.com/search/label/firstlabel might point to the same content as http://yourblogname.blogspot.com/search/label/secondlabel.

This explanation is actually the answer to a question often asked by Blogspot bloggers: why does "Restricted by robots.txt" always appear for Blogger's Blogspot blogs (in Webmaster Tools, under Diagnostics, in the Crawl errors report)?

Blogger is a blogging platform with the advantage of being indexed by Google automatically. But unlike self-hosted WordPress, a Blogger (Blogspot) blog is hosted by a third party rather than by you. Blogger also has a feature for classifying posts by topic: Labels.

By using labels, a Blogspot blogger can categorize each post under a topic. However, a post that does not fit a single topic will very likely have more than one label. When that happens, two or more different URLs lead to a page showing the same post.

To handle this, Blogger automatically provides a default robots.txt file for the blog that restricts crawling of these label URLs.

As a result, these URLs appear in the Crawl errors report in Google Webmaster Tools. In fact, this is not a problem that Blogspot bloggers need to worry about. The post pages will still be indexed by Google; it is just that Googlebot needs more time to crawl posts with more than one label, because it first identifies which URLs are restricted by the robots.txt file.

Default Restrictions or Robots.txt for Blogger's Blogspot

For example:

Suppose you have a blog with a post titled "How to Play Football". This post will very likely have two labels, namely Playing Football (firstlabel) and Football (secondlabel). The result is two URLs like the ones below:
  1. http://yourblogname.blogspot.com/search/label/firstlabel
  2. http://yourblogname.blogspot.com/search/label/secondlabel
So it is clear that the two URLs above lead to the same post, and Blogger restricts crawling of these label URLs through robots.txt.
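
The effect of such a restriction can be checked with a short script. What follows is only a minimal sketch in Python, assuming the blog's default robots.txt contains a `Disallow: /search` rule for all crawlers. That rule, the sample post URL, and the helper name `is_crawlable` are assumptions made for illustration, so compare them with the real file at http://yourblogname.blogspot.com/robots.txt before relying on the result. Under the assumed rule, every URL under /search is blocked, while ordinary post pages remain crawlable.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt rules (not quoted from the blog itself): block everything
# under /search, which includes the /search/label/... pages, and allow the rest.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /
"""


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the assumed rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ASSUMED_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    urls = [
        "http://yourblogname.blogspot.com/search/label/firstlabel",
        "http://yourblogname.blogspot.com/search/label/secondlabel",
        # Hypothetical post URL, used only as an example of a post page.
        "http://yourblogname.blogspot.com/2010/04/how-to-play-football.html",
    ]
    for url in urls:
        status = "crawlable" if is_crawlable(url) else "restricted by robots.txt"
        print(f"{url}: {status}")
```

To test a real blog instead of the assumed rules, `RobotFileParser` can also load a live file with `set_url()` followed by `read()`.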

19 comments:

  1. thanks dude...ur post really relaxed mee...i have a lot of post on my blog wwequeens.blogspot.com,but 25% of the post url are restricted by robot.txt so m getting worried

  2. Thanks a lot, it makes sense now. Im not worried about my indexing :)

  3. thanks buddy, I was worried what did i do wrong there.

  4. Yes I was pretty worries about it as well but thanks to your article I am feeling relaxed now.

  5. Thanks for Info ! You Relieved me I have 465 Urls Restricted by Robots.txt! Information4all

  6. Thanks dude... I was really worried about this. Now I am relaxed. Thanks again.

  7. Thanks!!! This this article is very helpful. Now, I never need to worry.

  8. Thanks for Info ! You Relieved me I have 1751 Urls Restricted by Robots.txt!

  9. Great and amazing knowledge by I have same problem with my web http://jobzfree.blogspot.com

  10. Thanks dear, I also had the same feelings about my blogs restricted links in
    digicartoon.blogspot.com and
    techdiag.blogspot.com

    Good Luck!

  11. Thanks for sharing this. I had 517 errors and I was really worried.

  12. very helpful, thank you

  13. Excellent pieces. Keep posting such kind of information on your blog. I really impressed by your blog.

  14. thanks for giving information i was very worry but it seems to not be so now i will not thing about it and keep posting my posts in http://www.deluxepages.blogspot.com

  15. nice post my blog is have above 50 urls like that..
    www.cinerak.blogspot.com

  16. thanks mz worrries are gone too :D
