Yesterday I came across SEO-Hacker’s nice article outlining the advantages and dangers of relying on robots.txt for controlling the crawlers.
That set me off looking for ways to optimize this blog's template design. After all, who isn’t in need of more readers? And if a simple Blogspot template redesign can attract more visitors, why not?
:-)
There are two simple rules. These restrictions keep bots from flagging the site’s content as duplicate and penalizing it:
- Do not allow crawlers like Googlebot to index the search results.
- Do not allow crawlers to index the archive pages.
Sure enough, going into Settings -> Search Preferences and clicking Edit on the last option, Custom robots header tags, opened up the options.
To be careful and not mess up the crawlers too badly, I decided to do exactly what I came in for: prevent crawlers like Googlebot from indexing the ‘Archive and Search’ pages. Blogger’s settings have exactly that option, alongside the ones for ‘HomePage’ and ‘Defaults for Posts and Pages’.
- Check ‘all’ in both the home and posts/pages sections. That is, allow the crawlers to index and follow everything on these pages.
- Check *only* the ‘noindex’ checkbox in the archive & search pages section.
- Check the ‘noodp’ checkbox in all three sections.
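The three choices above can be sketched as a small lookup table. This is only an illustration: the section keys and helper names are hypothetical, and the meta tag is just one way to picture the directives (Blogger may deliver them differently, for example as an HTTP header).

```python
# Hypothetical mapping of the three Blogger sections to the boxes ticked above.
SETTINGS = {
    "homepage": ["all", "noodp"],
    "archive_and_search": ["noindex", "noodp"],
    "posts_and_pages": ["all", "noodp"],
}


def robots_content(directives):
    """Join directives into the comma-separated value a robots tag carries."""
    return ", ".join(directives)


def as_meta_tag(directives):
    """Render the directives as an equivalent robots meta tag (illustrative only)."""
    return f'<meta name="robots" content="{robots_content(directives)}"/>'


for section, directives in SETTINGS.items():
    print(f"{section}: {as_meta_tag(directives)}")
```

Only the archive and search section carries ‘noindex’; the other two stay fully open to crawlers.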
It might be a good idea to switch on ‘nofollow’ by default in all three sections, but my understanding still isn’t clear, and I would welcome any suggestions.
From what I understand, the ‘nofollow’ tag tells crawlers not to follow any links on the pages that carry it. If I don’t use nofollow, then whatever PageRank a page has will be distributed among the pages it links to.
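To make that reading concrete, here is a minimal sketch of how a crawler might interpret a robots directive string. It assumes the commonly documented meanings (‘none’ is shorthand for noindex plus nofollow, and the absence of a restriction means index and follow). The function name is made up for illustration, and it says nothing about how PageRank itself is computed.

```python
def interpret_robots(content):
    """Return (may_index, may_follow) for a robots directive string.

    Assumption: 'none' means noindex + nofollow, and anything not
    explicitly forbidden is allowed. Unknown tokens such as 'noodp'
    do not affect either answer.
    """
    tokens = {t.strip().lower() for t in content.split(",") if t.strip()}
    may_index = not (tokens & {"noindex", "none"})
    may_follow = not (tokens & {"nofollow", "none"})
    return may_index, may_follow


for value in ["all, noodp", "noindex, noodp", "noindex, nofollow"]:
    print(value, "->", interpret_robots(value))
```

Under this reading, ‘noindex’ on its own keeps a page out of the index but still lets the crawler follow the links on it.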
OTOH, I do want the other ‘important’ pages of my blog to accumulate PR. For example, this blog’s homepage PR is 3. But AFAIK, none of the individual post pages have any PR at all.
Another confusing issue is the Google Authorship tags. If I use ‘nofollow’ tags as default for all pages, will the Google Authorship tagging be affected? Google Authorship only works when the content is ‘linked’ to my Google+ profile (and vice-versa).
:-P
Update: As of today, Googlebot had last crawled this blog on Mar 16, before (or just on?) Panda update #25 and before this tweak. Let me see what effect this tweak has on Googlebot.
BTW, looking through the blog template, I found a meta tag inserted some time back to take care of the noindex issue on ‘archive’ pages. Note that it said ‘follow’ and not ‘nofollow’. I have deleted the meta tag now, of course!
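A quick way to spot such leftover tags is to scan the template’s HTML for robots meta tags. The sketch below uses Python’s standard `html.parser`. The sample markup is hypothetical and is not the tag that was actually in this template. A real Blogger template also contains its own XML widgets, so treat this as a rough check only.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the content value of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<meta .../>) are routed here as well.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            self.found.append(attr.get("content") or "")


def find_robots_meta(html):
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.found


# Hypothetical sample, standing in for a hand-inserted tag.
SAMPLE = """<html><head>
<title>Example archive page</title>
<meta content='noindex,follow' name='robots'/>
</head><body></body></html>"""

print(find_robots_meta(SAMPLE))
```

An empty list would suggest there are no hand-written robots meta tags left to conflict with the Blogger settings.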
Your method is good, but I think my method is easier to implement. Check it out here: http://chillofyblogging.blogspot.in/2013/06/how-to-setup-custom-robotstxt-in.html
Hi @rathod. You have a nice blog and a nice point. Unfortunately, it won't work for a custom domain hosted on Blogger. Here is my post on that: http://www.madmadrasi.net/2013/06/sitemap-xml-in-blogger-for-better-seo.html