Thanks to Lisa’s previous help this evening, I’ve been working on making my site more SEO-friendly. Currently, I’m trying to set up my robots.txt file to eliminate links to redundant content (archives, etc.).
I suppose it’s not the end of the world, but I’d really like to prevent Googlebot from indexing the P5, P10, etc. links since it will get the same content from the single entry pages anyhow.
One thought would be to set up my robots.txt as follows:
User-Agent: *
Disallow: /dnd/archives/
Disallow: /search/
Disallow: /P5/
Disallow: /P10/
Disallow: /P15/

As I have hundreds of posts, this would be a long robots.txt that I’d have to update manually.
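One shortcut might be wildcard patterns. As I understand it, Googlebot honors `*` in robots.txt rules (a Google extension, not part of the original robots.txt standard, so other crawlers may ignore it). A single rule could then cover every pagination segment, assuming they all take the /P<number> form, though it would also catch any legitimate path segment that happens to start with “P”:

User-Agent: Googlebot
Disallow: /*/P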
Another idea would be to wrap an if statement around the URL segments and add a meta tag with the no-index directive to the affected pages. I think this might work:
{if segment_1 != "dnd"}<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">{/if}

With the above code in the HTML header, I think it would catch every URL (for that template) that didn’t have the template group (dnd) specified, which should be just the pagination URLs (P5, P10, etc.).
EDIT: That way wouldn’t work, as it would also prevent Google from indexing the site’s home page. Doh!
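A possible fix (a sketch, not tested) would be to invert the test: only emit the meta tag when a pagination segment is actually present, rather than when the template group is absent. Assuming the pagination marker lands in segment_3 for this template group (adjust the segment number to match your URL structure), something like:

{if segment_3 != ""}<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW">{/if}

Using NOINDEX, FOLLOW instead of NOINDEX, NOFOLLOW might also be worth considering, so Google can still follow links from the pagination pages through to the single-entry pages without indexing the duplicates.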
Is there another way?
BTW - I just love EE. :D