I’ve noticed that my site’s Google results contain many completely invalid URLs. For example:
/weblog/Ben Johnson/All the abortion lies fit to print/Virtual Ayn Rand to run for U.S. presidency/Hollywood.com/P340
This URL contains no valid template name, and there is no entry with a url title of “Ben Johnson”. I have no idea how this happens, but that’s not my topic today! I just want to return a 404 for these invalid URLs so they don’t get indexed.
I have tried putting this at the top of the index page for the weblog template group:
{exp:channel:entries channel="mychannel" limit="1" require_entry="yes"}
{if no_results}
{redirect="404"}
{/if}
{/exp:channel:entries}
This works great, except for the slight problem that it also returns a 404 for pagination (site.com/[weblog/]P20) and archive (site.com/[weblog/]2011/08) URLs.
Perhaps I can resolve the issue with archives by using a different template instead of index.html (though I’d hate to break all my URLs, since the site’s been up for almost 7 years). But I don’t see how to allow pagination to work.
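To make the rule I’m after concrete, here is a rough sketch in Python (not EE template code, just the logic). It assumes the URL shapes from my examples above: a pagination segment is “P” followed by digits, and an archive is a four-digit year followed by a two-digit month. I’m also assuming an archive page may carry its own pagination segment. The function name is_listing_url and the group parameter are made up for illustration. Anything this returns False for would still go through the require_entry check.

```python
import re

# Assumed URL shapes, taken from the examples in this post:
#   pagination segment: "P" followed by digits, e.g. P20
#   archive segments:   four-digit year, then two-digit month, e.g. 2011/08
PAGINATION = re.compile(r"^P\d+$")
YEAR = re.compile(r"^\d{4}$")
MONTH = re.compile(r"^(0[1-9]|1[0-2])$")


def is_listing_url(path, group="weblog"):
    """Return True for the index, a pagination page, or a monthly archive."""
    segments = [s for s in path.split("/") if s]
    # The template group name is optional in these URLs.
    if segments and segments[0] == group:
        segments = segments[1:]
    if not segments:
        return True  # plain index page
    if len(segments) == 1 and PAGINATION.match(segments[0]):
        return True  # e.g. /weblog/P20
    if len(segments) >= 2 and YEAR.match(segments[0]) and MONTH.match(segments[1]):
        rest = segments[2:]
        # Assumption: an archive may be followed by one pagination segment.
        return not rest or (len(rest) == 1 and bool(PAGINATION.match(rest[0])))
    return False
```

So /weblog/P20 and /2011/08 would skip the 404, while the junk URL above would not, even though it happens to end in P340.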
Can anyone point me in the right direction?
Thanks,
Mark