I have a site that makes heavy use of the Pages Module. Since we use the Pages Module to customize many URLs, I want to exclude their long-form equivalents, which use the template_group/template_name/entry format, from being indexed. I have already removed index.php from my URLs using .htaccess.
Does a robots.txt file exclude template_groups in the same way it would exclude a directory? Since they are dynamically created, I am unsure what the preferred practice is.
Basically, I want to exclude the template_group page, includes, and any other unneeded folders. Is this the correct method? Do the bots treat template_groups just like a directory for exclusion? Since the system folder is above my root, are these folders the best ones to exclude from the crawlers?
# Robots.txt file
User-agent: *
Disallow: /index.php/
Disallow: /site/404
Disallow: /page/
Disallow: /includes/
Disallow: /images/avatars
Disallow: /images/captchas
Disallow: /images/member_photos
Disallow: /images/pm_attachments
Disallow: /images/signature_attachments
Disallow: /images/smileys
Disallow: /images/uploads
Disallow: /themes

Thanks, Bryan
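
P.S. For anyone who wants to test this locally, here is a minimal sketch of how a subset of the rules above could be checked. It assumes Python's standard `urllib.robotparser` is a reasonable stand-in for how crawlers match paths (real crawlers may behave differently). The helper name `is_allowed`, the `ExampleBot` agent, the `example.com` host, and the sample paths are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules from the robots.txt above.
RULES = """\
User-agent: *
Disallow: /index.php/
Disallow: /site/404
Disallow: /page/
Disallow: /includes/
Disallow: /themes
"""


def is_allowed(path, rules=RULES, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under `rules`.

    The host is a placeholder; only the URL path is compared to the rules.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    # Hypothetical URLs: a template_group/template_name/entry path,
    # a Pages Module short URL, and two edge cases.
    for path in ["/page/article/my-entry", "/about/team", "/page", "/themes-old/x.css"]:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

With this parser, a `Disallow` line is matched as a plain prefix of the URL path, so it makes no difference whether the path is a real directory or a dynamically generated template_group. The trailing slash does matter: `/page/` leaves the bare `/page` URL crawlable, while `/themes` without a slash would also block something like `/themes-old/`.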