This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

What is best method to exclude template groups and templates from indexing with robots.txt?

December 14, 2011 4:08pm

  • #1 / Dec 14, 2011 4:08pm

    bgarrant

    356 posts

    I have a site that heavily uses the Pages Module. Since we use the Pages Module to customize many URLs, I want to exclude their long-name equivalents, which use the template_group/template_name/entry format, from being indexed.  I have already removed index.php from my URLs using htaccess.

    Does a robots.txt file exclude template_groups in the same way it would exclude a directory?  Since they are dynamically created, I am unsure what the preferred practice is.

    Basically, I want to exclude the template_group page, includes, and any other unneeded folders.  Is this the correct method?  Do the bots treat the template_groups just like a directory for exclusion?  Since the system folder is above my root, are these folders the best ones to exclude from the crawlers?

    # Robots.txt file
    
    User-agent: *
    Disallow: /index.php/
    Disallow: /site/404
    Disallow: /page/
    Disallow: /includes/
    Disallow: /images/avatars
    Disallow: /images/captchas
    Disallow: /images/member_photos
    Disallow: /images/pm_attachments
    Disallow: /images/signature_attachments
    Disallow: /images/smileys
    Disallow: /images/uploads
    Disallow: /themes

    Thanks, Bryan
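
On the question of whether bots treat template groups like directories: robots.txt rules are matched against the URL path as a plain prefix, so a crawler has no way to tell a dynamically generated template_group segment from a real folder. A minimal sketch of how to check this locally, using Python's standard-library `urllib.robotparser` on a shortened copy of the rules above. The helper name `is_allowed`, the agent name `ExampleBot`, and the test paths are made up for illustration; `domainname.com` is the placeholder host used later in this thread. Real crawlers may apply extensions (such as wildcards) that this parser does not.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules posted above (illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php/
Disallow: /site/404
Disallow: /page/
Disallow: /includes/
Disallow: /themes
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the given rules.

    Hypothetical helper: it parses the rules from a string instead of
    fetching them from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://domainname.com" + path)


if __name__ == "__main__":
    for path in (
        "/page/template-name/entry",   # template group URL, matched by "/page/"
        "/pages/template-name/entry",  # different group name, not matched
        "/includes/header",            # embed templates
        "/themes/site.css",            # real folder, matched the same way
        "/news/",                      # no rule applies
    ):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Because matching is by prefix, the rule `/page/` covers a template group named `page` but not one named `pages`, so each Disallow line needs to match the actual group name used in the URLs.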

  • #2 / Dec 15, 2011 10:12am

    bgarrant

    356 posts

    What standard items does everyone disallow?  Trying to get a good robots.txt structure to use on many sites.

  • #3 / Dec 16, 2011 12:41pm

    bgarrant

    356 posts

    Can anyone shed some light on this?

  • #4 / Dec 16, 2011 5:42pm

    e-man

    1816 posts

    Not sure how much of an issue this is, but if you want to hide your include templates you can make them hidden templates:
    http://ellislab.com/expressionengine/user-guide/templates/hidden_templates.html

  • #5 / Dec 16, 2011 6:34pm

    bgarrant

    356 posts

    I did that e-man. What do you include in your robots.txt file for exclusion? Just curious.

  • #6 / Dec 16, 2011 6:38pm

    e-man

    1816 posts

    I’m going to be honest here and say I never bother with it.

  • #7 / Dec 16, 2011 6:53pm

    bgarrant

    356 posts

    I am also trying to get some recommendations on template setup. I typically have the following:

    Site - contains main index, stylesheet, 404, sitemap
    Pages - all other static type page templates
    News - index page for multiple entries and detail template
    Includes - any embeds

    The problem is the URL is so long: http://domainname.com/pages/template-name/entry

    I am trying to find a way to shorten URLs without using the Pages Module, as it is too hard for clients to grasp. Any suggestions? Is there a better way to lay out template groups? What type of setup do you typically use?

  • #8 / Dec 16, 2011 7:01pm

    e-man

    1816 posts

    I never use the Pages module, but isn’t it designed to completely bypass the template_group/template paradigm?

  • #9 / Dec 16, 2011 7:04pm

    bgarrant

    356 posts

    Yes it does.  It works great, but clients have to manually type in a URL. I wish I could somehow copy the url_title to the pages URI automatically. That would be great since the url_title is auto created.

  • #11 / Dec 16, 2011 7:08pm

    e-man

    1816 posts

    Ah, I see now. Try http://devot-ee.com/add-ons/pages-uri or http://devot-ee.com/add-ons/pages-autocomplete

    Other good pages related add-ons:
    http://devot-ee.com/add-ons/better-pages

  • #12 / Dec 16, 2011 7:12pm

    bgarrant

    356 posts

    Thanks e-man. Better pages rocks. Used it before. I will check out the others.
