This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Multiple URLs for the Same Article Issue

November 17, 2014 1:05pm

  • #1 / Nov 17, 2014 1:05pm

    repulsion

    2 posts

    Hi folks. I'm hoping not only that someone here can help, but also that I can explain our problem as succinctly and clearly as possible without sounding too much like a noob. The programmer who worked with us on our ExpressionEngine site is no longer with us, and while he said this is an unsolvable situation, given his history of too-easily throwing his hands in the air, I’m looking for a second opinion.

    Okay, so, the blog entries on our site have the following URL structure (or rather, they post with the following URL structure):

    http://www.nameofoursite.com/blog/2014/17/this-is-the-title-of-our-blog-entry

    In our .htaccess file we have the following:

    RewriteRule ^blog/[0-9]{4}/[0-9]{2}/(.*)$ blog/article/$1 [PT,L]

    According to our old programmer, the reason for that RewriteRule is: “EE doesn’t support the date formatting of the URL. The line basically just takes the year and date and strips it out of the URL and then sends the correct URL to EE.”

    Problem is that Google is crawling (or at least Google Webmaster Tools is picking up on) up to seven different URLs for every article. For example:

    /blog/2014/15/this-is-the-title-of-our-blog-entry
    /blog/2014/04/this-is-the-title-of-our-blog-entry
    /blog/2014/07/this-is-the-title-of-our-blog-entry
    /blog/2014/04/this-is-the-title-of-our-blog-entry
    /blog/2014/03/this-is-the-title-of-our-blog-entry

    We obviously want to prevent that and have ONE URL for each blog entry on the site. But apparently that’s not possible? Or so says our ex-programmer: “The way EE works: the URL structure is /blog/article/<title>. This URL is the *actual* URL that is always seen by EE. The URL structure /blog/<year>/<day>/<title> is re-written by .htaccess mod_rewrite ‘behind the scenes’, so a person visiting the website *sees* /blog/<year>/<day>/<title> but EE *sees* /blog/article/<title>. Because EE knows nothing of the /<year>/<day>/ part of the URL, there is no way to stop /<any_year>/<any_day>/<title> from returning the article associated with <title>.”
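To make the quoted explanation concrete, here is a minimal Python sketch of what the rewrite does to incoming paths. It is only a simulation of the pattern from the RewriteRule above, not Apache itself; the helper name `internal_path` and the third sample URL are made up for illustration.

```python
import re

# Same pattern as the RewriteRule in the post. In .htaccess context the
# leading slash is stripped before matching, so the pattern starts at "blog/".
PATTERN = re.compile(r"^blog/[0-9]{4}/[0-9]{2}/(.*)$")


def internal_path(request_path: str) -> str:
    """Hypothetical helper: the path EE would see after the rewrite."""
    relative = request_path.lstrip("/")
    match = PATTERN.match(relative)
    if match is None:
        # The rule does not apply, so the path is passed through unchanged.
        return request_path
    # The year and day segments are discarded; only the title survives.
    return "/blog/article/" + match.group(1)


variants = [
    "/blog/2014/17/this-is-the-title-of-our-blog-entry",
    "/blog/2014/04/this-is-the-title-of-our-blog-entry",
    "/blog/1999/99/this-is-the-title-of-our-blog-entry",  # invented example
]

# Every dated variant collapses to the same internal path.
targets = {internal_path(v) for v in variants}
print(targets)
```

Because the two numeric segments are thrown away before EE sees the request, any four-digit/two-digit combination serves the same entry, which is why a crawler can end up with several URLs for one article.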

    So, in short…is he right, or is there actually a way around this? Either via .htaccess or some kind of plugin?

    Our ex-programmer seems to think a 301 redirect can work here, that “the /<any_year>/<any_day>/ URLs will still work BUT when Google crawls them it will only see /blog/article/<title> (because this is the *real* URL)”…but I was hoping for something a little more straightforward.

    Any thoughts/leads would be greatly appreciated. Thanks!

  • #2 / Nov 17, 2014 3:25pm

    repulsion

    2 posts

    Well, folks, after some sleuthing, I think we found a workaround. Not ideal, but it works for us.

    Adding the following line to our .htaccess…

    RewriteRule ^blog/[0-9]{4}/[0-9]{2}/(.*)$ /blog/article/$1 [R=301,L]

    ...ensures that ALL numbered URLs will redirect to the /blog/article/ format and, fingers crossed, Google will only crawl the one URL.
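As a rough illustration of why the redirect helps, the Python sketch below mimics a crawler following the 301. It assumes the redirect rule is the one that actually fires for dated URLs (that is, it replaces or comes before the earlier [PT,L] rule); `respond` and `final_url` are hypothetical names, and this is a simulation rather than a test against a real server.

```python
import re

# Pattern copied from the RewriteRule in the post (leading slash stripped,
# as in .htaccess context).
DATED = re.compile(r"^blog/[0-9]{4}/[0-9]{2}/(.*)$")


def respond(request_path: str) -> tuple[int, str]:
    """Hypothetical stand-in for the server: returns (status, location)."""
    match = DATED.match(request_path.lstrip("/"))
    if match:
        # R=301: tell the client to go to the undated URL instead.
        return 301, "/blog/article/" + match.group(1)
    return 200, request_path


def final_url(request_path: str, max_hops: int = 5) -> str:
    """Follow redirects the way a crawler would and return the landing URL."""
    for _ in range(max_hops):
        status, location = respond(request_path)
        if status != 301:
            return location
        request_path = location
    raise RuntimeError("too many redirects")


crawled = [
    "/blog/2014/15/this-is-the-title-of-our-blog-entry",
    "/blog/2014/04/this-is-the-title-of-our-blog-entry",
    "/blog/2014/07/this-is-the-title-of-our-blog-entry",
]
landing_pages = {final_url(path) for path in crawled}
print(landing_pages)
```

The difference from the earlier rule is that the visitor's address bar changes: the dated URLs still resolve, but each one answers with a permanent redirect to the single /blog/article/ URL.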
