I’m trying to figure out a way to keep both users and search engines happy with regard to multilingual sites. In particular, I’m looking for a way to get “clean” landing page URLs and still have all the content indexed. This isn’t really a CodeIgniter issue, but more of a general URL structure issue.
My first multilingual CI project will be a rewrite of my own homepage (see signature below). I know it’s a bit over-ambitious to build a multilingual personal website, but I have my eyes on another project that will need similar functionality in the near future, so I consider this a testbed.
Currently, I evaluate HTTP_ACCEPT_LANGUAGE on initial session creation and store the result in a session variable. I also provide a link on every page that lets the user change the default selection (which I persist using a client-side cookie).
- http://www.toomuchdata.com/blog.php (language depends on HTTP_ACCEPT_LANGUAGE)
- http://www.toomuchdata.com/blog.php?lang=en (content in English, and future page views using same session will show English content if available)
- http://www.toomuchdata.com/blog.php?lang=sv (content in Swedish, and future page views using same session will show Swedish content if available)
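To make the current behaviour concrete, here is a minimal sketch of the selection logic, written in Python rather than PHP only to keep it framework-neutral. The function names, the supported-language list, the English fallback, and the priority order (explicit `?lang=` first, then the cookie, then the header) are my assumptions for illustration, not the code the site actually runs.

```python
SUPPORTED = ("en", "sv")   # assumption: the two languages the site offers
DEFAULT = "en"             # assumption: fallback when nothing matches


def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without a q-value counts as q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by original header order.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(query_lang, cookie_lang, accept_language):
    """Pick a language: ?lang= parameter, then cookie, then the header."""
    for explicit in (query_lang, cookie_lang):
        if explicit in SUPPORTED:
            return explicit
    for tag in parse_accept_language(accept_language or ""):
        primary = tag.split("-")[0]  # "sv-SE" is treated as "sv"
        if primary in SUPPORTED:
            return primary
    return DEFAULT
```

For example, `choose_language(None, None, "sv-SE,sv;q=0.9,en;q=0.8")` picks Swedish, while passing `"en"` as the first argument overrides both the cookie and the header.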
One big problem with this scheme is that each content page can be reached through two different URLs, with and without the lang parameter (and duplicate content is bad for SEO).
I would like to preserve the automatic language selection based on HTTP_ACCEPT_LANGUAGE in my rewritten CodeIgniter solution, but at the same time prevent duplicate content so that spiders are happy. (And also keep the URLs as clean and tidy as possible.)
I’d like to avoid always encoding the language in the URL like this:
- http://www.example.com/en/blog/
- http://www.example.com/sv/blog/
Instead I want the blog landing page to be:
- http://www.example.com/blog/
I can then continue doing what I’m doing today with HTTP_ACCEPT_LANGUAGE and also let users select their preferred language. But how do I make the content in both languages available to spiders in a way they can understand and index properly?
Does anyone have any suggestions or ideas on how to solve this tricky problem?