Table of Contents
- Requirements
- Notes & Disclaimers
- Instructions
- Special note for IIS users
- Introduction
- “Include” List Method
- “Exclude” List Method
- “File and Directory Check” Method
- Explanation
- Got it working
- Final Notes
- Removing Template Group and index.php from URL
- .htaccess
- Note on robots.txt
- Control Panel Setup
- Caveats
- Added code to take care of EE CSS under some server setups
- Note on SSL
Requirements
To complete this tutorial you will need:
* A host/server using Apache or IIS for the webserver
* Access to your .htaccess (Apache) file or httpd.conf (IIS) file in your root EE directory
* Mod_rewrite enabled on your server (Apache on Windows or Linux)
* ISAPI_Rewrite enabled on your server (IIS on Windows only!)
Notes & Disclaimers
——-
Alternative: Renaming your index.php file
As an alternative to this tutorial, you might also want to look into an elegant
and fairly easy option: renaming (or copying) your index.php file to something you like,
such as “gowild” in this example: http://www.example.com/gowild/weblog
See: http://expressionengine.com/docs/installation/renaming_index.html
——
If using IIS, you can accomplish this using ISAPI_Rewrite; see this thread. There is also a free ISAPI Rewrite module (Microsoft Permissive License) and instructions for using it in this thread.
If using Apache (on either Windows or Linux), you will need to have the mod_rewrite module enabled and active.
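Whether mod_rewrite is actually loaded is not always obvious on shared hosting. A minimal sketch, assuming Apache and using the file and directory check rules described later in this tutorial: wrapping the rules in an IfModule block makes Apache skip them when the module is missing, rather than failing on an unknown directive. The wrapper is an addition for illustration and was not part of the code posted on the forums.

```apache
# Only process the rewrite rules when mod_rewrite is loaded.
# Without this wrapper, a missing module typically produces a 500 error,
# because RewriteEngine is then an unknown directive.
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ /index.php/$1 [L]
</IfModule>
```

The trade-off is that a missing module then fails silently: URLs without index.php simply return 404. It may be easier to leave the wrapper off while troubleshooting.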
The code contained in this section was posted on the EE forums. We claim no credit for it and no responsibility for what happens if you choose to use it. Mod_rewrite can be a tricky thing, so if you run into problems, search the forums; otherwise you’re on your own! :)
Instructions
Special note for IIS users
All of these methods have been tested and work on both Apache (mod_rewrite) and IIS (ISAPI_Rewrite). On IIS the rules go in a file named httpd.conf instead of .htaccess; it still lives in the root directory of your EE install. IIS users: wherever these instructions say .htaccess, read httpd.conf. For Linux users there is no change in the directions. If the httpd.conf method does not work for you, contact your hosting provider and ask what your configuration file should be named.
Introduction
Here’s a simple explanation of what we’re going to do:
To remove “index.php” from your URLs we need to tell the server to parse all files as though they did have “index.php” in the URL, but just not show it to the user. We also need to tell the server to either treat *all* requests this way except those in specified EE Template Groups (exclude method) OR to *only* parse files within certain directories (EE template groups) this way (include method). (A third method, which removes index.php if the file or directory called does not exist, may have ramifications for how your site will be ranked by search engines.)
All three of these methods were developed by the users on the ExpressionEngine forums.
“Include” List Method
RewriteEngine on
RewriteCond $1 ^(weblog|member|search|Forum_Name|TemplateGroup_4_Name|TemplateGroup_5_Name|P[0-9]{2,8}) [NC]
RewriteRule ^(.*)$ /index.php/$1 [L]
With this method you want to include all of the names of your EE Template Groups on the second line. Replace “TemplateGroup_#_Name” with your template groups and “Forum_Name” with the name you’re using for forums (e.g. forums, community, etc).
Don’t include any of your “real” directories on the server. In the future, if you add any new Template Groups then you’ll also need to update your .htaccess file to reflect the new Groups.
The “P[0-9]{2,8}” code makes sure that pagination links get processed by EE, while the “[NC]” makes the RewriteCond case insensitive.
Note: you’ll need to remove index.php from Admin > System Preferences > General Configuration > URL to the root directory of your site, if it appears there.
Note: you’ll need to remove the index.php from Admin > System Preferences > General Configuration > Name of your site’s index page.
If you have files outside EE that share a name with one of your template groups, you may have trouble. For example, say you have slideshow.swf in your root directory and have added ‘slideshow’ to the second line of the code above. The path to slideshow.swf will then be rewritten too, so the request is sent to EE instead of to the file. Solution: keep the names different, or add a condition to the rewrite rules that skips files that really exist.
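One way to add such a check, as a sketch only: put a file-existence test in front of the include list. The group name “slideshow” is the hypothetical one from the example above; substitute your own list.

```apache
RewriteEngine On
# Skip the rewrite when the request maps to a real file on disk
# (e.g. slideshow.swf), even if its name starts with a template group name.
RewriteCond %{REQUEST_FILENAME} !-f
# "slideshow" stands in for one of your own template group names.
RewriteCond $1 ^(weblog|member|search|slideshow|P[0-9]{2,8}) [NC]
RewriteRule ^(.*)$ /index.php/$1 [L]
```

Both conditions must be true for the rule to fire, so an existing slideshow.swf is served directly while /slideshow/ and anything below it still goes to EE.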
If you are driving your page structure off templates within one primary template group, you may need to include reference to them in the 2nd line of code too:
^(weblog|member|search|Forum_Name|TemplateGroup_4_Name|TemplateGroup_5_Name|Template1InPrimaryTemplateGroup|Template2InPrimaryTemplateGroup|P[0-9]{2,8}) [NC]
Also, your ‘search’ may not work if you still have an index.html page in your site, which may be the case if you are still testing and not yet public.
Generate .htaccess for “Include” List Method Manually with the weblog:entries tag
In a template, place the following, all by itself. Copy and paste carefully, making sure that you do not have any linebreaks in the second line (RewriteCond all the way to the <br >), or you will have whitespace in your rule that will cause it to break.
All this does is output the code for you to copy and paste into your .htaccess, so put it into any template that is completely blank, view the template, and copy the result into your .htaccess. This uses the {site_url} variable to rewrite to full URLs, but you can remove that and use a full path beginning with a ‘/’ like the example above if you desire. Webmasters wishing to use the “manual” include list method must have a paid license, as it requires the Query module, which Core users do not have.
Note: This method will not generate options for EE pages.
RewriteEngine On<br />
RewriteCond $1 ^(member|{exp:query sql="SELECT group_name FROM exp_template_groups"}{group_name}|{/exp:query}P[0-9]{2,8}) [NC]<br />
RewriteRule ^(.*)$ /index.php/$1 [L]
Generate .htaccess for “Include” List Method Automatically using the LG .htaccess Generator extension.
An alternative and automated way of generating the template groups and pages is LG .htaccess Generator.
LG .htaccess Generator is a Multi-Site Manager compatible extension that automatically generates and updates your site’s .htaccess file every time a weblog entry, template group or template is created or modified.
Using special {ee:} tags, LG .htaccess Generator allows you to easily remove your site’s index.php file using the “Include List Method”. As a result, the “Include List Method” becomes the only method without drawbacks.
Retain index.html during build.
During the build you may want http://domain.com/ to show a static index.html holding page while the EE URLs on the live site stay publicly accessible. If you’re using the include method, you’ll need a way to put index.html back, as otherwise visitors will be redirected to the masked index.php page. Here’s code to do that. Put it after RewriteEngine on, but before the include rewrites:
# During build - redirect root to static index page
RewriteCond %{REQUEST_URI} ^/$
RewriteRule $ http://www.domain.com/index.html [R=302,L]
“Exclude” List Method
RewriteEngine on
RewriteCond $1 !^(images|system|themes|favicon\.ico|robots\.txt|index\.php) [NC]
RewriteRule ^(.*)$ /index.php/$1 [L]
This method is naturally the opposite of the “include” method. On the second line you want to include the names of all of your real directories on the server as well as the “index.php” file and any other non-EE files you want to make available. You do not include any of your Template Group names here. At the very least you should have your “images”, “themes”, and “system” directories (or whatever you renamed “system” to) listed. If you use a favicon.ico file it may be necessary to include that as well. If you have other directories at that same level then include them. In the future, if you add any new directories then you’ll also need to update your .htaccess file to reflect the new directories.
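As an illustration of extending the list, here is a sketch for a site with a few more real directories. The names “css”, “js” and “downloads” are hypothetical, and “admin_area” stands in for a renamed “system” directory.

```apache
RewriteEngine On
# "css", "js" and "downloads" are hypothetical extra directories;
# "admin_area" stands in for a renamed "system" directory.
RewriteCond $1 !^(images|admin_area|themes|css|js|downloads|favicon\.ico|robots\.txt|index\.php) [NC]
RewriteRule ^(.*)$ /index.php/$1 [L]
```

Because the pattern is only anchored at the start of the path, a short entry such as “js” also excludes any Template Group whose name begins with those letters, so check your group names against the list.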
“File and Directory Check” Method
RewriteEngine On
RewriteCond $1 !\.(gif|jpe?g|png)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [L]
With this method the server checks whether the file or directory you are requesting exists. If it doesn’t, the request is parsed as an EE-compatible URI (e.g. http://www.mydomain.com/index.php/template_group/template_name/) without index.php. This method allows extensions and externally included files (such as JavaScript files) to load correctly, whereas the older “include” and “exclude” methods would not.
In certain server environments, the last line of this method may need to include a question mark after index.php in order to force a URL query:
RewriteRule ^(.*)$ /index.php?/$1 [L]
For users with index.php located in a directory other than the root directory (e.g. http://www.pets.com/subdir1/index.php), it may be necessary to rewrite the final line of code, removing the first ‘/’ mark:
RewriteRule ^(.*)$ index.php/$1 [L]
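If the relative form alone does not resolve correctly on your server, a RewriteBase directive may help. This is a sketch that assumes the .htaccess file sits in the same subdirectory as index.php, using the subdir1 location from the example above:

```apache
RewriteEngine On
# RewriteBase states which URL path this .htaccess file serves, so the
# relative substitution below resolves to /subdir1/index.php/...
RewriteBase /subdir1/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]
```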
The one drawback to this method is that image (or script or style sheet) tags that reference missing files will end up calling an EE page, not a 404 page.
The
RewriteCond $1 !\.(gif|jpe?g|png)$ [NC]
line attempts to mitigate this by excluding missing image URLs from being sent to EE.
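The same idea can be stretched to other static file types. The following is a sketch in which the css, js, ico and swf extensions are additions chosen for illustration; only use it if none of your EE templates are requested under URLs ending in those extensions.

```apache
RewriteEngine On
# Extended list of static file types (css, js, ico and swf are additions).
# Requests ending in these extensions are never handed to EE, so a
# missing file gets the server's normal 404 response.
RewriteCond $1 !\.(gif|jpe?g|png|css|js|ico|swf)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [L]
```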
Important note for SEO: This method will serve all ExpressionEngine pages with a code 200, which means that Error Code 404 - Not Found will never be delivered, even if EE is set to use it. The other methods may be more SEO friendly, though fiddlier to work with. This information is derived from this thread.
Explanation
In each of these ways, you need only to separate your desired directories with the pipe character (looks like this | and usually is keyed Shift + \ ).
There are two differences between the methods:
# The Exclude list has the exclamation mark.
# In addition to the directories/Template Group names, you must put index.php in the exclude list, lest bad things happen. :)
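To make those two differences concrete, here is an annotated sketch of both variants with shortened lists. They are alternatives, so the second is left commented out.

```apache
RewriteEngine On

# Option A, include list: rewrite ONLY paths that start with a listed name.
# $1 is whatever the (.*) in the RewriteRule below captured.
RewriteCond $1 ^(weblog|member|search) [NC]
RewriteRule ^(.*)$ /index.php/$1 [L]

# Option B, exclude list (use INSTEAD of option A, not alongside it):
# the leading "!" negates the match, so everything is rewritten except
# paths starting with a listed name. index.php has to be in the list,
# otherwise the already rewritten URL /index.php/... matches again and
# the server typically ends up in a rewrite loop.
# RewriteCond $1 !^(images|system|themes|index\.php) [NC]
# RewriteRule ^(.*)$ /index.php/$1 [L]
```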
Pick the method most appropriate for your needs, put it in your .htaccess file in your root public html directory (or wherever your EE main site index.php file is; not the one in the “system” directory), and away you go! I use the exclude method because I have far more EE template groups than I do non-EE directories (e.g. css, images, javascript directories), so it’s simpler to exclude them. Remember to exclude your system directory too!
Got it working
Try accessing a url on your site without index.php and see if it works. It’s best to try accessing an older entry, because if it’s not working EE will just list all entries with the newest first, which can be confusing. Once you are confident it is working, you need to complete this step so internal URLs are generated without index.php in them (be sure to see the notes below too).
Simply go to Control Panel Home › Admin › System Preferences › General Configuration, and delete the ‘Name of your site’s index page’ value. Leave it completely blank. This will ensure EE-generated URLs don’t contain index.php.
Final Notes
There is one caveat to this method, with a simple workaround. As discussed in this thread, pagination links can place “index.php/” in an inappropriate place in the actual pagination links, effectively breaking them. Thankfully, Lodewijk created a simple Find & Replace plugin to remove “index.php/” from the url (thanks Lodewijk! :). It’s currently available at his site, and is also available on the EE plugins site.
To use Lodewijk’s plugin as a workaround, simply download and install it, and add…
In my case:
{exp:replace find="/index.php"}{pagination_links}{/exp:replace}
In others, this may be appropriate:
{exp:replace find="index.php/"}{auto_path}{/exp:replace}
... in your pagination links, which solves the problem nicely.
Removing Template Group and index.php from URL
In the case that you use one main Template Group for a site or sub-domain you may want to remove the Template Group and index.php from the URL in order to further simplify the URLs. The setup is very similar but there are some extra steps.
Because this has complex ramifications, removing the Template Group is strongly advised against.
.htaccess
First, your .htaccess file is going to get more complicated:
RewriteEngine On
#Handle comment redirection
RewriteCond %{THE_REQUEST} !^POST
RewriteRule ^template_group/?(.*)$ /$1 [R=301,L]
#Handle removal of index.php and template group from EE URLs
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/template_group/$1 [L]
Here is what the additions to the .htaccess file are doing line by line.
RewriteEngine on
Turns on the RewriteEngine for apache.
RewriteCond %{THE_REQUEST} !^POST
After a comment is submitted, EE redirects the user to http://www.yourdomain.com/template_group/comment_template/entry_url. Since we are removing the Template Group, that URL will not work for us. This RewriteCond looks at the HTTP request line to see what type of request it is. If it is a POST request (such as the one sent when a comment is submitted), the RewriteRule that follows is skipped, allowing the comment to make it to the database, and processing continues with the later rules. If it isn’t a POST request, the RewriteRule goes into action, ensuring the user gets redirected to the right page.
RewriteRule ^template_group/?(.*)$ /$1 [R=301,L]
This line redirects the user to the correct page minus the template group.
The rest of the .htaccess file is using the file and directory check method which works just like in the normal setup except that you have the template group in the rule as well as the index.php file.
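THE_REQUEST holds the whole request line (for example “POST /template_group/... HTTP/1.1”), which is why the pattern above is anchored with ^POST. An equivalent way to express the same test, shown here only as a sketch, is to compare the request method directly:

```apache
RewriteEngine On
# Same idea as the THE_REQUEST test, written against the request method.
# "!=POST" is a negated exact string comparison, not a regular expression.
RewriteCond %{REQUEST_METHOD} !=POST
RewriteRule ^template_group/?(.*)$ /$1 [R=301,L]
```

“template_group” is the same placeholder used in the rules above; replace it with the name of your own Template Group.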
I recently had someone try to do all of this in a subdirectory instead of in the base directory. This confused things with the .htaccess setup a bit. Below is the .htaccess code that ended up solving the issue:
RewriteEngine On
#Handle comment redirection
RewriteCond %{THE_REQUEST} !^POST
RewriteRule ^template_group/?(.*)$ /sub_directory/$1 [R=301,L]
#Handle removal of index.php and template group from EE URLs
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /sub_directory/index.php/template_group/$1 [L]
It is basically all the same except that the sub directory is included in the rules.
Note on robots.txt
You might reflect the changes on your URL structure in your robots.txt file, which is a simple text file at the root of your domain that controls the behavior of search engine robots like Googlebot.
User-Agent: *
Disallow: /index.php/
This code tells search bots to ignore web pages in the ‘directory’ /index.php/. This isn’t a directory at all, but it does contain a duplicate of every page on your website! It is best to avoid duplicate content.
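An alternative to hiding the duplicates is to redirect them. The sketch below is not from the original forum code; it assumes the plain /index.php/ form of the rules and would go directly after RewriteEngine On, before the rules that remove index.php.

```apache
RewriteEngine On
# Match on the original request line, so that the internal rewrite to
# /index.php/... done by the later rules does not trigger this redirect
# and cause a loop.
RewriteCond %{THE_REQUEST} ^GET\s/index\.php/ [NC]
RewriteRule ^index\.php/(.*)$ /$1 [R=301,L]
```

Only GET requests are redirected, so form submissions that still post to an index.php URL are left alone. Note that a crawler which obeys the Disallow line above never requests these URLs and so never sees the redirect; treat the two approaches as alternatives.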
Control Panel Setup
Just like when removing only index.php from the URL, you will need to go to Control Panel Home › Admin › System Preferences › General Configuration and delete the ‘Name of your site’s index page’ value, leaving it completely blank.
Then go to Control Panel Home › Admin › Weblog Administration › Weblog Management and choose the Preferences link for the Weblog that uses the Template Group you are removing. Under Path Settings, remove the Template Group from all of the URLs except the search one.
Caveats
Just as with the previous changes, there are some issues. One was already taken care of by the comment redirection rule (the first RewriteCond and RewriteRule) in the .htaccess file. The other is that any place in your template where you use a tag that includes “path=template_group/template” you will have the template group in the given URL. In most cases this is easily solved by using one of the tags that use the Path Settings you removed the template_group from (such as the {comment_url_title_auto_path} tag). In other cases you will be able to use “path=template” instead of “path=template_group/template”. You may find a few cases where for some reason or other this does not work. In those cases you can remove the template group using Lodewijk’s plugin.
The final issue is again the Pagination links. This time the Template Group is placed in the URL as well. To remove it use Lodewijk’s plugin like this:
{exp:replace find="template_group/index.php/"}{pagination_links}{/exp:replace}
In the Archive pages (or any other page that does pagination on something besides the main page) the problem is compounded. Not only do index.php and the Template Group get placed in the URL, they are put in spots that don’t make any sense (the pagination link URLs will look something like this: http://www.yourdomain.com/index.php/template/template_group/). Because of the way they are placed, you need to use a combination of Lodewijk’s plugin and the Replace String plugin by Sacred Smile. Put index.php into the Replace String plugin’s array and use Lodewijk’s plugin to remove the Template Group. This looked something like this in the template:
{exp:replace find="template_group/"}{exp:replacestring}{pagination_links}{/exp:replacestring}{/exp:replace}
The latest version of Lodewijk’s Find and Replace plugin supports multiple replacing, so the above result can also be achieved like this:
{exp:replace find="index.php/|template_group/" multiple="yes"}{pagination_links}{/exp:replace}
There is a potential performance hit from all of the added .htaccess rules as well as the added plugin usage. In my case the performance hit was unnoticeable. Your mileage may vary.
Added code to take care of EE CSS under some server setups
There are host site ‘fixups’ going on (this one in early May 2008) where Apache or PHP config changes remove server variables that EE would like to depend upon. The latest has removed PATH_INFO completely, which has two effects:
- you need to slightly change how the index.php file removal is done
- you need to make the EE-built CSS links operate again, which come in form:
http://your.web.site/?css=templatename/cssname
as you can see by looking at the emitted page source.
Here then is complete example code for an .htaccess:
RewriteEngine On
RewriteCond %{QUERY_STRING} ^(css=.*)$ [NC]
RewriteRule ^(.*)$ /index.php?/%1 [L]
RewriteCond $1 !^(images|yoursystem|themes|favicon\.ico|robots\.txt|index\.php) [NC]
RewriteRule ^(.*) /index.php?/$1 [L]
- first line turns on the Apache rewrite engine
- second line recognizes the css queries
- third line patches /index.php?/ onto the original css query
- fourth line is the matcher that recognizes folders and filenames which should _not_ be changed. Be sure to list your own system and other folder/filenames.
- fifth line changes any others to be /index.php?/ plus the original path. Note the additional ? placed after index.php in the rule.
The [NC] flags say not to consider letter case, while the [L] flags mark a rule as the last one to execute when it matches.
I found it necessary to have the QUERY_STRING case put first, before the lines for index.php removal, and think that is correct logically.
Note on SSL
Post on applying to https
