Mod_Rewrite Canonicalization Avoids Duplicate Content

Apache’s mod_rewrite has been the bane of many. I think the real culprit is mostly the regular expressions it relies on, but you get the idea.

I want you to take a look at the following URLs:

domain.com
domain.com/
www.domain.com/
www.domain.com
domain.com/index.html
www.domain.com/index.html

Do they all look the same to you? Well, they are not all the same to big G (Google). In fact, big G may just decide that you have duplicate content because it can see the same page on domain.com, www.domain.com, and domain.com/index.html.

Now, I don’t want to get into the dub-dub-dub vs. non-dub-dub-dub argument (see The Great Dub-Dub-Dub Debate by Jeff Atwood).

But what I will tell you is to pick one, and stick to it.

I found a great set of examples and want to pass them along. I have listed the basic gist of everything below in case the link ever goes down, but it is also worth reading the rest of the info on the page.

You can find the examples here:

Options +FollowSymLinks
RewriteEngine on
RewriteBase /
 
### re-direct index.html to root / ###
RewriteCond %{THE_REQUEST} ^.*\/index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ /$1 [R=301,L]
 
### re-direct index.htm to root / ###
RewriteCond %{THE_REQUEST} ^.*\/index\.htm\ HTTP/
RewriteRule ^(.*)index\.htm$ /$1 [R=301,L]
 
### re-direct index.php to root / ###
RewriteCond %{THE_REQUEST} ^.*\/index\.php\ HTTP/
RewriteRule ^(.*)index\.php$ /$1 [R=301,L]
 
### re-direct default.html to root / ###
RewriteCond %{THE_REQUEST} ^.*\/default\.html\ HTTP/
RewriteRule ^(.*)default\.html$ /$1 [R=301,L]
 
### re-direct home.html to root / ###
RewriteCond %{THE_REQUEST} ^.*\/home\.html\ HTTP/
RewriteRule ^(.*)home\.html$ /$1 [R=301,L]
 
### re-direct IP address, non-www, and any parked domain to www of main domain
### (one rule covers all three: any host that is not www.example.com is redirected)
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
####
 
############################ from here ########################################
 
###### This group will not be needed by most websites - try without it first
### add a missing trailing slash to end of domain name or folder name
### ONLY in case the server does not compensate for it
### ONLY use it if needed - otherwise remove this block of directives.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^(.*)$ http://www.example.com/$1/ [L,R=301]
 
############################# to here ##########################################
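
The host rule above sends everything to the dub-dub-dub version. If you picked the non-dub-dub-dub version instead, the same rule can be flipped around. Here is a minimal sketch, assuming your main domain is example.com (a placeholder, just like in the rules above) and that the rules live in an .htaccess file. It is my own variation, not something taken from the linked page, so try it on a test site first:

```apache
RewriteEngine on

### sketch: re-direct www, IP address, and any parked domain to the non-www main domain
### example.com is a placeholder - swap in your own domain
RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]
```

Use this in place of the www rule, not alongside it. With both in the same file, each one redirects to the host the other rejects, and the browser ends up in a redirect loop.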
