Avoiding the Duplicate Content Filter on your Drupal Site

Jocelyn's picture

One SEO consideration that is easy to overlook is having the same content accessible through multiple URLs on your site. This is called duplicate content, and it can cause Google or other search engines to flag your site or drop its pages from their results.

If you're using Drupal with Clean URLs and Pathauto, consider the following:

http://www.example.com/blog/2010-03-10
http://www.example.com/node/256

Each of the above URLs can be accessed with or without the www prefix and with or without a trailing slash, which means there are now eight possible URLs for one page, and that could significantly affect your search engine rankings.
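To make the count concrete, here is a small Python sketch that lists every combination. The host names and paths are simply the example values from above, and the `url_variants` helper is a hypothetical name used only for illustration.

```python
from itertools import product

# Example values taken from the two URLs shown above.
HOSTS = ["example.com", "www.example.com"]
PATHS = ["/blog/2010-03-10", "/node/256"]
SUFFIXES = ["", "/"]  # without and with a trailing slash


def url_variants():
    """Return every host / path / trailing-slash combination for one page."""
    return [
        f"http://{host}{path}{suffix}"
        for host, path, suffix in product(HOSTS, PATHS, SUFFIXES)
    ]


if __name__ == "__main__":
    variants = url_variants()
    for url in variants:
        print(url)
    # 2 hosts x 2 paths x 2 suffixes = 8 addresses for the same content
    print(len(variants))
```

Each printed address serves the same page, which is why the sections below settle on a single form.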

It's a matter of perception, specifically Google's perception of whether the intent behind the duplicate content is malicious. Since there's no way of knowing for certain how your site will be judged, it's best to take a few proactive measures.

With or Without www

Starting at line 83 of your Drupal .htaccess file (the exact line may vary between Drupal versions) are comments about how to set up your site with or without the www prefix. You don't strictly need the www in your URL; it's a matter of personal preference, and your choice may depend on the nature of your site.

With www

RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

Without www

RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]

Remove Trailing Slashes

Also in your .htaccess file, you can tell your site to remove trailing slashes.

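# Remove Trailing Slashes
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
# Skip real directories, which Apache itself redirects to the slash form
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]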
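If it helps to reason about where a visitor ends up once the www and trailing-slash redirects are both in place, the following Python sketch mimics their combined effect on a URL. It is only an approximation of the rewrite rules, not a replacement for testing them on your server; `canonical_url`, `use_www` and `domain` are hypothetical names chosen for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, use_www=True, domain="example.com"):
    """Approximate the redirects: pick one host form, drop one trailing slash."""
    parts = urlsplit(url)
    netloc = parts.netloc
    host = netloc.lower()  # the [NC] flag makes the host match case-insensitive
    bare = host[4:] if host.startswith("www.") else host
    if bare == domain:
        netloc = f"www.{domain}" if use_www else domain

    path = parts.path
    # ^(.+)/$ needs at least one character before the slash,
    # so the front page "/" is left alone.
    if len(path) > 1 and path.endswith("/"):
        path = path[:-1]

    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(canonical_url("http://example.com/blog/2010-03-10/"))
    print(canonical_url("http://www.example.com/node/256/", use_www=False))
```

Note that this collapses the www and slash variants only. The blog/2010-03-10 and node/256 paths still remain separate addresses for the same page.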

Update robots.txt

Tell search engines to ignore pages like 'node/256' by adding a line to the bottom of the robots.txt file in the root of your Drupal installation.

Disallow: /node/
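One way to sanity-check the rule before relying on it is Python's standard `urllib.robotparser` module. The sketch below assumes the Disallow line sits under a `User-agent: *` group; the `ROBOTS_TXT` string is a cut-down stand-in for your real file, and `is_crawlable` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Cut-down stand-in for the real robots.txt (assumed to use a "*" group).
ROBOTS_TXT = """\
User-agent: *
Disallow: /node/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if the given path may be fetched under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, f"http://www.example.com{path}")


if __name__ == "__main__":
    for path in ("/node/256", "/blog/2010-03-10"):
        print(path, is_crawlable(path))
```

The aliased blog path should stay crawlable while the raw node path is blocked.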

Aggregating Content from other Sites

If your site is built on aggregating news or other data feeds, consider also showing unique content to reduce the risk of being penalized by the duplicate content filter. Unique content could be additional site content, comments, or generally anything that changes regularly and keeps your pages from appearing stale.
