Basic Drupal SEO: On-site Optimization

This Drupal SEO tutorial was updated and rewritten in May 2008.

Drupal is a great open source (GPL) content management system, and with a few modifications it can be configured for excellent on-site search engine optimization. This tutorial covers only the very basics of on-site optimization: it makes sure that search engines are able to spider your site, and it helps you avoid some common Drupal SEO errors.

Other tutorials will go into more depth.

Summary

  1. Enable clean URLs.
  2. Enable the Path Module, then install and enable the Pathauto, Global Redirect and Token Modules.
  3. Configure the Pathauto Module.
  4. Install and enable the Meta Tags Module.
  5. Install and enable the Page Title Module.
  6. Do NOT install the Drupal Sitemap Module.
  7. Fix .htaccess to redirect either to the "www" subdomain or away from it.
  8. Fix your theme's HTML headers if they aren't right.
  9. Recommended: create a custom front page.
  10. Modify your robots.txt file.

Drupal and Clean URLs

Enable clean URLs

Search engines prefer clean URLs. In Drupal 6, clean URLs should be enabled automatically if your server allows it. In Drupal 5, you can enable them under administer > settings > Clean URLs. Clean URLs are necessary for the pathauto module, mentioned below.
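
To make the difference concrete: without clean URLs, Drupal addresses a page through a `q` query parameter (example.com/?q=node/5), and with clean URLs the same page lives at example.com/node/5. The Python sketch below only illustrates that mapping. The function name `to_clean_url` is made up for this example and is not part of Drupal.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_clean_url(url: str) -> str:
    """Rewrite a Drupal '?q=path' URL to its clean-URL form.

    URLs without a 'q' parameter are returned unchanged.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    q_values = [value for key, value in params if key == "q"]
    if not q_values:
        return url
    # Keep any other query parameters (for example ?page=1) as they were.
    remaining = [(key, value) for key, value in params if key != "q"]
    path = "/" + q_values[0].lstrip("/")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(remaining), parts.fragment)
    )


print(to_clean_url("http://example.com/?q=node/5"))
# -> http://example.com/node/5
print(to_clean_url("http://example.com/index.php?q=taxonomy/term/3&page=1"))
# -> http://example.com/taxonomy/term/3?page=1
```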

Drupal Modules for SEO

Install the pathauto module and enable it

The pathauto module is highly recommended. Pathauto will automatically make nice customized URLs based on things like title, taxonomy, content type, and username. You also have to enable the path module for pathauto to work.

Think carefully about how you want your URLs to look. It takes some experience with Drupal to get the exact URL paths that you might want. The URLs are controlled by a combination of taxonomy and pathauto, and I hope to cover that in another tutorial. You can also use the path module to write custom URLs for each page, but that might become tedious and inconsistent on a large site.

At the very least, enable the path module and install the pathauto module. It will generate nice-looking URLs for you without much configuration.
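
As a rough picture of what an automatically generated alias looks like, here is a minimal Python sketch that turns a content type and a title into a URL path. It is not Pathauto's actual algorithm (Pathauto is a PHP module with configurable patterns and its own cleaning rules). The `slugify` and `make_alias` functions and the `type/title` pattern are assumptions chosen for the example.

```python
import re
import unicodedata


def slugify(text: str, separator: str = "-") -> str:
    """Lowercase the text and reduce it to ASCII letters, digits and separators."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of other characters into a single separator.
    slug = re.sub(r"[^a-z0-9]+", separator, ascii_text.lower())
    return slug.strip(separator)


def make_alias(content_type: str, title: str) -> str:
    """Build a hypothetical 'type/title' alias from a content type and a title."""
    return f"{slugify(content_type)}/{slugify(title)}"


print(make_alias("Articles", "Basic Drupal SEO: On-site Optimization"))
# -> articles/basic-drupal-seo-on-site-optimization
```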

Caution: The above advice is directed towards new Drupal sites. If you have an existing Drupal site, be very careful that the pathauto module does not rename your existing URLs. It is generally a very bad idea to change existing URLs because the search engines will no longer be able to find those pages at the addresses they have indexed.

Here are some pathauto settings to watch out for:

For the update action, choose "Do nothing. Leave the old alias intact." Otherwise the URL of a node will change every time you change the title of the post, causing problems with search engines:

[Screenshot: Drupal Pathauto update action settings]

A more comprehensive Pathauto tutorial is coming soon, but for basic SEO at least make sure that the above setting is correct.
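
Why the "Do nothing" update action matters can be shown with a toy model. The Python sketch below is an illustration only, not Pathauto code. The `alias_for` function and the in-memory `aliases` table are hypothetical stand-ins for Drupal's stored URL aliases.

```python
from typing import Dict


def alias_for(node_id: int, title: str, aliases: Dict[int, str]) -> str:
    """Return the node's alias, creating one only if none exists yet."""
    if node_id in aliases:
        # "Do nothing": the existing alias wins, even if the title changed.
        return aliases[node_id]
    alias = "-".join(title.lower().split())
    aliases[node_id] = alias
    return alias


aliases: Dict[int, str] = {}
print(alias_for(5, "My Page", aliases))          # first save creates "my-page"
print(alias_for(5, "My Renamed Page", aliases))  # later edits still give "my-page"
```

With any other update action, the second call would hand search engines a new URL for a page they had already indexed under the old one.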

Install the Global Redirect Module

The Global Redirect Module will automatically do 301 redirects to your URL aliases. So if you have a node at example.com/node/5, the Global Redirect Module will redirect that URL to your alias at example.com/my-page.
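
The redirect decision described above can be sketched in a few lines of Python. This is a simplified model and not the module's code. The `redirect_for` function and the `aliases` table are hypothetical.

```python
from typing import Dict, Optional, Tuple


def redirect_for(path: str, aliases: Dict[str, str]) -> Optional[Tuple[int, str]]:
    """Return (301, alias path) when a system path has an alias, else None."""
    normalized = path.strip("/")
    alias = aliases.get(normalized)
    if alias is None or alias == normalized:
        return None
    return 301, "/" + alias


aliases = {"node/5": "my-page"}  # hypothetical alias table
print(redirect_for("/node/5", aliases))   # -> (301, '/my-page')
print(redirect_for("/my-page", aliases))  # -> None
```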

Install the Meta Tags (Nodewords) Module

The Meta Tags Module (formerly called "Nodewords Module") can be highly beneficial to your site. There is a myth in some search engine optimization circles that says, "meta tags are not important". This is not true.

Meta tags are not meant to be used for keyword stuffing. Don't use them for that purpose because it isn't going to help you. The really important meta tag is the meta description.

The meta description should be different on every page for best results. The meta description should be one or two brief sentences to summarize the page. It should be written for your human visitors, but it is not a bad idea to tastefully and sparingly insert a couple of your keywords. Often when a search engine lists your site in the search engine results pages, it will use your page's HTML title for the title, and your meta description for the text snippet. That is why the meta description should be written with human visitors in mind. You want a text snippet that is going to make them want to click on the link.

Here is one textbook example from this site in the Google SERPs:

[Screenshot: Drupal meta description being used as a text snippet in Google]

I generally configure the Drupal Nodewords module to output the meta description and meta keywords on every page. I have a few default keywords set, and add a couple more on every post to make a unique combination of relevant keywords. I don't spend much time with it because I don't think the meta keywords are that important.

On the nodewords module's administration page, be sure to check the box that says "Use the teaser of the page if the meta description is not set?". That way each page will get a unique meta description, even when it is written by users who do not have permission to create custom meta tags for nodes.
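
The fallback behind that checkbox can be pictured as follows. This Python sketch is an assumption about the general idea, not the module's implementation. The `meta_description` function is hypothetical, and the 155-character limit is an arbitrary value picked for the example.

```python
import re
from typing import Optional


def meta_description(custom: Optional[str], teaser: str, max_length: int = 155) -> str:
    """Pick the custom description if set, otherwise a trimmed plain-text teaser."""
    text = custom if custom and custom.strip() else teaser
    # Remove HTML tags and collapse whitespace so the snippet is plain text.
    text = re.sub(r"<[^>]+>", " ", text)
    text = " ".join(text.split())
    if len(text) <= max_length:
        return text
    # Cut at the last word boundary that fits, then mark the truncation.
    return text[:max_length].rsplit(" ", 1)[0].rstrip(" ,;:") + "..."


print(meta_description(None, "<p>Basic on-site <b>SEO</b> for Drupal sites.</p>"))
# -> Basic on-site SEO for Drupal sites.
```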

Install the Page Title Module

The Page Title Module allows you to set custom page titles on every page. Highly recommended.

Google Sitemaps Module

Google Sitemaps are not essential, but I've been adding them to my Drupal sites. I think that Google Sitemaps were created by Google primarily for debugging Googlebot and not for the benefit of search engine optimizers.

There is a Drupal Sitemap Module, but the last time I checked it had serious bugs that made it unusable. In any case, I don't think that most Web sites need XML sitemaps. Other SEOs have similar opinions about sitemaps.

I recommend not using the Drupal Sitemap Module.

Drupal Rewrite Rules

Make sure that your site does a permanent (301) redirect in either of the following two ways:

  • http://example.com to http://www.example.com, or
  • http://www.example.com to http://example.com

You can set up this redirect in your .htaccess file.

To remove the www from your site, look for the following code in your .htaccess file, then uncomment and adapt it:

  # To redirect all users to access the site WITHOUT the 'www.' prefix,
  # (http://www.example.com/... will be redirected to http://example.com/...)
  # uncomment and adapt the following:
  # RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
  # RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]

To redirect to the www version of the site, look for the following code, then uncomment and adapt it:

  # To redirect all users to access the site WITH the 'www.' prefix,
  # (http://example.com/... will be redirected to http://www.example.com/...)
  # adapt and uncomment the following:
  # RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
  # RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
  

Be sure to replace example.com with your domain name, and then test the redirects in a browser.
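
When testing, it helps to know exactly which URL each request should end up at. The Python sketch below computes the expected 301 target for either policy. It is only a model of the rewrite rules above, the function name `canonical_redirect` is made up, and it naively treats any host without a leading "www." as the bare domain.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url: str, prefer_www: bool = True) -> Optional[str]:
    """Return the URL a 301 should point to, or None if the host is already canonical."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    has_www = host.startswith("www.")
    if prefer_www and not has_www:
        # Naive: assumes the host is the bare domain, not another subdomain.
        host = "www." + host
    elif not prefer_www and has_www:
        host = host[len("www."):]
    else:
        return None
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(canonical_redirect("http://example.com/node/5"))
# -> http://www.example.com/node/5
print(canonical_redirect("http://www.example.com/node/5", prefer_www=False))
# -> http://example.com/node/5
```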

Fix Your HTML Headers

Every page should have exactly one <h1> header element, and it should have your keywords in it.

  1. Enclose your site name in DIV tags, not HTML header tags.
  2. I would add one H1 element to the home page.
  3. On teaser views, the node titles should be enclosed in H2 tags, while the main header of the page (e.g., taxonomy term name) should be enclosed in H1 tags.
  4. On node view pages, the node title should be enclosed in H1 tags.
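
One way to check a theme against the list above is to count the header elements in the HTML it outputs. The Python sketch below uses only the standard library. The `HeadingCounter` class, the `heading_counts` helper, and the sample page are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class HeadingCounter(HTMLParser):
    """Count h1-h6 start tags in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.counts: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in {"h1", "h2", "h3", "h4", "h5", "h6"}:
            self.counts[tag] += 1


def heading_counts(html: str) -> Counter:
    """Parse the HTML and return how often each header tag occurs."""
    parser = HeadingCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical teaser view: site name in a DIV, one H1, node titles in H2.
page = """
<div id="site-name">Example Site</div>
<h1>Drupal SEO tutorials</h1>
<h2>Clean URLs</h2>
<h2>Pathauto</h2>
"""
counts = heading_counts(page)
print(counts["h1"], counts["h2"])  # -> 1 2
```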

Duplicate Content from /node

By default, the front page of a Drupal site has nearly identical content to the page at /node. Search engines are going to spider and index /node because on the paginated home page view, the link to the first page in the series points at /node.

The fix for this is simple: always use a custom front page when building a Drupal site.

Drupal PHP Session IDs

I haven't seen this problem on Drupal sites in a long time, but if you see PHP session IDs in your URLs, it is very bad for search engines. They have to be removed if you want search engines to be able to spider your site well. A PHP session ID in your URL might look something like this: ?PHPSESSID=37765439acbd6c12345ee987776e65be.

From what I understand, this is the fix if your server supports mod_php. It goes in your .htaccess file:

# Fix PHP session ID problems in Drupal
php_value session.use_trans_sid 0
php_value session.use_only_cookies 1

Otherwise you can probably fix it by modifying your php.ini file (or creating one). I don't know the exact procedure for every host, only that your web site must not have PHP session IDs in its URLs if you want good spidering by search engines. Search Drupal.org or Google for how to turn off PHP session IDs on your server.
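
To find out whether a site has this problem, you can scan the links it outputs for the PHPSESSID parameter. The Python sketch below shows the check on single URLs. The function names `has_session_id` and `strip_session_id` are made up for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def has_session_id(url: str) -> bool:
    """True if the URL's query string carries a PHPSESSID parameter."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return any(key.upper() == "PHPSESSID" for key, _ in params)


def strip_session_id(url: str) -> str:
    """Return the URL with any PHPSESSID parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.upper() != "PHPSESSID"
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


url = "http://example.com/node/5?PHPSESSID=37765439acbd6c12345ee987776e65be"
print(has_session_id(url))    # -> True
print(strip_session_id(url))  # -> http://example.com/node/5
```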

Drupal and Robots.txt

The default Drupal robots.txt file has critical errors in it even in Drupal 6.2 (bug report already filed).

Read this Drupal robots.txt tutorial for more information.

Watch out for contributed modules that create duplicate content through extra URLs. This can be a serious problem.
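
Whatever rules you end up with, it is worth testing them against URLs that should and should not be crawled. The Python sketch below does this with the standard library's `urllib.robotparser`. The rules in `ROBOTS_TXT` are hypothetical and are not Drupal's default robots.txt, and `is_crawlable` is a helper name made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not Drupal's default robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /user/login
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Check whether the given robots.txt text lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for url in ("http://example.com/my-page", "http://example.com/admin/settings"):
    print(url, is_crawlable(ROBOTS_TXT, url))
```

Only the first of the two URLs is reported as crawlable under these sample rules.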

Further Reading

To learn more about search engine optimization, check out the SEO resources page.

Comments

good synopsis of getting started with SEO and Drupal

saw your post on the drupal web site right after mine - this is a great synopsis page. I'm sure you've read the post on my web site about drupal search engine optimization and search engine crawlers. The more people getting the info about Drupal out there the better - it's a great CMS to optimize for the search engines. My only complaint is that you can't use pathauto or a compatible module to automatically generate either tags or taxonomy terms and vocab.

Webmaster Tips

Drupal Search Engine Optimization

Nice tutorial. Drupal is a great CMS. What do you mean by using pathauto to generate tags? Something like freetagging taxonomy?

Great Article

Great article. I installed Drupal a couple of weeks ago and I was stuck with the SEO issues. Thanks to your article I managed to configure my installation in minutes.
Thanks again

Mobileguruji.com
GeographyIsHistory.com

Superb link John

Superb link John. For some time now I have been looking for a comprehensive article on SEO and Drupal, as I am still concerned about the duplicate content issue in Drupal 4.7 and am trying to solve it.

Admin
www.coders2020.com

Drupal SEO

Hi,

Thank you very much for providing useful information about basic procedures to be followed. I love these tips and also I have got more benefits from JTPratt's http://www.smorgasbord.net/how_to_optimize_drupal_web_site_for_google_yahoo_msn_search_crawlers

On the whole I thank both of you people for providing useful information to webmasters in SEO

I found your site looking

I found your site looking for - how to configure Forum in Drupal. You have great articles and guides. The SEO trick is excellent.

duplicate content

I just wrote an article you might find useful, it's called Drupal SEO: How Duplicate Content Hurts Drupal Sites. I go over a few things like using .htaccess and robots.txt to avoid some common duplication issues.

that is a great article

That is a great article about SEO, but it did not handle the problem of duplicate content in depth.
I think that is the missing piece.
There is a common problem with Drupal's friendly URLs: the trailing slash.
Take for example:
http://yourdomain.tld/articles/drupal-seo
http://yourdomain.tld/articles/drupal-seo/
On a normal Drupal site with clean URLs enabled, these two addresses are basically interchangeable, which prevents the miserable "Page not found" message but gives the same page two URLs.
It can be fixed in your .htaccess file:

# Get rid of trailing slashes
RewriteCond %{HTTP_HOST} ^(www\.)?yourdomain\.tld$ [NC]
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]

Cheers

I've been looking for how to

I've been looking for how to configure this for 2 days, and now it works for me. Thanks for this great article.

content duplication

Content duplication is an issue with most CMS packages, including Drupal. When using friendly URLs, one easy way to deal with it is to leave the non-friendly URLs out of the sitemap submitted to Google, so that they do not get indexed. The 301 redirect mod you suggest is a great way to stop the trailing-slash URLs from competing with the originals.

Friendly URLs, pathauto, the meta tags module and views together make Drupal an absolutely fantastic offering, even without knowing anything about SEO.

Webmaster Tips

remove trailing slash on Drupal URLs

You can also install the Global Redirect Module. It will remove trailing slashes for you.

I recommend the Global Redirect Module over .htaccess rules for removing trailing slashes, because if you have any non-Drupal directories on your Web site, the .htaccess rules might conflict.

Canonical Domain Issue

In addition to installing the Global Redirect module, you also want to make sure you 1) resolve any canonical domain name issues (www vs. non-www) in your .htaccess file and 2) set your preferred domain in Google Webmaster Central. Hope that helps!
