Basic Drupal SEO: On-site Optimization

Summary

  1. Enable Clean URLs
  2. Enable Path Module and install and enable Pathauto, Global Redirect and Token Modules.
  3. Configure the Pathauto Module
  4. Install and enable the Meta Tags Module.
  5. Install and enable the Page Title Module
  6. Do NOT install the Drupal Sitemap Module.
  7. Fix .htaccess to redirect to "www" or remove the "www" subdomain.
  8. Fix your theme's HTML headers if they aren't right
  9. Recommended: create a custom front page
  10. Modify your robots.txt file.

Enable clean URLs

Search engines prefer clean URLs. In Drupal 6, clean URLs should be automatically enabled if your server allows it. In Drupal 5 you can enable clean URLs under Administer > Settings > Clean URLs. Clean URLs are necessary for the Pathauto module, mentioned below.

Install the pathauto module and enable it

The pathauto module is highly recommended. Pathauto will automatically make nice customized URLs based on things like title, taxonomy, content type, and username. You also have to enable the path module for pathauto to work.

Think carefully about how you want your URLs to look. It takes some experience with Drupal to get the exact URL paths that you might want. The URLs are controlled by a combination of taxonomy and pathauto, and I hope to cover that in another tutorial. You can also use the path module to write custom URLs for each page, but that might become tedious and inconsistent on a large site.

At the very least, enable the path module and install the pathauto module. It will generate nice-looking URLs for you without much configuration.
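
To make the idea concrete, here is a minimal Python sketch of how a post title can be turned into a URL alias. It is an illustration only: the function name `make_alias`, its `prefix` argument, and the cleanup rules are assumptions made for this example, not Pathauto's actual algorithm or settings.

```python
import re
import unicodedata


def make_alias(title, prefix=""):
    """Turn a post title into a lowercase, hyphen-separated URL alias.

    Hypothetical helper for illustration only; Pathauto's real rules
    are configured on its own settings page.
    """
    # Strip accents so that "Café" becomes "Cafe".
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of non-alphanumeric characters with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    return f"{prefix.strip('/')}/{slug}" if prefix else slug


print(make_alias("Basic Drupal SEO: On-site Optimization", prefix="articles"))
# prints: articles/basic-drupal-seo-on-site-optimization
```

The point of the sketch is that the alias is derived from the title, which is why changing a title later can change the URL unless the update action is set to leave old aliases intact.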

Caution: The above advice is directed towards new Drupal sites. If you have an existing Drupal site be very careful that you don't rename your previously existing URLs with the pathauto module. It is generally a very bad idea to change existing URLs because the search engines will no longer be able to find those pages.

Here are some pathauto settings to watch out for:

For update action choose "Do nothing. Leave the old alias intact." Otherwise the URLs of nodes will change every time you change the title of your post, causing problems with search engines:

[Image: Drupal Pathauto update action settings]

There is also a more comprehensive Pathauto tutorial.

Install the Global Redirect Module

The Global Redirect Module will automatically do 301 redirects to your URL aliases. So if you have a node at example.com/node/5, the Global Redirect Module will redirect that URL to your alias at example.com/my-page.

Read more about the Global Redirect Module.

Install the Meta Tags (Nodewords) Module

The Meta Tags Module (formerly called "Nodewords Module") can be highly beneficial to your site. There is a myth in some search engine optimization circles that says, "meta tags are not important". This is not true.

Meta tags are not meant to be used for keyword stuffing. Don't use them for that purpose because it isn't going to help you. The really important meta tag is the meta description.

The meta description should be different on every page for best results. The meta description should be one or two brief sentences to summarize the page. It should be written for your human visitors, but it is not a bad idea to tastefully and sparingly insert a couple of your keywords. Often when a search engine lists your site in the search engine results pages, it will use your page's HTML title for the title, and your meta description for the text snippet. That is why the meta description should be written with human visitors in mind. You want a text snippet that is going to make them want to click on the link.

Here is one textbook example from this site in the Google SERPs with the meta description highlighted in red:

[Image: Drupal meta description being used as a text snippet in Google]

I generally configure the Drupal Nodewords module to output the meta description and meta keywords on every page. I have a few default keywords set, and add a couple more on every post to make a unique combination of relevant keywords. I don't spend much time with it because I don't think the meta keywords are that important.

On the nodewords module's administration page, be sure to check the box that says "Use the teaser of the page if the meta description is not set?". That way each page will get a unique meta description even if you have denied access to create custom meta tags for nodes to some users.
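
The teaser fallback described above can be sketched in a few lines of Python. This is not the Nodewords module's code; `meta_description_tag` is a hypothetical helper, and the 160-character limit is an assumption chosen for the example rather than a value taken from the module.

```python
from html import escape


def meta_description_tag(description, teaser, max_length=160):
    """Build a meta description tag, falling back to the teaser text.

    max_length is an assumed limit for this sketch, not a module setting.
    """
    # Prefer the custom description; collapse whitespace into single spaces.
    text = " ".join((description or teaser or "").split())
    if len(text) > max_length:
        # Cut at the last word boundary that fits, then mark the truncation.
        text = text[:max_length].rsplit(" ", 1)[0] + "..."
    return f'<meta name="description" content="{escape(text, quote=True)}" />'


# No custom description was entered, so the teaser is used instead.
print(meta_description_tag("", "A short teaser about Drupal SEO."))
```

Because the teaser differs from node to node, each page still ends up with its own description even when nobody wrote one by hand.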

Install the Page Title Module

The Page Title Module allows you to set custom page titles on every page. Highly recommended.

Google Sitemaps Module

Google Sitemaps are not essential, but I've been adding them to my Drupal sites. I think that Google Sitemaps were created by Google primarily for debugging Googlebot and not for the benefit of search engine optimizers.

There is a Drupal Sitemap Module, but the last time I checked it had serious bugs that made it unusable. In any case, I don't think that most Web sites need XML sitemaps. Other SEOs have similar opinions about sitemaps.

I recommend not using the Drupal Sitemaps Module. [See the comments on this article for a longer discussion about XML sitemaps and Drupal.]

Drupal Rewrite Rules

Make sure that your site does a permanent (301) redirect in either of the following two ways:

  • http://example.com to http://www.example.com, or
  • http://www.example.com to http://example.com

You can set up this redirect in your .htaccess file.

To remove the www from your site, look for the following code in your .htaccess file and uncomment and adapt:

  # To redirect all users to access the site WITHOUT the 'www.' prefix,
  # (http://www.example.com/... will be redirected to http://example.com/...)
  # uncomment and adapt the following:
  # RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
  # RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]

To redirect to the www version of the site, look for the following code and uncomment and adapt:

  # To redirect all users to access the site WITH the 'www.' prefix,
  # (http://example.com/... will be redirected to http://www.example.com/...)
  # adapt and uncomment the following:
  # RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
  # RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
  

Be sure to replace example.com with your domain name, and then test the redirects in a browser.
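
Before editing the live file, it can help to reason about what the rules are supposed to do. The following Python sketch mirrors the intent of the two rules above without contacting any server. The function `canonical_redirect` and its `want_www` flag are hypothetical names for this example; Apache's mod_rewrite is still what performs the real redirect.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, domain="example.com", want_www=True):
    """Return the 301 target for url, or None if no redirect is needed.

    Illustration of the .htaccess rules' intent; it makes no requests.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    bare, www = domain, "www." + domain
    canonical = www if want_www else bare
    other = bare if want_www else www
    if host != other:
        # Already canonical, or a host the rule does not match.
        return None
    return urlunsplit((parts.scheme, canonical, parts.path, parts.query, ""))


print(canonical_redirect("http://example.com/node/5"))
print(canonical_redirect("http://www.example.com/node/5"))
```

With `want_www=True`, the first call returns the www address and the second returns `None`, which is the behavior to look for when testing the real redirects in a browser.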

Fix Your HTML Headers

There should be one <h1> header element on every page and it should have your keywords in it.

  1. Enclose your site name in DIV tags, not HTML header tags.
  2. I would add one H1 element to the home page.
  3. On teaser views, the node titles should be enclosed in H2 tags, while the main header of the page (e.g., taxonomy term name) should be enclosed in H1 tags.
  4. On node view pages, the node title should be enclosed in H1 tags.
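
One way to check a theme against the list above is to count the heading tags in the HTML it produces. Here is a minimal sketch using Python's standard `html.parser` module; the class name `HeadingCounter`, the function `count_headings`, and the sample markup are made up for this example.

```python
from html.parser import HTMLParser


class HeadingCounter(HTMLParser):
    """Count how many times each heading tag (h1 to h6) opens in a page."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.counts[tag] = self.counts.get(tag, 0) + 1


def count_headings(html):
    parser = HeadingCounter()
    parser.feed(html)
    return parser.counts


# Hypothetical teaser listing: site name in a DIV, one H1, an H2 per node.
page = """
<div id="site-name">Example Site</div>
<h1>Drupal SEO</h1>
<h2>First post</h2>
<h2>Second post</h2>
"""
counts = count_headings(page)
print(counts)
print("OK" if counts.get("h1") == 1 else "Expected exactly one <h1>")
```

Saving the source of a front page, a teaser listing, and a node page, then running each through `count_headings`, gives a quick view of whether the theme follows the one-H1 rule.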

Duplicate Content from /node

By default, the front page of a Drupal site has nearly identical content to the page at /node. Search engines are going to spider and index /node because on the paginated home page view, the link to the first page in the series points at /node.

The fix for this is simple — always use a custom front page when building a Drupal site.

Drupal PHP Session IDs

I haven't seen this problem on Drupal sites in a long time, but if you see PHP session IDs in your URLs, it is very bad for search engines. They have to be removed if you want search engines to be able to spider your site well. A PHP session ID in your URL might look something like this: ?PHPSESSID=37765439acbd6c12345ee987776e65be.

From what I understand, this is the fix if your server supports mod_php — it goes in your .htaccess file:

# Fix PHP session ID problems in Drupal
php_value session.use_trans_sid 0
php_value session.use_only_cookies 1

Otherwise you can probably fix it by modifying your php.ini file (or creating one). I don't know the exact procedure for every host, only that your web site must not have PHP session IDs in the URLs if you want good spidering by search engines. Search Drupal.org or Google for how to turn off PHP session IDs on your server.
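
To find out whether a site has this problem, you can scan a list of its URLs (from a crawl or the server logs, for example) for the session parameter. A minimal Python sketch follows; `has_session_id` is a hypothetical helper, and it assumes the default session name PHPSESSID shown above.

```python
from urllib.parse import parse_qs, urlsplit


def has_session_id(url, param="PHPSESSID"):
    """Return True if the URL's query string carries a PHP session ID."""
    query = urlsplit(url).query
    # parse_qs returns a dict keyed by parameter name.
    return param in parse_qs(query, keep_blank_values=True)


urls = [
    "http://example.com/my-page",
    "http://example.com/my-page?PHPSESSID=37765439acbd6c12345ee987776e65be",
]
for url in urls:
    print(url, "->", "session ID found" if has_session_id(url) else "clean")
```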

Drupal and Robots.txt

The default Drupal robots.txt file has critical errors in it even in Drupal 6.2 (bug report already filed).

Read this Drupal robots.txt tutorial for more information.

Watch out for contributed modules that create duplicate content through extra URLs. This can be a serious problem.

Further Reading

To learn more about search engine optimization, check out the SEO resources page.

Buy the Drupal 6 SEO Book.

Visit Drupal SEO company, Volacci.

Comments

good synopsis of getting started with SEO and Drupal

saw your post on the drupal web site right after mine - this is a great synopsis page. I'm sure you've read the post on my web site about drupal search engine optimization and search engine crawlers. The more people getting the info about Drupal out there the better - it's a great CMS to optimize for the search engines. My only complaint is that you can't use pathauto or a compatible module to automatically generate either tags or taxonomy terms and vocab.

Webmaster Tips's picture

Drupal Search Engine Optimization

Nice tutorial. Drupal is a great CMS. What do you mean by using pathauto to generate tags? Something like freetagging taxonomy?

Great Article

Great article. I installed Drupal a couple of weeks ago and I was stuck on the SEO issues. Thanks to your article I managed to configure my installation in minutes.
Thanks again

Mobileguruji.com
GeographyIsHistory.com

Superb link John

Superb link John. For some time now I have been looking for a comprehensive article on SEO and Drupal, as I am still concerned about the duplicate content issue in Drupal 4.7 and am trying to solve it.

Admin
www.coders2020.com

Drupal SEO

Hi,

Thank you very much for providing useful information about basic procedures to be followed. I love these tips and also I have got more benefits from JTPratt's http://www.smorgasbord.net/how_to_optimize_drupal_web_site_for_google_yahoo_msn_search_crawlers

On the whole I thank both of you people for providing useful information to webmasters in SEO

I found your site looking

I found your site while looking for how to configure Forum in Drupal. You have great articles and guides. The SEO trick is excellent.

duplicate content

I just wrote an article you might find useful, it's called Drupal SEO: How Duplicate Content Hurts Drupal Sites. I go over a few things like using .htaccess and robots.txt to avoid some common duplication issues.

that is a great article

That is a great article about SEO...
Great, but it did not handle the problem of duplicate content in depth.
I think that is the missing piece...
There is a common problem with Drupal's friendly URLs: it is all about the trailing slash.
Take for example
http://yourdomain.tld/articles/drupal-seo
http://yourdomain.tld/articles/drupal-seo/
On a normal Drupal site, with clean URLs enabled, these two addresses are basically interchangeable, which avoids the miserable Page not Found message but serves the same content at two URLs.
It can be fixed in your .htaccess file:

#get rid of trailing slashes
RewriteCond %{HTTP_HOST} ^(www\.)?yourdomain\.tld$ [NC]
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]

Cheers

I've been looking for how to

I've been looking for how to configure this for 2 days. It works for me now. Thanks for this great article.

content duplication

Content duplication is an issue with most CMS packages, including Drupal. When using friendly URLs, one easy way to deal with it is to leave the non-friendly URLs out of the sitemap submitted to Google so that they do not get indexed. The 301 redirect rule you suggest is a great way to stop the trailing-slash versions from competing with the originals.

Friendly URLs, Pathauto, the Meta Tags module, and Views together make Drupal an absolutely fantastic offering, even without knowing anything about SEO.

Webmaster Tips's picture

remove trailing slash on Drupal URLs

You can also install the Global Redirect Module. It will remove trailing slashes for you.

I recommend the Global Redirect Module over .htaccess rules for removing trailing slashes, because if you have any non-Drupal directories on your Web site, the .htaccess rules might conflict.

Canonical Domain Issue

In addition to installing the Global Redirect module, make sure you 1) resolve any canonical domain name issues (www vs. non-www) in your .htaccess file and 2) set your preferred domain in Google Webmaster Central. Hope that helps!

MikeyLikesIt's picture

xml sitemap should be used in some cases

http://drupal.org/project/xmlsitemap

I just wanted to point out that the idea of NOT including/submitting an XML sitemap to google really only applies in cases where someone is willing/able to dedicate time to checking and optimizing their SEO ratings (the sitemap submissions give "unnatural" results which can hide more important problems that SEO experts would otherwise be able to detect). Not all sites have the budget/resources/knowledge to do this. So ... if you know that this isn't going to be done, it is still advisable to submit an XML sitemap to the search engines so that the pages get indexed.

Webmaster Tips's picture

Drupal XML Sitemaps

I disagree. I think the opposite is true -- it's only worth submitting an XML sitemap if you have an enterprise-size site and really know what you are doing and want to take a lot of time making sure everything is correct. IMO, if you're talking about a site with fewer than tens of thousands of pages, it's a waste of time. And if you have that many pages, you're still going to need a lot of inbound links to get them to rank anyway.

It's good to sign up for Google's Webmaster Tools and verify, but I don't think it's worth the time to create and submit an XML sitemap.

The Drupal XML sitemap module has been broken for at least a year, which is another reason not to use it.

XML sitemaps don't help with rankings at all. It's only for getting pages indexed.

If you want all the pages to get indexed, create a good site architecture. It's easy with Drupal. You can make an HTML sitemap and/or use taxonomy or views to create pages with links based on keyword. Then get inbound links to those category pages in order to distribute the PR/juice.

If you want new content to get instantly indexed, send your RSS feed through Feedburner and have Feedburner ping Google Blogsearch.

MikeyLikesIt's picture

Interesting ...

I guess I'll have to eat my words. I definitely think that site architecture should come first. Doing things to ensure that your content is properly linked together in a logical manner, like setting up taxonomy and tagging with appropriate keywords, will make sure that there are links to your content so that crawlers can find all your pages to index them. I was under the impression that the sitemap submission would give you a small boost in ranking since there are weights that can be set in the sitemap, but I'll defer to the experts if you say this isn't so.

Thanks, by the way for a great collection of SEO tips, and for the speedy response.

Webmaster Tips's picture

XML Sitemaps

The "priority" element in the sitemap allows you to set relative importance of pages within the site. It doesn't boost your rankings compared to other sites.

Here is the specification:

The priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0. This value does not affect how your pages are compared to pages on other sites—it only lets the search engines know which pages you deem most important for the crawlers.

The default priority of a page is 0.5.

Please note that the priority you assign to a page is not likely to influence the position of your URLs in a search engine's result pages. Search engines may use this information when selecting between URLs on the same site, so you can use this tag to increase the likelihood that your most important pages are present in a search index.

Also, please note that assigning a high priority to all of the URLs on your site is not likely to help you. Since the priority is relative, it is only used to select between URLs on your site.

You can also let search engines know what pages are important without an XML sitemap by linking to a page with internal and external links. The Google Webmaster Tools let you view how many internal pages link to each page on your site.

I wouldn't install a stock XML sitemap generator plugin on a small- or medium-sized site. I think Google's ability to find a site's content is more sophisticated than the average automated sitemap generator.

I think that XML sitemaps might be a good idea on really large websites that aren't getting all of their pages indexed. In that case, I don't think a one-size-fits-all sitemap generator is going to do the trick either.

Just my opinions... :)

EDIT: see also this video with Google's Vanessa Fox:

"It’s really not about the ranking; it’s more about crawling… Sitemaps doesn’t impact your ranking at all.

The only way it impacts ranking is that in it helps in that very first obstacle of learning about all your pages because if we don’t about them we won’t index them and we won’t rank them. But other than that it has no impact on ranking."

custom front page

Hello everyone!

I'm working on a Drupal website and am currently dealing with its SEO issues. This tutorial is perfect, and very clear, but I have one question:

Maybe this is a stupid question, but, talking about "Duplicate Content from /node", what does "always use a custom front page" mean?

I've developed a front page called "inicio". On the "Administer/Site configuration/Site information" page, I've set the Drupal front page to: www.mysite.com/inicio.

So, when the URL is "www.mysite.com", it shows my front page, and when the URL is "www.mysite.com/inicio", it shows the same page. This may be interpreted as duplicate content, or am I wrong?

According to that section, to solve this problem I have to use a custom front page, but I don't know what that means. Must I create an index.html page or something like that? Can my custom front page be made with Drupal? If I'm not wrong, the Drupal front page setting is mandatory, so how could I solve this problem?

Any help would be very appreciated.

Potter

Webmaster Tips's picture

Global Redirect Module

The page you created, www.example.com/inicio, is a custom front page.

If you install the Global Redirect module it will automatically redirect www.example.com/inicio to www.example.com/.

Also, when you link to your home page, link to www.example.com/, not www.example.com/inicio.

That should solve the issue...

Perfect! Thank you very much

Perfect!

Thank you very much for the answer!

Potter.

Seo problem with duplicated content in Drupal URL

Hi to all

I have a curious problem with a site built with Drupal 6 and well optimized for SEO.

Last week I found that Google sent me a warning for "duplicated content" on some pages of my site.

Here the surprise, 3 pages involved:

The first is the homepage (http://mysite.it) and the other pages are:

http://mysite.it/index.php?id=19
http://mysite.it/index.php?id=29

but these last two don't exist and are redirected to the homepage, generating the problem above.

Some Information:

- URL rewriting is OK
- .htaccess has no problems and is fully configurable
- The 403 and 404 pages were built correctly
- The site runs on a Debian-etc server

Can someone help me solve this problem?

Webmaster Tips's picture

SEO

Just add this to your robots.txt and it should stop Google from crawling them:

Disallow: /*id=

Did you have a different site there before that was then converted to Drupal? If your old site had URLs like