Finally Killing Off Keyword Density

Any SEO worth their salt knows that keyword density is not a ranking factor, yet some people still believe it is a signal in the Google algorithm or is somehow related to ranking well in Google.

Often this myth is perpetuated by the untrained eye, or sold by snake-oil salesmen looking to oversimplify the Google algorithm in the hope of screwing some SEO noobs out of a few bucks.

This short article’s goal is not to shock you with this amazing new revelation but to provide a single scientifically backed piece of proof that keyword density is to be ignored. This article is a handy link for all the SEO consultants who have clients with notions about keyword density and its importance.

Anecdotal evidence

Let’s think about keyword density and Google’s goals logically.

Google’s goal as a search engine is to provide relevant, useful results to users. That’s why users love Google and keep coming back for more, and that’s the only reason Google surged to dominance in the search engine field: not marketing budgets, not clever tricks, but providing the best results for search queries.

As a user, which would you prefer: a page that consistently and methodically repeats the same keyword, or a page that uses similar words of the same meaning to make the writing sound more natural and flow? The second page, right?

So why would Google reward a page that is less useful to users than one that is more useful when that runs against their core goal as a search engine?

They wouldn’t.

Scientific evidence

I recently completed a study of over 12,000 keywords, comparing data on various search engine factors to the ranking of the top 100 results for each keyword in Google. In total I looked at 1.2 million web pages. I used Spearman’s Rank Correlation Coefficient to compare the data on the various factors I tested with each web page’s ranking in Google.

For example, I figured out the keyword density for each of the 1.2 million web pages and compared that with each page’s ranking.
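As a rough illustration of that first step, here is a minimal sketch of how keyword density could be computed for one page. The article does not say exactly how density was defined in the study, so the definition below (words belonging to keyword occurrences divided by total words) is an assumption, and the function name and sample text are made up for the example.

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Fraction of the page's words that belong to an occurrence of `keyword`.

    Assumed definition: (occurrences * words in keyword) / total words.
    """
    # Lower-case and split into simple word tokens.
    words = re.findall(r"[a-z0-9']+", text.lower())
    kw = re.findall(r"[a-z0-9']+", keyword.lower())
    if not words or not kw:
        return 0.0
    n = len(kw)
    # Count positions where the keyword's words appear consecutively.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == kw)
    return hits * n / len(words)

# Invented sample page: 11 words, "cheap flights" appears twice.
page = "Cheap flights to Dublin. Book cheap flights today and fly cheap."
print(keyword_density(page, "cheap flights"))
```

Other definitions (for example, counting occurrences without weighting by the keyword's length) would give different numbers, but the later comparison only depends on how pages order relative to one another.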

Spearman (the statistical measure I used) gives you a number between -1 and 1 representing the nature and strength of the relationship between keyword density and ranking well in Google.

A minus number means there is a negative relationship, i.e. when keyword density decreases, ranking in Google increases.

As that number gets closer to either -1 or 1 the strength of the relationship increases. So a number near zero means there is no relationship between keyword density and ranking well in Google.
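To make the measure concrete, here is a small sketch of Spearman's coefficient written in plain Python: rank both lists, then take the ordinary (Pearson) correlation of the ranks. This is not the code used in the study. The function names and the sample numbers are invented for illustration, and the sketch correlates density against ranking position (1 = top result), which may not match the sign convention used for the figures in this article.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one list is constant
    return cov / (var_x * var_y) ** 0.5

# Invented data: keyword density of six pages and their ranking positions.
densities = [0.031, 0.008, 0.022, 0.015, 0.027, 0.011]
positions = [1, 2, 3, 4, 5, 6]
rho = spearman(densities, positions)
print(rho)
```

Because only the ranks are used, the result is unaffected by how skewed the raw density values are, which is one reason a rank correlation suits this kind of data.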

As it turns out, my study showed that the correlation between keyword density and ranking well in Google is -0.028126693. That means there is pretty much no relationship between keyword density and ranking well in Google, and if there is a small relationship, it is a negative one.

But what would a number be without a chart? Let’s compare keyword density’s correlation with some other factors I tested:

Chart: Keyword Density Compared

The final blow

There is no better form of proof than having logic and science confirmed by mother Google. Here’s what Matt Cutts, Google’s #1 webmaster spokesman, has to say on the matter: “Keyword Density: Not really a factor. Yes the keyword should be present but density is not important. Include the keyword but make writing sound natural.”

If logic, science and Google all say keyword density doesn’t matter, then it doesn’t matter, so don’t believe anybody who tells you it does and stop hounding your SEO guy/gal about it.

9 thoughts on “Finally Killing Off Keyword Density”

  1. Ted Ives

    Mark, my initial reaction is – this is a pretty shocking result.

    Also though I’m wondering – it seems like PageRank or Anchor text are essentially unbounded positively correlated variables (i.e. more almost always helps), but what if keyword density is, per keyword, some particular value that you can be either above or below, and either could hurt you.  If that were the case there could be some cancelling out of effects in the data.

    For instance, to rank highly for a particular term, you might need 2% keyword density – let’s say you’re ranking #1 – but #2 has .5% keyword density and #3 is 4% keyword density – sort of a Goldilocks thing, too hot/too cold.

    Then some of the correlation data would sort of cancel out (i.e. for that term you’d have a curve with a hump at 2% – for others you could have a hump at .5%, 1%, etc – but on average if you sum up all the curves you might get a pretty flat looking line).

    Also – did you look at document length – that would be *extremely* interesting.

    1. Mark Collier

      Hi Ted

      Very astute point. While I agree that there is a possibility of a cancelling-out effect, even if there were an optimum percentage, the dataset would most likely trend towards one side or the other of that optimum, and so a general correlation with increased keyword density would still show up strongly if it were important.

      To discount this possible detraction from the result I will test different bands i.e. 1-2%, 3-4% for correlation in future studies, which I suspect will garner very similar results (except perhaps the higher bands such as 10-15%).

      Thanks

      Mark

      1. Ted Ives

        I don’t think aggregating across bands will work, that’s still cutting
        across keywords, rather than fixing the problem within each keyword
        being analyzed before aggregating. 

        I’m thinking more like….for each keyword take the top 5 results, assume they are near optimal, and calculate their median keyword density.  Call it X (average is no good, use median.  Wikipedia etc blow away the averages too often).

        Then for each page ranking for that keyword calculate the absolute value of X minus that page’s keyword density.  Then just do your same analysis you did but using that number (maybe skipping the top 5 as they will now look unfairly good), it’s essentially transforming actual keyword density into the *distance* of the keyword density from ideal, if you believe in an ideal for each keyword.

        1. Mark Collier

          Hi Ted

          That’s a really interesting test, well thought out and I’m glad to have such active readers on the blog.

          I’m currently in the process of designing my next study so I’ll certainly consider coding your idea.

          Thanks

          Mark

    1. Icel

      Google (and all modern search engines since 2000) has NEVER used Keyword Density as a ranking factor.
      The way Google (allegedly) “understands” the page’s topic is a combination of topic modeling ( http://en.wikipedia.org/wiki/Topic_model ), link signals (such as anchor text) and analysis of user behavior on webpages (i.e. what real people are looking for on a page).

      The amount (“density”) of keywords in a given space means nothing.

      1. bob

        Not true. Google uses the most repeated and highlighted keywords, headings and titles to rank a site initially. But later that can be changed by offpage SEO (aka links). Easiest method to check it: add a new site to GWT and see the list of recognized keywords.
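
Ted's suggestion earlier in the thread (measure how far each page's keyword density sits from a per-keyword "ideal" instead of using the raw density) can be sketched roughly as follows. This is only an illustration of the idea as described in his comment, not something that was run in the study. The function name, the `top_n` parameter and the sample densities are all invented for the example.

```python
from statistics import median

def distance_from_ideal(densities_by_position, top_n=5):
    """Turn raw densities into distances from a per-keyword 'ideal' density.

    densities_by_position: keyword densities for one keyword's results,
    ordered by ranking position (index 0 is the #1 result).
    The ideal is assumed to be the median density of the top `top_n` results.
    Returns (position, distance) pairs for the results below the top `top_n`,
    since the top results would otherwise look unfairly good.
    """
    ideal = median(densities_by_position[:top_n])
    return [
        (pos, abs(ideal - d))
        for pos, d in enumerate(densities_by_position, start=1)
        if pos > top_n
    ]

# Invented densities for the top 8 results of a single keyword.
densities = [0.02, 0.021, 0.019, 0.02, 0.022, 0.005, 0.04, 0.03]
for pos, dist in distance_from_ideal(densities):
    print(pos, round(dist, 4))
```

The resulting distances would then go through the same rank correlation against position, keyword by keyword, so that an optimum at 0.5% for one keyword and 2% for another no longer cancel each other out.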
