Links have been an integral part of SEO since Google joined the scene.
But recently link building’s popularity has taken a bit of a hit, with many believing that Google has reduced its weighting of PageRank in the algorithm. According to many within the industry, the emergence of social signals and other factors indicating user satisfaction has eclipsed (or will in the future eclipse) links as the primary ranking factor.
But this speculation hasn’t been mirrored in my data. Over the course of this post we will examine over 40 link related factors, all of which correlate very well, and a number of which are the most heavily weighted factors in my study.
The main finding from this data is how well links correlate to ranking in Google. I have tested over 150 potential ranking factors in 6 categories and without a doubt, links stand head and shoulders above any other section of factors.
Link building is a bit of an ugly duckling within the industry: everybody knows its importance, but very few are effective in its practice.
Unlike changing title tags, building quality links requires skill, creativity and determination. It’s not easy work and it’s not the low hanging fruit, but based on the data below, it appears to be the most rewarding.
While I won’t discuss link building strategies in this post, I would like to mention that I feel many strategies are extremely inefficient and unproductive and a lot of the theory behind this area of SEO is fundamentally flawed. I will be publishing some more of these ideas, with anecdotal evidence in the future.
The below data is based on a dataset of the top 100 results in Google, for 12,573 keywords.
I have analysed this data using Spearman’s Rank Correlation Coefficient, looking for relationships between individual factors and ranking in Google.
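As a rough illustration of this kind of analysis, the sketch below computes Spearman’s coefficient for one factor within each keyword’s results and then averages across keywords. The function names, the sample figures and the choice to negate positions (so that a positive value means "more of the factor goes with a better ranking") are all assumptions made for the example, not details taken from the study itself.

```python
from math import isnan
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one side never varies
    return cov / (var_x * var_y) ** 0.5

def mean_correlation(results_by_keyword):
    """results_by_keyword maps keyword -> list of (position, factor_value)."""
    per_keyword = {}
    for keyword, rows in results_by_keyword.items():
        # Assumption: negate positions so that position 1 counts as "highest".
        positions = [-position for position, _ in rows]
        values = [value for _, value in rows]
        per_keyword[keyword] = spearman(positions, values)
    defined = [rho for rho in per_keyword.values() if not isnan(rho)]
    return per_keyword, mean(defined)

# Hypothetical data: (Google position, number of linking domains).
sample = {
    "keyword a": [(1, 120), (2, 80), (3, 95), (4, 10)],
    "keyword b": [(1, 40), (2, 55), (3, 20), (4, 5)],
}
per_keyword, overall = mean_correlation(sample)
print(per_keyword, overall)
```

Averaging per-keyword coefficients, rather than pooling every result into one calculation, keeps a factor’s scale for one keyword from being compared against its scale for another.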
This is all part of a greater project to bring more science to SEO and make it a truly data driven industry.
There are inherent issues with correlations and they don’t prove anything per se, but as I have covered these issues before I won’t rehash old information. What I will suggest is that, if this is your first time on the site, you please read this and this.
I would like to thank SEOMoz for providing incredible access to both their amazing Mozscape API, from which the below results are derived, and their expertise and advice. In particular I’d like to thank Rand Fishkin, Dr. Matt Peters and the API support team for all their help.
This Excel Spreadsheet provides the keyword by keyword correlation figures from which the above mean correlations are derived.
Google’s algorithm doesn’t just look at how many links there are to a page. It looks at quality signals and website authority indicators, and tries to protect against manipulation.
Basically, just building links isn’t good enough. Certain kinds of links are better than others.
Below I have covered the types and areas of link building that are thought to be utilised within the algorithm.
The correlations for general link counts are significantly lower than those for specific counts such as the # of IPs/Cblocks/Domains/Subdomains.
This supports the fact that Google looks at several factors and classifiers when considering the quality of the source of a link.
While this certainly isn’t an interesting finding, it is important because it agrees with a known fact, and therefore increases the likelihood that the data gathered and the resulting correlations are correct and do represent what’s actually happening within the Google algorithm.
Below, I investigate which particular classifiers and types of links would be best in a link profile.
Cblocks and IPs
Both the number of unique Cblocks and IPs linking to a site are thought to indicate the diversity of a link profile.
Google want to see a variety of sites “voting” for a website’s content. The weighting of each additional link from the same site is reduced relative to a link from a new source.
Knowing this, many webmasters began to build “lens sites” whose sole goal was to link to the mother site.
It is believed that, to counter this, Google implemented an algorithm that could figure out if a link was coming from the same source (i.e. the same webmaster) as the site that was being linked to.
There are a number of factors that Google likely use in such an algorithm, but it would make sense that Google treat links coming from the same IP or Cblock as more likely to be coming from the same webmaster, and thus marginally less trustworthy.
While the data doesn’t prove or disprove this theory, it does show a higher level of correlation for the # of Cblocks/IPs linking than for a general count of the # of links to a page/site/subdomain. Although the difference is small, it could support the above theory.
With this data and using some common sense, I would recommend following the current industry practice of building a diversified link profile.
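To make the difference between a raw link count and these diversity counts concrete, here is a minimal sketch using Python’s standard ipaddress module. It assumes the usual reading of “Cblock” as the first three octets of an IPv4 address (a /24 network). The function names and the sample addresses (taken from ranges reserved for documentation) are illustrative only, and this is not a claim about how Google or Mozscape actually count.

```python
import ipaddress

def c_block(ip):
    """Return the /24 network containing an IPv4 address, e.g. '192.0.2.0/24'."""
    # strict=False lets us pass a host address rather than a network address.
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))

def diversity_counts(linking_ips):
    """Compare the raw number of links with unique IP and Cblock counts."""
    unique_ips = set(linking_ips)
    unique_cblocks = {c_block(ip) for ip in unique_ips}
    return {
        "links": len(linking_ips),
        "ips": len(unique_ips),
        "cblocks": len(unique_cblocks),
    }

# Hypothetical link profile: one entry per inbound link, by source server IP.
profile = [
    "192.0.2.10",    # two links from the same server
    "192.0.2.10",
    "192.0.2.77",    # different server, same Cblock as the first
    "198.51.100.5",
    "203.0.113.9",
]
counts = diversity_counts(profile)
print(counts)
```

In this made-up profile five links collapse to four IPs and three Cblocks, which is the sense in which the Cblock count is a stricter measure of diversity than the IP count.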
Domains and Subdomains
Again the above data further enhances the argument for a diversified link profile.
It also shows a potentially interesting, albeit small, difference between the # of unique domains vs. subdomains linking, with the # of unique domains coming out on top.
While the difference is too small to support a concrete conclusion, such data would certainly point us in the direction of building links from a diversified set of domains, and of treating subdomains on the same root domain as related to each other. Each additional link from a separate subdomain on the same root domain would therefore be slightly less valuable than the link before it.
Links to the page
The above data conforms to the seemingly obvious conclusion that if you want to get a page to rank well, then building links directly to that page is the best way to get that to happen.
While most SEOs will find that stupidly basic, I have seen some SEOs suggesting that domain level links would be more powerful or a better use of time. The data just doesn’t support that strategy if you are trying to increase the ranking of a specific page.
Links to the page’s domain vs. subdomain
Interestingly, the strong performance of domains vs. subdomains as the source of a link is not matched in the location/target of a link. If we are to believe that such marginal differences are important, then the data may suggest (as a number of industry watchers have stated) that Google treats subdomains as separate from the root domain when looking at the host’s (which could be the domain or subdomain) authority.
This seems strange, and I may be reading too much into the data, but if the above statement were the case, then Google’s treatment of subdomains as separate sources of content would not be matched by their treatment of subdomains on the same root domain as essentially the same source of links.
If such a conclusion were to be made, then it would most likely be explained by the likelihood that Google doesn’t just look at whether it’s a subdomain or not, and instead uses much more advanced algorithms to figure out whether a subdomain should be considered part of the same domain.
Thus Google would understand that blogname.wordpress.com is not related to wordpress.com but blog.exampledomain.com is related to exampledomain.com.
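A toy version of that distinction is sketched below. It is purely illustrative: the SHARED_HOSTS set is a hypothetical, hand-picked list, the root domain is guessed naively from the last two labels (which is wrong for suffixes such as .co.uk), and nothing here is meant to describe Google’s actual method.

```python
# Hypothetical list of hosts whose subdomains belong to unrelated owners.
SHARED_HOSTS = {"wordpress.com"}

def root_domain(host):
    """Naive root domain: the last two dot-separated labels of the host."""
    return ".".join(host.lower().split(".")[-2:])

def link_source(host):
    """Return the identity a link from this host would be grouped under."""
    host = host.lower()
    root = root_domain(host)
    if root in SHARED_HOSTS and host != root:
        return host  # treat each hosted blog as its own source
    return root      # otherwise fold subdomains into the root domain

for example in ("blogname.wordpress.com", "blog.exampledomain.com"):
    print(example, "->", link_source(example))
```

A real implementation would need a maintained suffix list and far richer signals than a fixed set of hosts, which is the point the paragraph above makes.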
Nofollow vs. Followed
Here is a classic case of inter-related factors impacting on the correlations of each other. We know that nofollowed links carry no SEO benefit directly, although they may result in some other factors being impacted, e.g. someone clicks on a nofollow link and then shares the page on Twitter.
A page with a lot of nofollow links pointing to it is far more likely to have a lot of followed links pointing to it.
This is because different types of links tend to hold standard ratios within a link profile, and any deliberate alteration by a webmaster is only likely to result in a small shift in those ratios.
There are many inter-factor relationships going on in the above data. Nofollow links may indeed carry no search engine benefit, but could still show the strong correlations, as above.
Marginal differences in the correlations shown by different categories of links, e.g. followed vs. nofollowed, may be more important than they appear at face value.
This is why I have read a lot into such small differences.
SEOMoz have created a number of algorithms that are meant to mimic Google’s link related algorithms. I don’t know the exact make-up of these algorithms, but I thought it would be interesting to test their performance, to check whether using these metrics as a measure of the success of your link building is a good idea.
Wow! Moz really seem to have done a great job developing their algorithms. In keeping with the above data on the value of page level metrics, Page Authority comes out at an astounding .36 correlation, which is massive, making it the highest correlated factor out of the 150+ I have tested.
The link related data is, in my opinion, on par with the on page factors as being the most interesting and important to the SEO industry. Both lead to the same conclusion: on page factors are far less important than off page factors.
Links aren’t just about SEO
Building links isn’t just an exercise in SEO, it’s also an exercise in marketing. Links can drive a lot of direct traffic from people clicking on them and can also build your brand name.
It’s important to factor the direct traffic value of links into your link building decisions. This is particularly evident where a second, third or fourth link from the same site may seem like a step down in SEO importance but may still provide high value direct traffic.
Links aren’t dead!
If I read another article proclaiming PageRank or link building is dead, I’ll scream. It’s very simple: the scientific data does not support the speculative claims of the reduced value of PageRank or link building.
In fact, in many cases their level of correlation has increased, not decreased, since Moz conducted their 2011 study.
Link related factors are far and away the highest correlated set of factors.
While we in the SEO industry recognise the importance of links, I don’t think we convert this idea into action. I don’t believe that SEOs spend the right proportion of their time on link building. And SEO blogs, conferences and experts certainly don’t talk enough about how to do great link building.
There definitely isn’t enough data available on what the best link building strategies are, with the majority of link related blog posts stemming from speculation, not data driven proof, something I hope to address scientifically through this project.
I welcome presentations like this from Mike King, that back up strategies with solid data.
Bottom line – spend a whole lot more time link building.