I’ve already received a lot of support from the SEO and other communities in helping me out with TheOpenAlgorithm Project.

As you probably know, TheOpenAlgorithm is currently a not-for-profit research project that I am funding myself. I’m a 16-year-old student, so both experience and cash are in short supply.

That’s why your help is so important. A bunch of really great companies and individuals have already donated a lot of time, resources and expertise that are absolutely crucial to the upkeep of this project.

Some have donated access to their invaluable data and others their expertise, ideas or opinions. I’ve tried to mention everyone involved with the project here but if I’ve left someone out please let me know.


SEOMoz, Rand Fishkin and Dr. Matt Peters. Rand contacted me after I posted some thoughts on the initial dataset. He offered me an unbelievable package. As you may be aware, SEOMoz have done multiple extremely comprehensive correlation studies before, and that’s where I originally got the idea to use Spearman’s correlation coefficient. Their studies are valuable not only for their top-notch findings but also because they give me a benchmark against which I can compare my own findings to see if they make sense.

Rand hooked me up with Dr. Matt Peters, their in-house data scientist, who ran the correlation studies over at Moz. I chatted to Matt for an hour or so over Skype, which was extremely valuable as I could iron out a number of kinks in my methodology, get some great ideas and talk through the process with someone with a lot more experience than myself.

In addition, Rand has offered me free access to their incredible link data API. I had already tried the free API and found it fantastic, so I can’t wait to mine several hard drives’ worth of data from it :) and potentially find some strong correlations in what is purported to be the most important area of SEO.

It’s amazing to see industry leaders like Rand, Matt and SEOMoz supporting a relatively unknown project, and it just goes to show the industry’s unique attitude towards collaboration.
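For anyone unfamiliar with Spearman’s correlation coefficient mentioned above: it is the ordinary Pearson correlation computed on the ranks of the values rather than on the values themselves. Here is a minimal sketch in plain Python. The function names and the numbers are made up for illustration and are not the project’s actual code or data.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the two rank vectors.

    Assumes both inputs have the same length and neither is constant
    (a constant input would make the denominator zero).
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical example: search positions 1-5 and invented link counts.
positions = [1, 2, 3, 4, 5]
linking_domains = [900, 400, 450, 120, 60]
rho = spearman(positions, linking_domains)
print(round(rho, 3))
```

With this invented data the coefficient comes out negative, because position 1 is the best result: more links going with a numerically lower position shows up as a negative correlation.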


Link Research Tools and Christoph Cemper. I interviewed Christoph about link-related factors via Skype, and during the conversation he mentioned his new API over at Link Research Tools. It’s a great product that really gives you some top-quality data on links and their related factors/signals. He kindly offered to give me free access to the Superhero account.

To date I have run reports on around 250,000 domains and hope to run a lot more. I haven’t analysed the data yet, but a lot of it (link demographics, social media shares, etc.) looks really interesting and hopefully I’ll find some intriguing correlations in it.


Mike Tung and Diffbot. Text extraction is a super-important part of the project. How can you show correlations for on-page factors like TF*IDF, keyword usage in image tags or any type of mark-up if you can’t differentiate between what’s part of the content on the page and what’s an ad or a sidebar?

You can’t.

As humans we do this really well. For example, you know that the sidebar to the right of this page is not part of the content and neither are the sharing buttons below this page, but computers aren’t quite so smart.

Mike’s an artificial intelligence and computer science genius who built Diffbot, a text/media extraction API that does the job that humans do very well.

I tested a number of text extraction APIs and Diffbot was a clear winner. So I emailed Mike and he agreed to give me 1+ million free calls to the API which is enough to test the entire dataset for all the various on-page factors I want to test for.

I’d certainly recommend Diffbot for anyone doing text/HTML/media extraction.
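To illustrate why extraction matters for a factor like TF*IDF, here is a small sketch in Python. It scores the same keyword once against the main content only and once against the whole page including sidebar text. The page text, the tiny three-page corpus and the helper names are all invented for this example. It uses one common TF*IDF variant (term share times the log of inverse document frequency), and it does not call Diffbot or any other real extraction API.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tf_idf(term, doc_tokens, corpus_tokens):
    """tf = share of the document's tokens equal to `term`;
    idf = log(N / number of documents in the corpus containing `term`)."""
    if not doc_tokens:
        return 0.0
    tf = doc_tokens.count(term) / len(doc_tokens)
    df = sum(1 for tokens in corpus_tokens if term in tokens)
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus_tokens) / df)

# Hypothetical page, split into its main content and its sidebar boilerplate.
main_content = "Spearman correlation measures how well rankings agree"
sidebar = "Subscribe share tweet follow subscribe archive login share"

# Two other invented pages so the corpus has more than one document.
other_pages = ["cheap flights and hotel deals", "python tutorial for beginners"]

content_only = tokenize(main_content)
whole_page = tokenize(main_content + " " + sidebar)

corpus_clean = [content_only] + [tokenize(p) for p in other_pages]
corpus_noisy = [whole_page] + [tokenize(p) for p in other_pages]

score_clean = tf_idf("correlation", content_only, corpus_clean)
score_noisy = tf_idf("correlation", whole_page, corpus_noisy)
print(score_clean > score_noisy)
```

The keyword appears once in both versions, but the sidebar words inflate the page length, so the score on the unextracted page is diluted. That kind of noise is what a good extraction step removes before any correlations are calculated.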



Alchemy API. Alchemy make sense of web pages by applying their various artificial intelligence algorithms to them, either parsing data from the pages or making decisions about them: for example, figuring out the language on the page or the general topic of the URL.

Alchemy actively support researchers and not-for-profits, so I emailed the support team and got in touch with a really helpful guy, Shaun, who set me up on the API with 50,000 calls/day.


Mike King. Mike’s a super-smart dude who stormed onto the SEO scene after years as a programmer. He was just the man I wanted to talk shop with, so I hit him up on Skype and we chatted about the project, SEO in general and the computing/programming aspect of the project. Mike’s also been very influential in promoting the project and recommending it on Twitter, and was probably responsible for sending Rand over to the site (guess?).


Stephen Young. I posted a request on Y Combinator’s Hacker News site looking for some Python experts who were willing to help me with some problems and questions I had. Basically, I was running before I could walk in terms of taking on the project without knowing a lot about programming, so I needed a mentor who would be willing to help me out when I came a cropper on a tough programming idiosyncrasy.

A number of people responded, but Stephen was the perfect match as he has been building a local SEO tool and programming in Python for a number of years. Someone who knew Python and SEO!

Stephen has been really helpful in aiding me with the project and fixing some of the early bugs I encountered. I’m a much better programmer now than I was a couple of months ago, and he has a lot to do with that. I plan to continue expanding my programming abilities independently, but I know Stephen is there if I can’t figure out what to do.


Barry Schwartz. An industry veteran and SEO news guru. I talked with Barry about what he had seen in the SEO news field and what he expected to see coming from Google in 2012. He really knows his stuff and could definitely write a book on the history of the industry.


David Davis, Redfly Ltd. When I was looking for some industry experts to talk to about the project and pick their brains for juicy ideas, I emailed David, the MD of Ireland’s leading SEO agency. He was in Brazil at the time but found time to answer my questions by email. Happy to see an Irish SEO on the list.


Kevin Gibbons of SeoOptimise. Kevin was another industry expert who I talked to and sought advice from. He had some top ideas and his firm had already carried out some similar SEO science on social media signals that proved to be a great success.


If you want to get involved in the project leave a comment below or contact me at
