Beating the Content Farms: Google Can Automate the Like Button

09 Jan 2011 Pete DeLaurentis

End of a Golden Age

When Larry and Sergey created Google, the Internet was suddenly a more useful place. Valuable, previously hidden content was available, and you could get authentic, quality sites on most topics.

Word on the blogosphere is that Google search results aren’t what they used to be.

I’ve found this to be true. When I search, the top results are now often dominated by sites like eHow and about.com. They offer generic, shallow summaries, and take away what I love most about the web – in-depth content written by people who truly care.

Show Stuff We Like

Conceptually, the fix is simple: prioritize sites that real people find valuable.

This is what Google has always strived to do. Their method: the more outside links to a page, the more valuable the content.

Unfortunately, content farms can game this cheaply. We need a system that costs more to game than advertising pays out. To make cheating expensive for content farms, you need a signal that operates at massive scale. One candidate: Google’s own traffic. No botnet or Mechanical Turk army can rival it.

We’re seeing the start of the solution with Blekko’s slashtags, Twitter’s Tweet button, and Facebook’s Like button. These are ways that a passionate visitor can explicitly mark a piece of content as valuable. Call it “Manual Like”.

AutoLike

Google has collected an amazing arsenal of engineering talent. AI is one of their core competencies, and they are well equipped to beat this problem in a different way. Let’s call it “AutoLike”.

It’s a way of passively inferring when a user likes a search result and boosting that result’s rank. In simple terms, Google auto-presses an invisible Like button for this user. Lots of AutoLikes would improve the rank of a search result.

There are two key components to an AutoLike:

1. Track Views

The more people who click on a search link, the higher its ranking should be. The effect should be magnified the further down the search results the click happens. For example, if you go all the way to page 2 and click a link, this is more significant than just picking the top result.

2. Track Engagement

This tracking could be built into Chrome, and people will tolerate it if it yields better search results. Engagement can include: time spent on site, the percentage of the page the user has scrolled through, the percentage of videos played.
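Here is a minimal sketch of how the two components might combine into a single AutoLike score. Everything specific in it is an assumption made for illustration, not something Google is known to do: the Visit fields, the logarithmic position weight, the 120-second "full read" cutoff, and the equal weighting of the three engagement signals are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Visit:
    """One click on a search result (all fields are hypothetical signals)."""
    result_position: int     # 1-based rank of the clicked result
    seconds_on_site: float   # time spent on the page
    scroll_fraction: float   # share of the page scrolled through, 0.0 to 1.0
    video_fraction: float    # share of videos played, 0.0 to 1.0


def view_weight(position: int) -> float:
    """Clicks further down the results count more.

    The log curve is an assumed shape: position 1 weighs 1.0, and the
    weight keeps growing, slowly, for deeper results.
    """
    return 1.0 + math.log(max(position, 1))


def engagement(visit: Visit, full_read_seconds: float = 120.0) -> float:
    """Average the three engagement signals into a value between 0 and 1."""
    time_part = min(visit.seconds_on_site / full_read_seconds, 1.0)
    return (time_part + visit.scroll_fraction + visit.video_fraction) / 3.0


def autolike_score(visits: Iterable[Visit]) -> float:
    """Sum of position-weighted engagement over all visits to one page."""
    return sum(view_weight(v.result_position) * engagement(v) for v in visits)
```

With these assumed weights, a fully engaged click on the first result of page 2 (position 11) contributes more than the same visit to the top result, which matches the idea that digging deeper for a link is the stronger signal.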

Tracking views and engagement at scale isn’t trivial, but it is possible. For non-Chrome browsers, there are alternate ways to infer engagement, but I’ll save those for a future post.

The Interest Graph

AutoLike gets stronger when it can be personalized. But how do you know what search results a person will like before you show them? The best personalization is predictive, not reactive. Fortunately, there’s a great resource for this.

Twitter links me with people who share my interests. If someone I follow viewed a website and Google flagged an AutoLike, then that website should rank higher in my search results. For this technique, Google would just need access to follower lists, not tweets.
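As a rough illustration of that follower-based boost, the sketch below raises a page's score for each followed account that has an AutoLike on it. The function name, the data shapes, and the 10% boost per like are hypothetical choices, not a description of any real Google or Twitter API. Note that it only needs follower lists, never tweet content.

```python
from typing import Dict, Set


def personalized_score(
    base_score: float,
    page: str,
    following: Set[str],
    autolikes_by_user: Dict[str, Set[str]],
    boost_per_like: float = 0.1,  # assumed: each followed AutoLike adds 10%
) -> float:
    """Boost a page's score by the number of followed users who AutoLiked it."""
    likes = sum(
        1 for user in following
        if page in autolikes_by_user.get(user, set())
    )
    return base_score * (1.0 + boost_per_like * likes)


# Hypothetical data: I follow alice and bob, but not carol.
following = {"alice", "bob"}
autolikes_by_user = {
    "alice": {"example.com/deep-dive"},
    "carol": {"example.com/deep-dive", "example.com/other"},
}

boosted = personalized_score(1.0, "example.com/deep-dive", following, autolikes_by_user)
unboosted = personalized_score(1.0, "example.com/other", following, autolikes_by_user)
```

Here only alice's AutoLike counts toward the first page, and the second page gets no boost at all, because carol is outside my follow list.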

Prediction: Twitter’s biggest revenue source will someday be a web search engine that’s based on their interest graph.

Fairness to All

AutoLike is fair to content farms. If they create a great piece of content that many people visit and read through, it will rank highly. After all, they offer real information, and while their content is typically low-quality, I’ve seen a few well-written and well-researched gems.

Let's Fix This!

I’m confident the team at Google is working hard to fix this problem. These guys brought order to the web, and they aren’t going to let it slip away.

I’m rooting for you, Google.