Google Link: Command – Busting the Myths

Posted by randfish

I’m a big Google fan – my wife often sleeps in their t-shirts, I speak on panels with Googlers all the time and I’ve even got a Google water bottle for working out (which happens all of once a month these days). However, I am NOT a fan of the Google link command, and I’m shocked by the number of folks who operate in and around the SEO, webdev and technology industries who haven’t realized this.

Here’s what Google themselves have to say on the matter:

You can perform a Google search using the link: operator to find a sampling of links to any site. For instance, [link:www.google.com] will list web pages that have links pointing to the Google home page. Note there can be no space between the “link:” and the web page URL.

To see a much larger sampling of links to any verified site in Webmaster Tools:

  1. On the Webmaster Tools Home page, click the site you want.
  2. Under Your site on the web, click Links to your site.

Note: Not all links to your site may be listed. This is normal.

Here’s what Matt Cutts (head of Google’s Webspam team) had to say in a video on the subject:

The short answer is that historically, we only had room for a very small percentage of backlinks because web search was the main part and we didn’t have a ton of servers for link colon queries and so, we have doubled or increased the amount of backlinks that we show over time for link colon, but it is still a sub-sample. It’s a relatively small percentage. And I think that that’s a pretty good balance, because if you just automatically show a ton of backlinks for any website then spammers or competitors can use that to try to reverse engineer someone’s rankings.

Google themselves are telling us not to pay too much attention to the link command, but that doesn’t seem to be stopping folks. Let the myth busting commence.

Myth #1 – The Google Link Command Returns Accurate Numbers

Nope. Not even close. Google themselves say the numbers aren’t accurate and that they’re showing a small sub-sample. The numbers show this as well. Check your link counts with the Google link command vs. the number inside Google’s Webmaster Tools (when you verify your account, you’ll see them shown). Here are the stats for SEOmoz, for example:

Google's link command for SEOmoz

Google’s link command claims 1,590 links. Let’s see what Webmaster Tools says:

Google's Webmaster Tools Link Count for SEOmoz

Hmm… 381,403 seems slightly larger than 1,590. In fact, the link command is showing me 0.4% of what Webmaster Tools says exists. Running this analysis on another few domains that we have access to in Webmaster Tools, I saw numbers ranging from 0.1% to 4.4% (meaning there’s not even any consistency in the percentage of links between the two counts).
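The arithmetic behind that 0.4% figure is easy to reproduce for your own site. Here is a minimal sketch in Python using the two SEOmoz counts quoted above; the function name `sample_share` is just an illustrative helper, not part of any Google tool.

```python
# Counts quoted in this post for SEOmoz.
link_command_count = 1_590        # shown by the link: operator
webmaster_tools_count = 381_403   # shown inside Webmaster Tools


def sample_share(sampled: int, total: int) -> float:
    """Return `sampled` as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * sampled / total


share = sample_share(link_command_count, webmaster_tools_count)
print(f"link: shows {share:.1f}% of the Webmaster Tools count")
```

Swap in your own two counts to see how small (and how inconsistent) the sample is for other domains.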

Myth #2 – The Google Link Command Returns Important Links

Tragically, a long time ago (pre-2004), Google did show only important links via the link: command, which created the myth that exists to this day. In fact, the links shown in the link: command have no particular importance or relevance. They are truly a random sample, including links that are nofollowed, links from pages that have had PageRank penalties applied to them as well as links that do pass link juice and value.

Myth #3 – The Google Link Command Returns Links in Some Kind of Order

No one in SEO has been able to show any ordering of any kind in the Google link: command’s results. Important, well-known websites may be listed on page 2 or page 20 of the results, and it is likewise with spam, scrapers and low quality sites that Google’s likely not counting. In Site Explorer and the web results, Yahoo! appears to do some type of ordering, tending to show more important links, pages and sites before less important ones (though not with great consistency). Unfortunately, many SEOs suspect that, should Microsoft’s deal to power Yahoo! with Bing results go through, Yahoo! is unlikely to maintain their own web index (and thus, link, linkdomain and site explorer will be gone).

Google's Link Command Results for Yahoo.com

As exemplified above, Google appears to be very random indeed when showing link: results.

Myth #4 – The Google Link Command Returns a Numerically Representative Count of Links

This is possibly the myth that’s most disturbing of all, primarily because so many operators in the SEO field believe it and track the link: command count as a reliable, useful metric. Nothing could be further from the truth – and here’s some data to help back it up:

Root Domain            Google Link: #           Yahoo! Linkdomain #   Linkscape Count
                       (external + internal?)   (external only)       (external only)

Yahoo.com              3,650                    331,000,000           201,681,667
Recovery.gov           7,550                    328,000               155,780
Facebook.com           165,000                  567,000,000           116,748,934
Real.com               11,400                   4,600,000             5,596,165
Adobe.com              51,200                   124,000,000           78,550,468
Reddit.com             18,300                   128,000,000           29,071,291
Twitter.com            224,000                  515,000,000           132,528,763
Salon.com              12,300                   3,420,000             1,535,342
SEOmoz.org             1,590                    957,000               486,405
NYTimes.com            7,990                    21,200,000            12,884,758
TurkeyDayRun.com       3                        68                    22
Ninme.com              539                      42,000                3,149
Burgerking.com         942                      106,000               23,761
Alaskaair.com          1,010                    44,000                38,358
Smashingmagazine.com   8,730                    1,130,000             592,054
Smithsonian.org        4,860                    25,700                14,545

I collected the data above spur of the moment, so I won’t try to claim great statistical integrity. However, looking at Google’s link: command results, the best I can say is that Google has some relationship to the others within 1-2 orders of magnitude, though they may be directionally inaccurate much of the time as well. Just look at the NYTimes.com for example – Google claims they have 2/3rds the links that Salon.com has, yet Yahoo! and Linkscape agree that, in fact, NYTimes.com has 6X+ Salon.com’s link total.
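For anyone who wants to check the NYTimes.com vs. Salon.com comparison, here is a rough Python sketch that computes the ratio in each of the three sources. The figures are copied from the table above; the dictionary layout and the `ratio` helper are my own illustrative choices.

```python
# (Google link:, Yahoo! linkdomain, Linkscape) counts from the table above.
counts = {
    "NYTimes.com": (7_990, 21_200_000, 12_884_758),
    "Salon.com": (12_300, 3_420_000, 1_535_342),
}
sources = ("Google link:", "Yahoo! linkdomain", "Linkscape")


def ratio(domain_a: str, domain_b: str, column: int) -> float:
    """How many times more links domain_a has than domain_b in one source."""
    return counts[domain_a][column] / counts[domain_b][column]


for column, source in enumerate(sources):
    r = ratio("NYTimes.com", "Salon.com", column)
    # A ratio below 1 means the source ranks Salon.com ahead of NYTimes.com.
    print(f"{source}: NYTimes.com has {r:.2f}x Salon.com's links")
```

Only the Google link: column puts the ratio below 1, which is the directional disagreement described above.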

These are not numbers you want to hang your hat (or any crucial business decisions) on.

Myth #5 – The Google Link Command Tracks Accurately Over Time

Unfortunately, I don’t have data points I can show, but our observations over time indicate that Google’s link count in Webmaster Tools might rise, along with the Yahoo! and Linkscape link counts, yet the Google link: command will show lower numbers. The reverse is sometimes also the case. Without directional consistency, even when compared against their own counts, it’s very hard to take the Google link: count seriously.

Myth #6 – The Google Link Command is Up to Date

Most SEOs & webmasters have noticed that the Google link: counts update infrequently, inconsistently and most often in correlation with toolbar PageRank updates (another data point I’ll need to tackle in a future post). These updates from Google occur every 2-10 months with little warning about when they’re coming or have happened. If you watch sites like closely, they’ll report many of these as they occur.


The next time someone tells you their Google link: command numbers as a metric for SEO, competitive analysis or anything else, make sure they read this post. Google’s not nearly as up-front with the information as they should be (honestly, removing the link command would save so much time and effort for poor site owners who get needlessly confused), but hopefully as a community, we can help build more awareness around this issue.


Link Building Has Changed

Posted by randfish

When I first started in SEO, link acquisition was almost always a manual process. I’d search the engines for links that pointed to the competition, find relevant directories and link lists, email relevant sites and beg, borrow or bribe (aka buy advertising) to get a link. I tried reciprocal link building (and did some pretty dumb stuff). Then, as I got more intertwined in the SEO community, I found vendors who built large networks of sites, spammed blogs/forums/guestbooks and ran text link sales operations. I leveraged these services to help clients rank better, almost always with great success. Then I met Matt Cutts, found out more about Google’s webspam team, saw penalties and their impact (remember Florida?) and even found some sites we worked on in the Sandbox.

Over time, I got smarter. I read papers about Hilltop, TrustRank, Anti-TrustRank and many more. I saw sites escaping the sandbox once they’d earned greater quantities of trusted links. I started understanding that Google’s search quality team was only going to get better at recognizing and counting legitimate links (and tossing out the junk), so I focused exclusively on more “white hat” kinds of links. That’s when I discovered linkbaiting and the power of Digg, Reddit & StumbleUpon to drive traffic that would naturally link. We had success with quizzes (and after Matt left SEOmoz, he had a little too much success) and viral content that earned thousands of links overnight, and started offering it as a service.

As our clientele and foci changed, we changed again. Linkbait gave way to broader viral marketing efforts. Social media marketing arose as a practical and high quality way to earn links. Our clients became larger brands and organizations and one-off link projects weren’t scalable, so we consulted on tactics like content and technology licensing, training editorial staff to earn links & participate in the social media world themselves, and incentivizing user-generated content, which in turn brought links from those users. We found ways to drive natural links to deep pages on huge sites targeting the long tail, and ways to combine embeddable content and user-adopted brand affinity to drive link growth. And we stopped buying links entirely.

I figured a visual history might make for a compelling view:

A History of Link Building Tactics

Now, link building is changing again. I’m of the distinct impression that the engines (nowadays referring to Bing & Google, since the others are all but out of the picture) are evolving to keep up with the web’s breakneck speed, and that new forms of data, along with new ways of analyzing links, are making themselves felt in the SERPs. My guesses/observations would include:

  • Twitter really is cannibalizing the web’s link graph, or at least, the blogosphere’s and Google seems to be using Tweet counts in some way (though possibly only in the QDF algo).
  • The acceleration rate of link acquisition and the freshness of new links is having a more dramatic impact than before, and the “old crusty links” paradigm may be fading a bit.
  • Brand mentions and keyword associations with brand names are influencing the rankings more and more.
  • Untrustworthy link patterns are triggering more filters and penalties than ever before.
  • QDD is as strong as ever, and vertical results are more prominent than at any time in the engines’ histories.
  • Google and Microsoft both know more about traffic and surfing habits than ever before, and this data is likely being used, at the least, as quality control for potential algorithmic misses.
  • Ad blindness is worse than ever (16% of Internet users are responsible for 85% of all ad clicks on the web), forcing the engines to make ads more relevant and more obvious to continue earning revenue.
  • Paid inclusion is going away, and talk of potentially paying sites to be in the indices (the reverse model) is in the air (or maybe not).
  • Billions of non-linked “references” flow out across the web through social media messages, emails, tweets and IMs. Someone, at some search engine, is undoubtedly mining this data to see how they can derive value and relevancy from it.

As marketers, we have to evolve or be left behind by those who can better adapt. It’s hard to see the forest for the trees right now, but I think we’re closing in on a time when real-time, social and traditional web references are all a part of the rankings equation. The future may be less about links and more about brand building and brand participation. I don’t want to be the most-linked-to site in my niche; I want to be the site that’s synonymous with my niche.

Now we just have to figure out the tactics…


New & Interesting Insights Into Google Rankings & Spam from Pubcon

Posted by randfish

Tonight’s post comes via the Pubcon conference in Las Vegas and is likely of interest to many in the webmaster and search communities. Today, during the Interactive Site Review Session, Google’s head of Web Spam, Matt Cutts, along with Vanessa Fox of NinebyBlue and Derrick Wheeler of Microsoft took thorough dives into a number of sites. The session was well covered on Twitter, and in live form by Barry Schwartz at SERoundtable.

Google's Matt Cutts and Vanessa Fox of NinebyBlue on the Site Review Panel
Matt Cutts and Vanessa Fox on the Site Review Panel (photo credit: davecolorado.com)

A few points in particular stood out and are worthy of coverage:

  • Blocking Internet Archive may be a Negative Signal
    Matt Cutts noted that spammers very frequently block archive.org from crawling/storing their pages and few reputable sites engage in this. Thus, it’s a potential spam signal to search engines. SEO Theory has a good writeup on when and why there may be legitimate reasons to do this, but webmasters seeking to avoid scrutiny may want to take heed.
  • Web Page Load Time can Positively Influence Rankings
    Maile Ohye actually mentioned this at SMX East in New York, but Matt Cutts repeated it today. In a nutshell – while slow page load times won’t negatively impact your rankings, fast load times may have a positive effect. This comes on a day when the Google Chrome blog introduced their new SPDY research project. I’m particularly happy about this news, because it’s also true that load times have a positive second-order effect on SEO. Pingomatic recently published some excellent research on load times from Akamai, noting that users’ expectations for faster web browsing have doubled in the past 2 years. In addition, fast loading pages are, in my opinion, considerably more likely to earn links, retweets and other forms of sharing than their slow-loading peers. This tool from Pingdom is a great place to start testing your own site.
  • It May be Easier to Walk Away from Banned Domains
    Sites that Google’s webspam team has severely penalized or banned entirely from the index can be very difficult to re-include, and thus, Matt suggested that “walking away” and “starting over” may be a more prudent strategy. In my opinion, this is largely due to link profile issues – if your site has a “spammy” link profile, it’s tough to ask an engineer to sort out the wheat from the chaff manually (or algorithmically) and stop counting only the bad links. Thus, re-consideration requests may not be as effective a use of time as registering a new site and trying to re-build a more trusted presence.
  • Repetition of Keywords in Internal Anchor Text (particularly in footers) is Troubling
    During a specific site’s review, Matt noted that keyword usage in the anchor text of many internal links, particularly in the footer of a website, is seen as potentially manipulative. Yahoo!’s search engineers have noted this in the past and we at SEOmoz have seen specific cases where removal of keyword-stuffed internal links from a footer had immediate impacts on Google rankings (removing what appeared to be large negative ranking penalties sitewide).
  • Having Multiple Sites Targeting Subsections of the Same Niche can be Indicative of Spam
    Matt Cutts today mentioned that “having multiple sites for different areas of the same industry can be a red flag to Google.” Though Googlers have mentioned this before, today’s site review panel brought renewed attention to both Google’s ability and proclivity for carefully considering not only an individual site, but all the other sites owned by that registrant/entity/person. Given Google’s tremendous amount of data on web usage behavior, many SEOs suspect that they track beyond simply domain registration records.
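On the first point above (blocking Internet Archive), it can be worth checking whether your own robots.txt is shutting out archive.org without you realizing it. Below is a minimal sketch using Python's standard `urllib.robotparser`. Two assumptions to flag: `ia_archiver` is the user-agent token I'm assuming the Internet Archive's crawler honors, and `http://example.com/` is a placeholder URL. The helper `blocks_archive` is hypothetical, and nothing here reflects how Google actually evaluates this signal.

```python
from urllib.robotparser import RobotFileParser

# Assumption: "ia_archiver" is the user-agent token associated with
# Internet Archive crawling. Adjust it if your logs show otherwise.
ARCHIVE_AGENT = "ia_archiver"


def blocks_archive(robots_txt: str, url: str = "http://example.com/") -> bool:
    """Return True if the given robots.txt text disallows the archive agent for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(ARCHIVE_AGENT, url)


# Two illustrative robots.txt files (made up for this example).
blocking_rules = """User-agent: ia_archiver
Disallow: /
"""

open_rules = """User-agent: *
Disallow: /private/
"""

print("blocking_rules blocks archive:", blocks_archive(blocking_rules))
print("open_rules blocks archive:", blocks_archive(open_rules))
```

To test a live site, paste the contents of its robots.txt into a string and pass the page URL you care about as the second argument.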

I also presented at Pubcon today – on a panel called Linkfluence: How to Buy Links with Maximum Juice and Minimum Risk (live SERoundtable coverage here) - as the counterpoint speaker (on why not to buy links). I’ll try to have that presentation in written format early next week on the blog.

p.s. I was asked by a large number of attendees at the conference about our venture capital fundraising experience. I expect to be able to write about that very soon and certainly appreciate all the support. :-)

p.p.s. For those who are interested, my brother, Evan Fishkin (who works at Portent Interactive) had his head shaved by Google’s webspam chief. On a personal note, I must say I was particularly impressed with Matt’s ability to shave a head without nicks or cuts, and his foresight in bringing proper equipment. Unfortunately, I’m not fully briefed on why this occurred, but I do know that my little brother was in terrible need of a trim (photo of my shocked observance of the event here & more photos/video here).


This Week in Search for 9/3/09

Posted by Sam Niccolls

Henry Ford said that if he had asked his customers what they wanted before he built the Model T, they would have said they wanted a faster horse. The reality is few people are industry leaders. Most try to make faster horses. But had the need in 1907 been for web analytics and not transportation, Avinash Kaushik would have been a likely candidate to put America on wheels. So prompted by something Avinash said in a blog post earlier this week, here are some analytics quotes that’ll get the ole pistons firing.

Avinash Kaushik Quotes that Belong on Hallmark Cards:

7. Not segmenting data is a crime against humanity.

6. Never let your campaigns write checks that your website can’t cash.

5. I believe God created the internet so we could fail faster.

4. Magazine advertisements are faith based initiatives.

3. All data needs context, even server errors go up and to the right over time.

2. Bounce rate is brilliantly dumb. It shows that your customers came, they puked, they left.

1. Social media is like teen sex, everyone wants to do it. No one actually knows how. When finally done, there is surprise it’s not better.

Five Thumbs

  • Tips for eCommerce websites: From site architecture to product interlinking, Everett Sizemore’s post is a shopping cart of tips and tricks that eCommerce marketers shouldn’t abandon.

Four Thumbs

  • Seth Godin: comparing your business to the status quo: The best airport restaurants are those that compare themselves to other restaurants, not other airports. The lesson here is simple yet poignant: if your customers crave better food, don’t be afraid to go gourmet, even if your competition is happy serving McLeftovers.
  • Robots exclusion protocol tutorial: For those looking to learn more about robots files, Bing’s post on robots exclusion protocols is informative, as well as digestible for non-technical folks.

Three Thumbs

  • Submitting a Sitemap on Bing: In June Bing rolled out Sitemap.xml support, but if you’re looking to create, submit, or validate a Sitemap with Bing, the post on their webmaster blog is a great tutorial.

Two Thumbs

Rocking on YOUmoz

From actionable seminar re-caps to ethical SEO debates, there were many fantastic YOUmoz posts this week. Apparently dummerboy9000 was not the only mozzer who read Jen’s post about creating great UGC blog posts and noticed that YOUmoz links are followed.

  1. SEOmoz Training Seminar Takeaways by Whitespark
  2. How I got 200 Backlinks for Free by Trafikant
  3. Extreme Local Optimization Put to the Test by Rstellers
  4. Standing Out in the Crowd by Sly-grr
  5. The Ethics of Search Engine Marketing by Thomas M. Schmitz
  6. My Twitter Experiment at SEOmoz Training by KitsapKing
  7. Immersion at the SEOmoz Day Spa by erikellsworth
