Google Measuring Attention Data?


Feedburner published research today noting that Google Reader is the clear leader in the feed reader group. WebmasterWorld recently highlighted a new Google patent which relies on user input to demote spam:

The system may aggregate information regarding documents that have been removed by a group of users and assign scores to a set of documents based on the aggregated information.

The WebmasterWorld thread also highlighted that AdWords advertisers and AdSense publishers might highlight some of the bad documents. The patent application also mentions determining which users should be trusted.
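As a rough illustration of the idea in the patent excerpt, here is a minimal sketch of aggregating removals across users and weighting each one by how much that user is trusted. The function name, the data shapes, and the trust values are all assumptions made for illustration; the patent text quoted here does not specify any of them.

```python
from collections import defaultdict

def removal_scores(removals, trust):
    """Sum trust-weighted removals per document.

    removals: iterable of (user_id, doc_id) pairs (hypothetical shape).
    trust: dict mapping user_id to a weight in [0, 1] (hypothetical).
    Users with no trust entry contribute nothing.
    """
    scores = defaultdict(float)
    seen = set()
    for user_id, doc_id in removals:
        # Count each user at most once per document.
        if (user_id, doc_id) in seen:
            continue
        seen.add((user_id, doc_id))
        scores[doc_id] += trust.get(user_id, 0.0)
    return dict(scores)
```

A higher score here would mean more trusted users removed the document, which is one way a demotion signal could be derived.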


Aahh, the DirectHit (hotbot, circa 2000) "bail time" metric gets some new lipstick.

If it was layered...end search users, google notebook users, google custom search engines, adwords advertisers, adsense don't think they could get anything useful out of all that RC?

Darn, you spoiled it

I was seeing if Matt would bristle at the "put some lipstick on that pig" reference, Aaron.

Bail time was/is game-able. But even if a little flat-sided, I've always thought it was likely to be a fairly decent metric overall. Aggregated data (site composite profiles might be a more descriptive term since some of it isn't hard data but inferences based upon user action) are going to be pretty close to bullet-proof. They'll just read user self-interest and apply it. Nothing is much more telling than someone's recoil action. Look at adsense publishers' ban lists, perfect example!

But, it will be multi-layered, as you say. You can't triangulate with 2 points of reference. For Google, about the only negative I can think of will be privacy issues ...and JohnQ never takes those seriously.

I wish...

...that you gents that were around in the directhit days could continue to convince the engineers out of using that by reminding them how well you kicked DH's ass back in the day:) I was even convinced for a while from listening to how user data couldn't be incorporated due to the ease of manipulation. Unfortunately I think like you said - now that the data is aggregated, only the real big industrial strength guys will be able to keep the game on, and the engines will find alternatives to filter that. It isn't bullet proof, but it's a nice piece to the puzzle. My gut says it's a decent sized piece too. They just audit the user data the same way they audit links and every other piece of info they are collecting into the borg:)

jerking hotbot's string took a little time, todd, and seo's typically hate anything that takes time. but it was one of the first off-site metrics (which caused huge pangs of denial, btw, even if you showed them DH's page that described what they were doing).

>real big industrial strength guys will be able to keep the game on

Maybe, but I can remember when "submit url" servers were industrial strength.


I doubt it was very easy to figure out at first either - hindsight is always 20/20:) Meta tag stuffing woulda been "nice and easy" too if I was smart enough to pick it up 5 years earlier - unfortunately I'm a slow moving dolt:) Certainly someone is spending the time - and a lot is owed to knowing when the timing is right.

It seems to me - it now takes YEARS of historical knowledge and information, industrial strength technical prowess AND a Faraday cage level of paranoia to get away with much blatant manipulation for any period of time these days and be above the "stuck in their parents' basement" level of success. I've always camped out on the very light shades of gray side of things, so maybe I'm still kinda naive to think that it's easier to just create some decent stuff and people push a little bit more than button push (or it could be that I'm good at marketing and my technical skills leave a bit to be desired, and I never had those cool tools to push the buttons on:).

I will definitely say that the SMO stuff sure starts to feel a lot more like classic SEO with these kinda filters going more heavily into play. 2.0 has made it a whole lot easier to herd a bunch of sheep into the field, if only for brief spurts.

Damn - One more thing to balance - user traffic data (inflated by those that bail quick due to ADD) with the real visitors that stick around, hunt and peck, and take action:)

That would mean giving up authority

and I doubt Google et al are seriously interested in that.

If you let aggregate user activity drive search traffic, in the event of "unusual circumstances" your QualityTeam have to react to what has already happened, make a decision (gasp!) and then work to correct it. Not a comfortable way to work, being watched, held accountable, etc.... allowing momentum to build while your back is turned, risking being the fool.

Part of the picture? Sure.. like everything else, to the extent it makes a positive (manageable) contribution. Let's see if we can tell just how much trust is being placed on the data.

You guys sound like you're getting old ;-) The importance of trust is the way it defines the exploit, not the way it retires the old tricks ;-)

2.0 has made it a whole lot easier to herd a bunch of sheep into the field, if only for brief spurts

Which would likely make it one of the next things to nuke / demote / etc. Just like a sitewide link might only count as one vote, what if they only counted the x most important votes for a page in any given period of time? Then maybe gaming for attention has less value, especially if doing it over and over again costs the old subscribers...and maybe it counts even less (or as a negative score) if you do get lots of exposure but few subscribers out of it.
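To make the "only count the x most important votes per period" idea concrete, here is a minimal sketch. It is purely hypothetical: the function name, the (timestamp, weight) vote shape, and the fixed-length time buckets are assumptions, not anything a search engine is known to do.

```python
import heapq
from collections import defaultdict

def capped_vote_score(votes, x, period_seconds):
    """Score a page using only the x heaviest votes in each time period.

    votes: iterable of (timestamp, weight) pairs (hypothetical shape).
    x: how many votes may count per period.
    period_seconds: length of each period bucket.
    """
    buckets = defaultdict(list)
    for timestamp, weight in votes:
        # Group votes into fixed-length periods by timestamp.
        buckets[int(timestamp // period_seconds)].append(weight)
    # Within each period, keep only the x largest weights.
    return sum(sum(heapq.nlargest(x, weights)) for weights in buckets.values())
```

Under this kind of cap, a burst of many low-value votes in one period adds little once the x strongest are already counted, which is the effect the comment is speculating about.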

allowing momentum to build while your back is turned, risking being the fool.

Hence the beauty of personalization. If the results are crap it is the user's fault for liking garbage.


No question about it. Ever heard of the "q-tip factor?" Church administrators use it when referencing their increasingly white-hair congregations and the corresponding calcification of programs, mission, budgets, etc. Resistance to change is a hallmark of age --except, for some strange reason, in seo/sem, where I often see more younger practitioners sitting glued to their pews. I'm tired now, I need a nap, maybe we'll talk more about this later, John ...but you'll need to remind me, because I forget easily.

Anyway, links (for instance) are becoming cheap commodities, arguably easy to influence with linkbaiting, social media, bots, trackbacks, text ads, yada, yada... Worse, they are cumulative. So give them a weighting factor of .00000001. But a DE-link is a far stronger action, particularly if it's done within a relatively short timeframe from the original link. So give them a weight of 1.0.
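A back-of-the-envelope sketch of that weighting, using the two numbers from the comment. Treating a de-link as a subtraction is an assumption (the comment only gives the weights, not how they combine), and the function and constant names are made up for illustration. The "short timeframe" condition is left out to keep the sketch small.

```python
LINK_WEIGHT = 1e-8    # .00000001, the weight suggested for a cheap, cumulative link
DELINK_WEIGHT = 1.0   # the weight suggested for a de-link

def link_score(links_added, links_removed):
    """Combine link and de-link counts into one score.

    Assumption: de-links count against the page, so they are subtracted.
    """
    return LINK_WEIGHT * links_added - DELINK_WEIGHT * links_removed
```

With these weights, a million links add up to roughly 0.01, so a single de-link outweighs all of them.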

passive aggregate data

but driven by self-interest at the individual level

Tagging works well when people tag "their" stuff, but it fails when they're asked to do it to "someone else's" stuff. You can't get your customers to organize your products, unless you give them a very good incentive.

...a statistical comparison between Amazon and LibraryThing tags, and exploring why tagging has turned out relatively poorly for Amazon. I end by making concrete recommendations for ecommerce sites interested in making tagging work.

When tags work and when they don't

picked up from a tag by François Nonnenmacher
