Is Google Putting Scrapers on Notice?

21 comments

A recent post over on the Webmaster Central blog seems to hint that Google is getting ready to start doing something about scraping:

Google is willing to take action against domains that try to rank more highly by just showing scraped or other autogenerated pages that don't add any value to users. Companies, webmasters, and domain owners who consider SEO consultation should take care not to spend time on methods which will not have worthwhile long-term results.

Not sure why Polish is specifically being mentioned there...

Comments

If Wikipedia content adds no

If Wikipedia content adds no value to my site then why the hell does Google think it adds so much value to theirs?

Does Google think open source content projects are spam? If not, why are they talking about them as such, while ranking them so well?

I wonder how long Answers.com has left.

..what do they mean start doing something

Google has hated scrapers for the past two years and has done everything in their power to remove them from the SERPs.

Google's blatant hypocrisy

Another funny thing about the idea that adding others' content to a site adds no value: feel free to justify the YouTube purchase if adding others' content to your site adds no value, Google.

A liar and a thief... and a good corporate citizen out for their users' best interests. Garbage.

Wow, I'm lost. Google's

Wow, I'm lost.

Google's whole search business is scraping.

Google's whole search business is scraping.

Quote:
Google's whole search business is scraping.

I have to admit that is the truth.

Yeah - we could open up that debate on..

..where snippets end and where scraping content begins, couldn't we!!

It's really a question of quality

There are several good sites that do a great deal of scraping or quoting source content, but we all know about the auto-generated crap sites. The auto-generated sites are greatly the result of Google's own actions, so it makes sense they would want to perform some sort of algorithmic correction.

It is fairly rare in my own searching to find the crappy scraper sites at the top of the results, but there are occasions when they do show up. Even if Google isn't ranking these sites, they still maintain a modicum of linking power, and I'm sure that's the area that remains to be addressed.

Maybe this is a different

Maybe this is a different thread -- but -- Y and M don't have the academic brain power Google does to keep its algorithm from producing spam, yet somehow their results are relevant, sometimes much more so than Google's. Maybe Google is chasing its incredibly expensive tail.

Poland

Home to some of the finest SEO spam. I am sure Google can employ some good spammers there to help them in their anti-spam quest.

More importantly though, is there anything Google can do about it? Without human reviews how does Google intend to catch these sites?

Without human reviews how

Quote:
Without human reviews how does Google intend to catch these sites?

Artificial humans, aka artificial intelligence.

>>Artificial humans, aka

>>Artificial humans, aka artificial intelligence.

ROFL

Sorry but if you knew what I did you'd be laughing too.

Sorry but if you knew what I

Quote:
Sorry but if you knew what I did you'd be laughing too.

artificial brains?
animatronic disney george bush models?
saturday afternoon clown?

hehe, so they will be

hehe, so they will be banning the big names like tripadvisor.com for pages like "be first to review!" that get cached and link to more important pages.

i won't believe it until i see it.

artificial

Quote:
artificial brains?
animatronic disney george bush models?
saturday afternoon clown?

I don't want the black helicopters to circle above my place so I'd rather not say :)

Come on Aaron

That's not what Google meant by calling out the Wiki.

They're complaining about people who attempt to rank by using Wiki's content, which is dupe content when used anywhere other than in the Wiki. That obviously puts contributors who use their own data elsewhere in a bit of a bind in Google's eyes.

However, how is that any worse than using articles from those damn article farms?

Either scraped dupe content is scraped dupe content or it isn't. You can't single out just Wiki content, but it appears they are because it's probably easier to identify.
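
For what it's worth, here is a minimal sketch of one textbook way verbatim copies can be flagged: word shingling plus Jaccard similarity. It is an illustration only, not a claim about what Google actually runs. The function names, the 5-word shingle size and the 0.8 threshold are all assumptions made up for this example.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one window: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def looks_duplicated(page, source, k=5, threshold=0.8):
    """Hypothetical check: does `page` share most of its shingles with `source`?"""
    return jaccard(shingles(page, k), shingles(source, k)) >= threshold
```

A well-known, heavily copied source gives a check like this a fixed reference text to compare against, which would fit the "easier to identify" guess above. Rewritten or lightly spun copies would score lower and need something smarter.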

google is unable to choose the

google is unable to choose the content author. put dupe content on a trusted domain and see it not going into the supp index.

so they must start their PR. just like they do with link buying.
if they punished for buying links all my competition would be dead.
same rule applies here.
google has no solution. they only have a loud voice. so they use it.

However, how is that any

Quote:
However, how is that any worse than using articles from those damn article farms?

Help me out here, this subject is full of contradiction for me.

1) Article sites:
I found a site that has a consistent PR7 throughout (I verified on McDar et al); the ONLY links it has are from an article site. The Google algorithm didn't catch the coincidence that all of the authority granted to this site came from one self-centered source.

2) RSS feeds:
How does an RSS feed count as content on my site but writing a wiki article and publishing it on my site is duplicate content?

I'm not being facetious, I really don't get why articles and RSS aren't already in the banned toy box.

(Keep Googlebot away from

(Keep Googlebot away from your search results) + (don't scrape / don't worry about links from scraper sites) + (our algos distinguish artificial links from natural links and detect thin pages) + ... many signals and a bounded context.

Maybe they've figured out how to better identify generated content. This time they didn't deindex it, they've just pulled the passPR-handbrake to avoid precocious reactions. Speculation, though it would perfectly explain, for example, recently vanished old and established sites where lots of inbounds came from, hmmmm, dynamic pages.

>>they've just pulled the

>>they've just pulled the passPR-handbrake to avoid precocious reactions.

Not. I promise :)

>>Not. I promise :)

I've faith in your promise :)

Did anyone else see the comment spam?

An (obviously very brave) comment spammer has spammed that post on the Google blog with some home brewing advice...

"We all know the effects (and after-effects) of beer...."

Clearly - mario's been into his own products...

:)
