Google 'hacked our website'

45 comments

A SCHOOL board has won a temporary injunction against the search engine outfit Google.

The schools claim that Google's search engine spider grabbed information it shouldn't have and posted it on the Interweb.

The data included the names, Social Security numbers and test scores of 619 students, which remained available online after the schools removed the page.

Here is the full story.

Evidently it was behind a username and password? uhhh...

Comments

OMG

I called them up and actually got ahold of the web designer; I said that I could diagnose the problem and rattled off a few qualifications. After about 10 minutes I received a call from the web guy.

The main page *was* password protected, he said; so I asked about when cookies expire and whether they contained clear-text passwords (you'd be surprised how common this is). Cookies? He couldn't comprehend how chocolate chips related. So I asked about session ids; yes, he used those on every page. Next question, how often do session ids expire? Expire? No...session ids were hashes of the username + SSN...they never expired...

The session ids were passed via the URL... I left my business number.
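
For contrast, here is a minimal sketch of what a session id could look like if it were random and expiring, instead of a permanent hash of the username + SSN. The names (`create_session`, `lookup_session`, `SESSION_TTL_SECONDS`) and the in-memory store are hypothetical and only for illustration; nothing here reflects the school's actual software. In a real site the token would normally travel in a cookie rather than in the URL.

```python
import secrets
import time

SESSION_TTL_SECONDS = 30 * 60  # assumed 30-minute lifetime, purely illustrative

# In-memory store for illustration only: token -> (username, expiry timestamp)
_sessions = {}

def create_session(username, now=None):
    """Issue a random token that reveals nothing about the user."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # unguessable, not derived from user data
    _sessions[token] = (username, now + SESSION_TTL_SECONDS)
    return token

def lookup_session(token, now=None):
    """Return the username for a live token, or None if unknown or expired."""
    now = time.time() if now is None else now
    entry = _sessions.get(token)
    if entry is None:
        return None
    username, expires_at = entry
    if now >= expires_at:
        del _sessions[token]  # expired tokens are discarded for good
        return None
    return username
```

Because the token is random and short-lived, a URL that leaks into a link or a crawler's index stops working once the session expires.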

I would guess

I would guess that, since the session IDs were passed as part of the URL, someone had the Google Toolbar installed when they visited the site or logged in: perhaps the web designer himself.

Google had to get ahold of those URLs somehow--and I'd first point to the toolbar.

Toolbar

Bingo on the toolbar!!

That is where I would start looking!

MySpace

The particular page that was leaked showed the test results of a particular standardized test.

I can just imagine a 14 yr-old honours student linking her 100% result on her MySpace... hey, it worked for all her friends; and the GOOG.

Google isn't responsible for

Google isn't responsible for a school not having their site set up securely. If anything the parents should be sueing (sp?) the school for not protecting their kids' personal information.

More standard than any thing

66 percent of all database driven sites are exploitable via very simple SQL-injection attacks. Meaning, most sites can have their hidden data displayed, deleted, or (the "best") altered with very little difficulty.
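
As a rough illustration of the usual defence, the sketch below uses a parameterized query with Python's built-in `sqlite3` module. The `students` table, its columns, and the `find_student` helper are made up for the example.

```python
import sqlite3

def find_student(conn, name):
    """Look up a student by name using a bound parameter."""
    # The ? placeholder makes the driver treat `name` as data, never as SQL.
    cur = conn.execute(
        "SELECT name, score FROM students WHERE name = ?", (name,)
    )
    return cur.fetchall()

# Hypothetical in-memory table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)", [("alice", 90), ("bob", 75)]
)
```

With this shape, a classic injection string such as `x' OR '1'='1` is simply compared against the `name` column as text and matches nothing.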

The thing that thoroughly scares the hell out of me routinely is wondering how any particular site stores passwords. I trust drupal (what TW runs) b/c I run drupal and know how it stores passwords (as a non-reversible hash); but you should NEVER EVER trust sites thoroughly that have a question that lets you be sent your password. A good site will never be able to tell you your password, just reset it.

Still, it scares me in that I *know* that a very large percentage of passworded sites are coded so horribly that any one with database access (authorized or otherwise) can see your password in plaintext. I particularly *hate* it when I misjudge a site and my password is *emailed* back to me.
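
A minimal sketch of the "non-reversible hash" idea, using PBKDF2 from Python's standard `hashlib`. This is not a description of how Drupal or any particular site does it; the function names and the iteration count are assumptions for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your own hardware

def hash_password(password):
    """Return (salt, digest). Only these are stored, never the password."""
    salt = os.urandom(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest

def verify_password(password, salt, expected_digest):
    """Re-derive the digest from the attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected_digest)
```

A site that stores only the salt and digest can check a login attempt, but it has nothing it could email back to you, which is why a reset is the only honest recovery option.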

Good stuff

Good stuff hopeseekr, you should offer your services to other schools, find a security hole, call them up, use this post as an example and there you go, security hole expert!

:-)

Danny has been in email contact with school

SEW story

The root cause of why Google is in court over this is the usual one of a COMPLETE FAILURE of Google to respond to anyone

Quote:
We acted so aggressively with Google because, until the media got involved, we could not get beyond an operator at Google. We could not get operators to connect us with technical support, the legal department, or to anyone higher up in the organization. We were only given an email address to which we could submit a complaint - which we did but got no response. Google has a link to submit an emergency request [see here] but on both Thursday and Friday of last week, the link took you to a dead page. Only when the news media submitted its own inquiry to Google did we get a call regarding the situation. And [Google] has been most helpful in working through this situation with us.

Does this sort of problem with getting through to Google seem familiar to you guys?

the new Google

Gotta love the dead pages. That must be so embarrassing.

Sad State of Education

Getting an injunction against Google because you don't know how to use the web properly is yet another waste of taxpayers' dollars and a red flag that this school shouldn't be permitted to have a website in the first place. They couldn't get beyond an operator at Google, yet everything they needed to know was ONLINE in Google's "Webmaster Help Center". This is a typical situation: reading isn't really emphasized in today's schools, so they use the phone instead, as that big Google website has all sorts of scary words on it that might tell them what to do.

They should really be embarrassed, as they aren't pointing out how bad Google is, or how unresponsive Google is, or that you can't get past the Google operator. What they're pointing out is how incredibly STUPID they are, since they can't READ or figure out web basics, which is bad for a school, and heads should roll.

Let's start with the big question of why they didn't just read the removal instructions from Google?

The answer is TEACHERS, but I'll refrain from too much contempt of the sad state of education at this point and stick to the facts.

The site removal instructions [safest bet in this case] from Google require use of the robots.txt file and this so-called educational institution's website doesn't even have a ROBOTS.TXT file on the server.

Perhaps had they installed a robots.txt telling bots like Google to keep out of sensitive areas in the first place they wouldn't have had to sue Google for their own incompetence but it gets better.
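
To make the robots.txt point concrete, here is a small sketch using Python's standard `urllib.robotparser` to show how a well-behaved crawler would interpret such a file. The `/private/` path and the `example.com` URLs are hypothetical. Note that robots.txt is only a request that polite crawlers honor; it is not access control and does not replace a working login.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(url, agent="Googlebot"):
    """Return True if a crawler obeying robots.txt may fetch the URL."""
    return parser.can_fetch(agent, url)
```

Under these rules `allowed("http://example.com/private/scores.html")` is false for a compliant bot, while pages outside `/private/` remain crawlable.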

Google also has an emergency URL REMOVAL tool plainly posted on Google's website. Alas, poor school full of teachers that can't figure out they need robots.txt to stop those documents from being crawled in the first place probably aren't smart enough to read a couple of simple pages from the Google Help Center either so let's complain publicly that we can't get past the OPERATOR at Google because we're ignorant to the workings of the internet, reading is too hard, but we have lawyers that can read so parents shouldn't be concerned.

Before you claim that Google's removal tools don't work, remember that WebmasterWorld was almost completely delisted in a day.

Now that we've established this bunch is completely incompetent as they don't know basic WEB 101, we are also pretty sure that they wouldn't be smart enough to tag any sensitive documents with NOARCHIVE, NOCACHE, NOINDEX, NOFOLLOW or anything else that would've EASILY stopped Google from adding these documents to their index, because, as we know, they're teachers.

The fact that the login security was a sham isn't surprising as they didn't have the very basics and fundamentals in place so why would we expect the login to be anything more than a placebo?

If the person that wrote the login page just added ONE LINE OF CODE to check to see if it was a POST vs a GET we wouldn't even have this thread.
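
A rough sketch of that check, written as a framework-neutral Python function. The function name and parameter names are hypothetical, and this says nothing about how the school's actual login page was written. The idea is simply that credentials arriving in a GET query string, which is what a pasted link or a crawler would send, are ignored.

```python
def credentials_from_request(method, query_params, form_params):
    """Return (username, password) only when they arrive in a POST body.

    query_params is deliberately ignored: anything in the URL could have
    come from a bookmarked, shared, or crawled link.
    """
    if method.upper() != "POST":
        return None
    username = form_params.get("username")
    password = form_params.get("password")
    if not username or not password:
        return None
    return username, password
```

This does not make a login secure on its own, but a link with a user name and password embedded in it would no longer log anyone in, crawler or otherwise.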

FWIW, this school is teaching their students a very valuable lesson in today's litigious society which is to sue when they don't understand something. Reading the manual before using might be too hard, so if you get harmed because of your own ignorance just sue, especially when the target has deep pockets.

Finally, let's explore the 100% gross naivety that only assumes Google has this information since it's the only one they could see. We all know there are many creepy crawlers on the internet collecting data for all sorts of purposes so the odds that only Google has this information are very slim. The most STUPID part was making this very public while the information was still available on Google so everyone could go look for it. Students, brace for identity theft and since the school taught you how to SUE, use this lesson learned wisely and sue that school until there isn't a single brick left to be taken from the building.

Here's an expensive lesson learned for Catawba County Schools:
"IGNORANCE OF THE INTERNET IS NOT AN EXCUSE"

I weep for the future.

GoogleBase and GBuy

Only when the news media submitted its own inquiry to Google did we get a call regarding the situation. And [Google] has been most helpful in working through this situation with us.

So if it takes 2+ days and media attention to get any customer support from Google when dealing with something as serious as published SSNs, what can I expect when the digital camera I buy from someone off of GoogleBase arrives dead? or when my $20 GBuy payment doesn't go through?

I don't see any reason for eBay to be scared until Google realizes that pure retail/consumer transactions require more than just algorithmic customer support.

ranting is for cowards

There is little accountability for hiring decisions in the IT arena for public institutions. There is little if any support for hiring the capable person (a.k.a. the expensive one) and plenty of backlash for paying a high salary. So we get Chief Technology Officers that don't know what auth means or what a hash is, or how to *lead* for a safe environment. And they hire the developers along the same lines..

It is soooo easy to rant against the machine. It is so burdensome to volunteer at your local public school district and listen to those endless PTA meetings, just so you can help avoid this kind of crap.

Rebuttal along the lines of "I paid my dues, it's somebody else's turn " coming in 5....4....3....2....1.....

excuses are lame

Sorry, if people in an educational institution can't learn the basics it's a sad statement on the sad state of education, not a rant. Even most of the lesser paid webmasters know about robots.txt, NOARCHIVE, NOINDEX, NOFOLLOW and so on, so it's not a salary issue.

Also, it's web programming 101 to test your forms for POST to make sure people don't store passwords in links and use GET to login.

Try to derail the thread with PTA nonsense if you will, but this is a blatant case of ignorance, and ignorance of the web OR the law is no excuse as such breaches of confidential information containing SSN's and such in California could even result in criminal charges, not sure about No. Carolina.

There are hundreds of thousands of web pages devoted to these topics all over the place and anyone involved with the internet and websites would have to be lazy and complacent not to know these things. Not to mention hundreds of webmaster forums discussing these very topics daily.

If a 12 year old can figure out the simplest of these simple basics, the fact that professional IT staff and educators can't do this is shameful and again, ignorance is no excuse for a paid professional regardless of salary.

Make all the excuses for them you wish, there are none in this case.

Ouch, IncrediBILL I pray

Ouch, IncrediBILL I pray that I never make a mistake you see. My understanding from the brief looking into this I did was that they used a third party tool (from Xerox) and as such they may not have had access or the ability to do much of the stuff you said. Similarly, given the data has been removed (although the link is still present in Yahoo it resolves to nothing) it would be impossible for us as casual observers to derive whether the fault was the school's or the system's. Schools, in my opinion, have a higher duty and trusting data to third parties is unwise in the extreme but Google has a moral duty to help in such cases and support, as usual, seems to have been sadly lacking.

I think we can safely say

I think we can safely say that people who are lacking in certain areas tend to blame others for their own issues; BiLL is right on, embarrassing indeed!

and "hacked" is a really amusing word in this case...HAHA!!!

Xerox vs School

If Xerox built faulty software then they should shoulder some of the blame, but ultimately it was the SCHOOL that selected and deployed the software. When you look at the chain of command, even if the product was faulty, it's your responsibility to test it for fitness of use before loading it with confidential information and putting it online.

A basic security check by looking at the login process and examining the page source of the various pages it displayed would've quickly revealed potential problems and should've been reported to the vendor for being repaired before ever being deployed.

Worst case, even if the software had weak security, had it been in a subdirectory which I suspect it may have been, a simple .htaccess could've been added as a second layer of security and completely locked out any creepy crawlers until the hole was patched. Then students would've had a global password and a personal password, annoying yet secure.

However, for a site that doesn't know about robots.txt an .htaccess file is probably way over their heads.

OOOPS, it's using a Microsoft-IIS/5.0 server!

That explains everything! :-)

Website Security: SNAFU

When well funded national/international firms and government departments lose hardware and data (some managing to do so multiple times) until embarrassed into basic average 'common sense' security improvements I have great sympathy for poorly funded school district IT staff.

But sadly IncrediBill is correct (damn annoying habit) that that same school district IT staff failed badly. And so did the district management; both in their oversight role and in that of providing appropriate tools/training/equipment.

The conversation(s) between the district and Google reception remain private but appear a big G-blunder. Not immediately passing a school district's child id security breach concerns up the ladder is, on its face, reprehensible. However, why the district lawyer(s) would not directly contact the G legal department prior to requesting the injunction (if they did I missed what would be an even bigger G-blunder) I find a missed logical step.

Indeed that entire school district appears to be running on low priority in every department:

From the 23 June 2006 Charlotte Observer: http://www.charlotte.com/mld/observer/news/local/states/north_carolina/counties/catawba/14882656.htm

Catawba County's school superintendents say they're concerned that the new county budget does not do enough to keep the region moving forward educationally.
...
This year's request for computers and library materials for new schools fell nearly $700,000 short. As a result, the district will have to take money meant for technology and media centers at existing schools to supply the new buildings, he said.

Unfortunately restraining the behaviour of bots, including the almighty G, remains solely the responsibility of individual webmasters and system administrators. Hopefully the publicity will cause others to review their security - worse things can be done, and done without notice, through that sort of breach than happened here.

A bot opt-in system sure would be nice though - I think I will try that on for my dream tonight. Maybe not; the letdown on the morrow would be too great.

Huh?

Quote:
Google isn't responsible for a school not having their site set up securely. If anything the parents should be sueing (sp?) the school for not protecting their kids' personal information.

How do you figure? So if someone breaks into a house and steals naked pictures of the guy's girlfriend, the guy is at fault for not having a good enough lock on the door?

It's a total double standard. Google needs to be responsible for what they grab from the web. I hope all of you defending Google on this will defend the next hacker who is able to obtain sensitive information and posts it on the web. Although one may be done in bad faith, there is no excuse for either.

reasonable lag time

The practical argument is hard to address. Obviously the IT people and administration have fault, but IMHO so does Google and just as obviously. Hey incrediBill it is illegal to highlight someone else's security flaws, by the way. Crackers get prosecuted under such laws all the time. If nothing else they prosecute under "you should have known better" laws. The old free-speech-but-you-can't-yell-fire-in-a-crowded-theatre concept. Accountability *and* responsibility.

Practically, society needs some protection in the form of lag time for technology advances. Set a deadline for no more ss#'s on the Internet, and help everybody get there. Unfortunately the lobbies will push for "standards" instead, and we'll just get plausible deniability and our ss#'s on the Internet. That is what's broke, not the fact that some school district can't afford a top tier IT person.

illegal to highlight

Prosecute the school and the press then as nobody knew about this until they took it public.

It was a classic example of what Forrest Gump's mom said: "stupid is as stupid does"

society needs some protection in the form of lag time for technology advances

My B.S. alarms just woke up people 3 blocks away as robots.txt, blocking GET from a login form, and these meta tags to stop indexing have been around MANY YEARS now, most of it almost 7 which is when ecommerce and higher security started being an issue.

Nice try, but this sort of issue has been a problem since the 1st web crawler and they have been around as long as most people use the net, therefore I'm back to....

IGNORANCE IS NOT AN EXCUSE!

Breaks in?

So if someone breaks into a house and steals naked pictures

They didn't break in, someone posted a public link, it didn't have NO FOLLOW, the server wasn't properly secured.

Sorry, but it's YOUR server, YOU install the firewall, YOU install the anti-virus, YOU install the anti-spam, and YOU can sure as hell install robots.txt, .htaccess, and anything else you need to keep your content secure.

Security is YOUR problem 100%, take responsibility and stop passing blame.

Google is the internet's librarian, not a nanny or a security guard.

It's nice to point fingers at deep pockets but that doesn't make the school any less inept.

it didn't have a nofollow?

what, that would have prevented the indexing would it?

Yes the school was unwise to rely on the fact that a page was password protected, but it's a pretty common assumption that password protected pages will not be indexed - so perhaps they did their research and followed what they thought was best advice?

Anyway, the issue wasn't that the pages were indexed, it was that there was no response from Google. If you discovered that a load of sensitive pages from your website had accidentally got indexed and Google were ignoring your e-mails on the subject (not even a response saying they wouldn't do anything) wouldn't you get frustrated?

As is often the case, I can't see that Google have any blame either technically or legally, but that doesn't mean they run a good customer service department. It sometimes seems they should drop the PhD requirement and just hire in someone who's sat on the sharp end of a blue chip complaints department for 20 years or so. They might not be young, geeky or book smart but they could bring a lot of good advice to the mix.

Passing blame

Anyway, the issue wasn't that the pages were indexed, it was that there was no response from Google.

We'll just have to agree to disagree.

If you can't secure a website the issue isn't with Google, it's with you.

I guess personal responsibility and testing are all out the window, it's someone else's problem.

OK, I get it, we're never at fault, it's always Google.

Did they try the emergency removal for the whole site?
Like I mentioned above, it sure worked for WMW...

Yelling at Google because they didn't jump thru hoops for a single website while answering thousands or tens of thousands of requests per day is just silly and self-serving.

I'll admit that Google probably could, if they don't have it already, help situations like this with an escalation policy for secure data breaches.

no one said its always googles fault.

emergency removal is a temporary measure, that would only have set the issue back six months or so, and would probably have made the whole thing harder to resolve properly.

I certainly didn't say it's always Google's fault, come on Bill, I spend half my forum life telling newbies that their site isn't indexed because of something they did, something they may have done or something Google plainly state. In fact I'm usually one of their staunchest defenders on privacy and their right to index everything, but the school aren't claiming Google were wrong to have indexed the data. They're saying that no one from Google could or would help them. It's about the standards of customer service that it may or may not be reasonable to expect from one of the largest companies in the world. Google may not be in a customer facing industry but as they get bigger, and now they're public, from a pure PR basis they have to deal with issues like this or their shareholders will have something to say.

If you (or Google) feel that no response to an e-mail containing the words 'children' 'private information' and 'please help' is a reasonable customer service standard then fine, and if you feel that people should be made to pass a test set by you before they're allowed to publish a website then that's also fine but I don't agree and I'm not going to sit here and agree with you just because you shout louder than anyone else.

Well...

Gurtie, did you miss where I said Google should make exceptions?

"I'll admit that Google probably could, if they don't have it already, help situations like this with an escalation policy for secure data breaches." so I actually agreed with your point.

The only thing that bugs me about Google, which probably happened this time, is it seems all issues are FIFO and nobody runs a clinic on incoming requests and escalates serious issues in a more timely manner.

However, you know as well as I do, that if they put an email address up for "urgent-issue@google.com" that all the idiot webmasters having indexing issues would use it and make it useless for such issues.

Which is why I suggested emergency site removal which would give them some breathing room to deal with Google to resolve the problem and save on legal fees without embarrassing themselves like they did.

Kind of torn here.... I

Kind of torn here.... I agree with Bill that the school / authorities generally bear pretty much all the responsibility for the breach. If you put sensitive information on a web server, the assumption has to be that as it is open to the world, reasonable steps should be taken to secure that information.

If the sysadmins were so badly trained that they had no concept of what "reasonable steps" were (which seems to be the case here), then a portion of the blame goes up to the school board etc who have hired badly, and then failed their employees, and the school, and the kids and their families.

However, Google have definitely screwed up here too. Regardless of where the blame for any aspect of this goes, they should have done more. As a couple of other people have noted, for a company with global ambitions, customer service is key. They can't get by with the "we're just poor little geeks" line any more.

As they expand their product offerings, and especially as they move more and more into non-advertising revenue areas, customer service is going to be what ultimately defines their success, or not. Even for their core revenue stream, unless you do millions a year on AdWords, you are unlikely to get anything but the most perfunctory treatment. Even quite large advertisers (say £115k / month, as an example) have trouble pursuing issues, and have the G team fucking around with their account randomly

ok think about it

Look the blame should point towards two places.

1) The School, with crappy coding like that what did they expect.

2) Google, If they just don't care about people contacting them.. especially kids with their social security numbers being plastered online... then they do share the responsibility.

If a school calls your company and says 'help we don't know what's wrong but all our stuff is online' .. and google's response is nothing.. then that in itself is criminal.

Once GBuy starts...

... with customer service like this Gbuy is sure to bring back the PayPal nightmares from a few years ago.

>The session ids were passed

>The session ids were passed via the URL... I left my business number.

Unbelievable.

I've seen school networks. They are pathetic, usually because the school is too broke to pay someone to set it up properly. And, their equipment is usually 3 to 5 years old minimum.

I'm surprised a large company doesn't just "sponsor" the computers of a school system. They could call it a tax write off and actually get the school decent equipment that is set up properly.

after the trainwreck ...

It's as if after the trainwreck Google stood by and took souvenir pictures to sell to curiosity seekers even as the train crew asked for assistance.

What do I mean by this?

Well, when a trainwreck happens, everyone is/should be expected to pitch in and then let the inevitable finger pointing happen after the fact.

In this case Google *chose* not to respond to inquiries from the school authorities. The trainwreck was the publication of the obtained data, and the pleas for help were the inquiries from the school authorities.

The court order was the authorities ordering Google to put down the camera and get on with the rescue.

The bigger mistake was the choice of Google to not recognise that a problem existed and that it ought to be addressed *expeditiously*. Either no one took the time to understand the problem being communicated, or chose not to kick it upstairs, or were under orders not to do so. Any one of these is a problem in its own way.

As for the matter of opt in versus opt out, why is it that certain posters here have been seen to go on a wild rant about opt out provisions in email spam but opt out is fine for robots?

Opt in might actually be a great model. Imagine all those spam domain meisters running around trying to find unique ip's for the purpose of using the google addurl.htm form in an attempt to avoid detection as being part of a spam network. 50 domains added in 1 second by one ip might be a bit of a tip off that any moron could understand. PHD? Ok, we might have to make that 100.

great models of ecommerce

Quote:
Opt in might actually be a great model.

I can see it pushing down the costs of SEO considerably.

"Get your site listed in Google for just $99.99" becomes "Get your site listed in Google for just $9.99"

So..

Quote:
Sorry, but it's YOUR server, YOU install the firewall, YOU install the anti-virus, YOU install the anti-spam, and YOU can sure as hell install robots.txt, .htaccess, and anything else you need to keep your content secure.

So what you are saying is that if you get hacked, it is your fault and the hackers hold no responsibility? Wouldn't that train of thought open the floodgates?

Hackers

Well in this case they weren't hacked but my point was that their security was completely missing and that's THEIR problem.

FWIW, I went back and did some deeper research on this mess this morning and it looks like the failures and shit security are courtesy of Xerox DocuShare, which appears to be about as secure as hooker's panties as it docushared way more than they should.

People, including me, were blaming the schools for hiring poorly paid IT people that caused this and instead it was Xerox that did it.

Still doesn't absolve the school from not testing this product for its suitability to the purpose of holding secure data as anyone that tests sites for security would've punched thru this hole in 5 minutes and kicked it back to Xerox for repair.

Why do I get the feeling?

Why do I get the feeling that if my scraper grabbed that data and put it on the web I'd be getting drilled? Just because they are a billion dollar company doesn't absolve them of the fact they posted social security numbers on their servers.

I just have a tough time bashing a public school's IT person for not being an elite security expert. They don't exactly have the same budget to work with as big companies.

Elite?

We aren't talking rocket science security, most any poster of TW could've solved this, and I just pointed out the major hole was XEROX's fault so you don't think XEROX could hire someone smart enough to block GET's from a LOGIN?

ERROR... DOES NOT COMPUTE...

Still

I know it's not rocket science, but you can't blame a public school IT person for not knowing all the flaws in XEROX's stuff. Most of these public schools don't offer more than 30k for these kinds of jobs. Are you really going to find someone you trust with that kind of data on that salary?

I still think the blame falls with Google. We can bash Xerox, schools, etc., but Google is the one that pulled the data from behind a password protected site. You are giving all hackers a free pass by blaming all issues of compromised data on the security system it was behind. I guess if someone comes in my house tonight and steals my credit cards, I'm at fault for not having better locks on my door.

Huh?

Google just pulled data from a public link, nothing extraordinary.

If Xerox's coding had been done properly the login would've stopped them.

If the school had tested the site properly before deploying it they would've had Xerox fix it.

The only thing Google did wrong was not having an adequate procedure in place for letting people yank pages quick enough once the problem was discovered.

yank pages

Quote:
The only thing Google did wrong was not having an adequate procedure in place for letting people yank pages quick enough once the problem was discovered.

That's a pretty big "ONLY" considering the likelihood of such an event.

?

So Google pulled all this data from a simple link? No toolbar data at all to find these URLs? I guess I'm lost now.

Would you also defend the 19 year old kid who built a scraper and pulled this same info and posted it on his site?

Defend?

I'm not defending anyone.

Supposedly the girl put a link to her site from MySpace with the login embedded.

Maybe you should read the articles about the story and Danny's blog post about it; those articles will clear it up.

FYI, I'm a programmer and I'm telling you right now that if there's a security hole it's my fault, not the hacker's fault, so let's not beat the locks and doors metaphor to death as shitty programmers write shitty code and hackers take advantage, it's just that simple.

In this case a kid linked a URL with a user name and password to a website and the software wasn't smart enough to stop a GET from authorizing access, Google just crawled it is all, no harm no foul, except they have no reasonable customer support for dealing with these things.

It all boils down to ONE LINE OF CODE to test for a POST in the login page and this thread would never exist.

Scraper

Actually, you bring up a good point as a scraper posting those SSN's on a website in Russia or somewhere outside the US would be untouchable by the school, which brings back my original point about testing the security of the system before you deploy it.

So how many other sites are

So how many other sites are running xero's, I mean xerox's compromised software and unleashing wads of data?

Can you hear the sharks circling?

..

Quote:
Can you hear the sharks circling?

You can't hear sharks circling. You have to wait to see the blood.

oh, the irony....

Man IncrediBILL

Mind if I follow you around and learn to walk on water? ;)
