West Point: Googling Considered Harmful
While stumbling around the Web this weekend I ran across a paper published by West Point comp. sci. professor Gregory Conti, titled "Googling Considered Harmful" (PDF). A subsequent search for the paper online revealed only a few brief mentions, so I thought it might be of interest, especially as the paper dwells on the inevitable security problems of using Google's free software.[1]
Conti's paper isn't anti-Google; he acknowledges that Google search and Google's apps, when combined with the free flow of information on the Internet, contribute to a healthy Internet and economy. But, Conti argues -- and this will be familiar to anyone who follows security news -- the average user isn't aware of the cumulative sum of the individual pieces of data they release to Google. "I liken these pieces of data to micropayments," Conti told me. "It's not just Google, they just happen to be the biggest in the space. And, I think, have the best apps."
Conti argues that Google (and its competitors) will likely store this information for long periods of time and analyze it in order to sell advertising. And, since Google is legally bound to act in the best interests of its shareholders, and legally required to respond to government requests for information, the probability that user privacy will be compromised greatly increases over time:
Google's business model is based upon targeted advertising and it is probable that they will seek to tie clusters of interactions, across multiple computing platforms, with specific individuals and organizations. This situation is analogous to the use of encryption. Cryptographers typically consider encryption valid for only a period of time due to cryptanalytic advances and increased processing power. Likewise, we should assume our anonymity is only a function of time. Eventually, given enough information disclosing interactions, privacy will be compromised.
Current anonymization countermeasures increase the time required for fingerprinting, but ultimately will only delay the inevitable when faced with Google's capabilities.
According to Conti, our trust in Google is actually the most dangerous aspect of our relationship with the world's largest search engine.
While we assume that Google is a non-malicious entity, they do face a legal requirement to act in the best interests of their shareholders. This fact creates a tension between making a profit and their stated goal of providing long-term value for their end users as well as their informal corporate motto "Don't be evil." It is important to note, despite our assumption, that there have been incidents that call into question their "Don't be evil" philosophy. [ed.: Censoring in China and banning CNET after Elinor Mills' article on Eric Schmidt] While Google, by choosing "Don't be evil" as their corporate motto, has admirably set a very high standard for their company, the fact remains that evil is subjective. In the words of Google CEO Eric Schmidt, "Evil is what Sergey says is evil."
Conti's paper includes an analysis of basic security problems that impact our exchange of data with Google. He also argues that as fingerprinting technologies increase in capability and stores of information on individuals increase in size, we'll reach a tipping point where data breaches will become more and more common. "But it's also in Google's best interests to protect this data," Conti cautions, saying that consumer trust is one of the best assets a company like Google has.
The paper offers a few solutions, including the creation of a Firefox plug-in that shows what data you're revealing on each site, continued research into usable anonymous surfing software and the creation of high-quality persona management practices. "If properly executed," Conti writes, "persona management would allow users to portray a wide variety of online personas from anonymous to very sensitive, based upon the requirements of their interaction with a given service." (Sounds a bit like the hologram suits from A Scanner Darkly...) Check out the paper (PDF) for a good read.
[1] Now, I can already hear Matt Cutts groaning when he reads this. Papers like this no doubt irk the Googlers who, as Monsieur Cutts alluded to in a recent post, can't understand why people fixate on Google as a security threat while ISPs control more data. The answer probably has to do with brand recognition -- we worry about what people tell us to worry about.

Comments (4)
While I do understand that Google, and competitors, get to acquire a lot of data that may be stored, I just don't understand why someone has to come up with an article that says, "Google does this" or "Yahoo! does that."
Anything that you put on the internet, or that uses its channels, such as email, should be considered public. By that I don't mean that it belongs to everyone, but simply that potentially anyone can see it.
If you send an email, which goes through several providers, channels, whatever, even if you are not using Gmail, Yahoo mail, AOL, Hotmail, etc., it's bound to go through somebody's system. So be prepared: if it's confidential, send it at your own risk. It shouldn't be this way, but that's the way it is until we work on better software, as Conti stated.
An article such as this should really just talk about these issues in a general manner, yes, naming names if needed, but not focussing on ONE company or two, when the whole internet is structured that way.
I don't disagree with the article, I just don't understand why it's so narrow.
Posted by Elisabetta Bruno | December 20, 2006 4:10 AM
One of the dark sides of the internet in general is the inability of users to fathom how much of their personal information they have given away to each organization they do business with.
If that wasn't bad enough, companies actively solicit as much "demographic" information as possible from each customer, and then turn around and sell the name lists and collected information.
I agree with the author that it is only a matter of time before criminal syndicates hire developers to consolidate, cross-reference or otherwise "data-mine" purchasable customer lists to build a dossier on potential victims. 2006 has proven that stealing from people on-line is incredibly easy and profitable.
Law enforcement loves this stuff: they don't have to lift a finger until they want to know all about someone. Once they do, they will just "ask" for these businesses to "cooperate" with whatever investigation they concoct. Then, presto--your private life is instantly cross-referenced and indexed for them. Result? If you are not snow-white or completely off the grid, you are doomed.
Posted by Pat Patterson | December 20, 2006 5:07 PM
This is why the whole software as a service thing is still a long way off. No way am I going to trust Google (or anyone else for that matter) to store my personal data - it's staying on my home computer (running a fully patched Windows OS behind a SPI firewall) where criminals will have a very difficult time getting to it. Google and its applications have a place, like sharing photos of my kids with my family, but do you think I'm using Google spreadsheet for my bank ledger, or am I using MS Excel? Exactly.
Posted by Joshua Perry | December 27, 2006 1:08 PM
The greatest danger I see in Googling is that a site Google comes up with may be a malicious one. So, whenever possible, I examine not the site itself but Google's cached version of it.
Posted by Fred Linton | January 5, 2007 11:26 PM