Privacy Advocacy Stunts

Deborah Peel, a well-known patient privacy advocate, and EPIC have joined forces to ask Google some questions about Google Flu Trends. Google is analyzing its search logs to detect flu outbreaks by region, which is super nifty.

Peel and EPIC ask:

There are, however, privacy concerns surrounding this new tool.


In the aggregate, the data reveals useful trends and should be available for appropriate uses. But if disclosed and linked to a particular user, there could be adverse consequences for education, employment, insurance, and even travel. The disclosure of such information could also have a chilling effect on Internet users who may be reluctant to seek out important medical information online if they are concerned that their search histories will be revealed to others… If Google has found a way to ensure that aggregate data cannot be reidentified, it should publish its results.

So this is clearly a stunt meant to scare people who somehow haven’t yet realized that Google has search logs.

If there’s a privacy problem “surrounding this new tool”, then it should be evident from the tool itself. Since data is aggregated at the state level, and since the output is simply an estimate of flu activity for the whole state, there is no privacy risk to speak of. And Google tells you in detail how Flu Trends works.
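To make the aggregation point concrete, here is a toy sketch in Python. It is not Google's method: the log records, the `FLU_TERMS` keyword list, and the `flu_share_by_state` function are all made up for illustration. The only thing it shows is that a per-state estimate can be computed without any user identifier appearing in the output.

```python
from collections import Counter

# Hypothetical log records: (user_id, state, query). Not Google's schema.
LOGS = [
    ("u1", "MA", "flu symptoms"),
    ("u2", "MA", "fever and chills"),
    ("u2", "MA", "flu remedies"),
    ("u3", "CA", "flu symptoms"),
    ("u4", "CA", "movie times"),
]

# Assumed keyword list, purely illustrative.
FLU_TERMS = ("flu", "fever")


def flu_share_by_state(logs):
    """Return {state: fraction of queries that look flu-related}."""
    total = Counter()
    flu = Counter()
    for _user, state, query in logs:  # the user id is read but never kept
        total[state] += 1
        if any(term in query for term in FLU_TERMS):
            flu[state] += 1
    return {state: flu[state] / total[state] for state in total}


shares = flu_share_by_state(LOGS)
```

The result is keyed by state only. Whether the underlying logs are handled safely is a separate question, and the sketch says nothing about that.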

Of course, Google does have access to your individual search records. So does Yahoo. If they don’t handle that data securely, or if they report individual data to outside entities, then yes, there is a privacy problem, potentially a very large one. But that is completely independent of Flu Trends.

And it’s not like this aggregate data analysis is a new thing for Google: they’ve been analyzing and publishing trends for a while.

I’m all for privacy advocacy, and I do believe that Google needs to improve its commitment to privacy in general, with respect to anonymization of data, disclosure of data resale, and more. But I’m not so sure these privacy advocacy stunts are a good idea, especially on issues where privacy is actually well handled.

UPDATE: I see that Fred Totter is commending Peel and EPIC on this action, saying it reassures him. Interesting. But why is this reassuring? Surely, Google could have been mining data before Flu Trends? What is it about releasing this tool, with its detailed disclosures and explanations, that somehow tickles the privacy bone? Worth a second blog post soon’ish, I think.







One response to “Privacy Advocacy Stunts”

  1. Dissent

    I think that what “tickles the privacy bone” is the potential expansion into other health-related issues and trends. Substitute “HIV” for “Flu,” and let’s revisit what it says in the Nature letter:

    None of the queries in the Google database for this project can be associated with a particular individual. The database retains no information about the identity, internet protocol (IP) address, or specific physical location of any user. Furthermore, any original web search logs older than 9 months are being made anonymous in accordance with Google’s privacy policy (

    As I read that, there’s a 9-month window during which health queries could potentially be re-identified should a governmental agency detect an outlier or want additional data and obtain a subpoena for logs.

    Nor do I see any explanation of how they anonymize logs. Have you read any methodology or explanation that would enable you to evaluate whether their anonymization procedures are effective and reliable? Remember the AOL debacle over “anonymized” logs.

    You’re right that on some level, this is nothing new, but it becomes of even greater concern when we consider mental illness, HIV, or other conditions that, if the individual were identified, could result in stigma, job loss, or discrimination. Expansion into health trends by a company that collects identifiable information and is not bound by HIPAA or medical confidentiality warrants some conversation and consideration.

    I’ve sent Google an inquiry about the potential for re-identification during a 9-month window. If/when I get a response, I will post it to my blog.

    Bottom line: although I, too, have disagreed with some of PPR’s positions at times, that doesn’t mean that the questions raised about this tool — and, more importantly to me, its potential applications — are stunts. Better we should deal with these issues in a more transparent and effective privacy-protecting manner now than deal with a potential privacy problem later.
