(your) information wants to be free

A couple of weeks ago, Epsilon, an email marketing firm, was breached. If you are a customer of TiVo, Best Buy, Target, The College Board, Walgreens, etc., that means your name and email address were accessed by some attacker. You probably received a warning to watch out for phishing attacks (assuming it wasn’t caught in your spam filter).

Yesterday, the Sony PlayStation Network of 75 million gamers was compromised. Names, addresses, and possibly credit cards were accessed by attackers. This may well be the largest data breach in history.

And a few days ago, it was discovered that iPhones keep track of your location over extended periods of time and copy that data to backups, even if you explicitly tell your iPhone not to track your location. There are believable claims that law enforcement has already used this information without a court order. Apple now says this was a bug and they’re fixing it.

In 1984, Stewart Brand famously said that information wants to be free. John Perry Barlow reiterated it in the early 90s, and added “Information Replicates into the Cracks of Possibility.” When this idea was applied to online music sharing, it was cool in a “fight the man!” kind of way. Unfortunately, information replication doesn’t discriminate: your personal data, credit cards and medical problems alike, also want to be free. Keeping it secret is really, really hard.

I get the sense that many think Epsilon and Sony were stupidly incompetent, and Apple was evil. This fails to capture the nature of digital data. It’s just incredibly hard to secure data when one failure outweighs thousands of successes. In the normal course of development, data gets copied all over the place. It takes a concerted effort to enumerate the places where data ends up, to design defensively against data leakage, and to audit the code after the fact to ensure no mistakes were made. One mistake negates all successes.

Here’s one way to get an intuitive feel for it: when building a skyscraper, workers are constantly fighting gravity. One moment of inattention, and a steel beam can fall from the 50th floor, turning a small oversight into a tragedy. The same goes for software systems and data breaches. The natural state of data is to be copied, logged, transmitted, stored, and stored again. It takes constant fighting and vigilance to prevent that breach. It takes privacy and security engineering.

The kicker is that, while you’re unlikely to get into the business of building skyscrapers by accident, it’s incredibly easy to find yourself storing user data long before you’ve laid out decent privacy and security practices: Sony built game consoles, and then one day they were suddenly storing user data. It’s also far too common for great software engineers to deceive themselves into thinking that securing user data is not so hard, because hey, they would never be as stupid as those Sony engineers.

So, am I excusing Epsilon, Sony, and Apple? Not at all. But if we keep thinking that they were just stupid/evil, then we are far from understanding and fixing the problem.

I’ve just finished reading Atul Gawande’s The Checklist Manifesto, which I strongly recommend. As industries mature (flying airplanes, practicing medicine, building complex software systems,…), they must build in processes to counteract inevitable human weaknesses. There’s bound to be resistance from experienced practitioners who see the introduction of process as insulting to their craft. Programmers are, in this sense, a lot like doctors. But it’s time to stop being heroes and start being professionals. Storing user data safely is easy until it’s not.

We are constantly fighting nature to meet our stated goals: we don’t want buildings to fall, disease to kill us, or private information to leak. For a little while, it’s okay to fail catastrophically and act surprised. But eventually, these failures are no longer surprising, they’re just negligent. That time of transition for software architects is now. Every company that dabbles in user data should assign a dedicated security and privacy team whose sole responsibility is to protect user data. We will not eliminate all failures, but we can do much, much better.

9 thoughts on “(your) information wants to be free”

  1. Pingback: When It Comes to Information Control, Everybody Has a Pet Issue & Everyone Will Be Disappointed

  2. Pingback: Why Your Personal Information Wants to be Free - Techland - TIME.com

  3. I think this attitude is why we will have security breaches. Because you believe it is the natural order of things, like gravity.

    You use the analogy of people working on skyscrapers. Many years of vigilance, and one slip, slap. Please tell us the last time someone fell from a US skyscraper? It stopped happening because they changed practices, not because they shrugged their shoulders with other people’s lives/data.

    Sony doesn’t NEED to store the credit card data. I have run a biz online for 12 years and never have had, and never will have, a security breach like that. (I am a minnow). Because I don’t store that data. No company should store credit card or private data without at least asking. Restaurants, the grocery store, the gas station… do they store your credit card details? No. They ask each time.

    Your attitude is completely wrong – it is not about more security – it is about less storage. Security will ALWAYS be breached. But if there is nothing on the other side, it doesn’t matter. This is not a “time for the software A-Team”. It is a time for making private information like a hot potato. Use what you need, and then delete it, or we will imprison you for negligence.

  4. @twitter-227017107:disqus You misunderstand me if you think I’m saying that we should “shrug our shoulders.” Not at all. I’m saying we should be especially vigilant *because* of the natural state of information, which is that it spreads easily.

    My skyscraper analogy and your follow-up on it are precisely the point: practices in skyscraper construction changed, and so should programming practices. Limiting stored data, as you point out, is one critical aspect.

    But the only way you realize that limiting data collection is important is if you first realize that the nature of data is to be copied and spread. Then you realize how careful you have to be with it.

    In other words, we probably agree almost 100%. I just think we have to acknowledge the natural order of things to reach our goal, which is to protect private information.

  5. Thanks for the response Ben and for seeing through the somewhat aggressive tone and recognising it is just passion.

    Our views are very similar, but I have a non-technical / non-specialist view as a businessman. As you say, the data will “always” be set free – it is a statistical certainty given time.

    Where we differ is that I think the only way to stop data being leaked is to ensure it is not kept – to encourage this, the loss of other persons’ personal data (to be defined) should be criminalised. That will make it important to companies and when facing the prospect of jail for losing a customer’s data, corporate governors may well choose not to keep it or keep it as safe as if it was their own daughters’ private details.

    Back to your physical analogy. In underground mining (something I have knowledge of) they introduced thousands of processes in the 80’s to stop the number of deaths. Many safety plans, programs, courses. Nothing changed. In the early 90’s, they made the company bosses personally responsible with criminal charges if they failed. The bosses were 1 to 1 with every worker ensuring they took no risks and training for youngsters was now priority one etc etc. The reduction in deaths was phenomenal.

    So we both think it needs to be made important, but in my view that takes a very aggressive change of responsibility on the management. Not a few security tweaks, and not a few fines of 2-3% of turnover for slippage. (I imagine Steve Jobs thought “ah, small mistake – still, bit of free extra PR”. Sony probably not).

    I don’t expect a response to this – thanks for the last one – I am just a user and only recently realised how much people, like yourself, are doing to protect us who don’t understand or think about the details – sure as hell big business and the politicians don’t have our interest at heart. (If you are on twitter – I can’t find it).

  6. You’re right, increased liability is an important component of the equation. Also important is making sure users have rights to their data so they can demand that certain data flows be stopped. And somehow, in that whole process, we have to realize that there are very good use cases for services storing data about us, and sometimes even trading it with third parties. We can’t completely kill that. I don’t know how we find the balance, but I do know I want the user to have way more control.

  7. Pingback: Konstantinos Stylianou on technological determinism and privacy

  8. Pingback: Europe’s ‘Right to Be Forgotten’: Privacy as Internet Censorship

  9. Pingback: Copyright, Privacy, Property Rights & Information Control: Common Themes, Common Challenges
