security is hard, let’s improve the conversation

A few days ago, a number of folks were up in arms over the fact that you can see saved passwords in your Google Chrome settings. Separately, a few folks got really upset about how Firefox no longer provides a user interface for disabling JavaScript. These flare-ups make me sad, because the conversations are often deeply disrespectful, with a tone implying that there was obvious negligence or stupidity involved. There’s too little subtlety in the discussion, not enough respectful exchange.

Security is hard. I don’t mean that you have to work really hard to do the right thing, I mean that “the right thing” is far from obvious. What are you defending against? Does your solution provide increased security in a real-world setting, not just in theory? Have you factored in usability? Is it security theater? And is security theater necessarily a bad thing?

These are subtle discussions. Let’s discuss openly and respectfully. Let’s ask questions, understand threat model differences, and contribute to improving security for real. In particular, let’s take into account typical user behavior, which can easily negate the very best security in favor of convenience.

Let’s talk examples.

writing your passwords down

Recently, I had to create a brand new complicated password. I pulled out a sheet of paper, thought of a password, wrote it down, and put the piece of paper in my wallet. Someone said to me “did you just write that password down?” I said yes. The snarky response came back: “you should never write passwords down.” Maybe you’ve said this yourself, to a relative, friend, or co-worker?

Except it’s not that simple. Bruce Schneier recommends writing down your passwords so you’re not tempted to use one that’s too simple in order to remember it. Oftentimes, you should be more worried about the remote network attacker than people who have physical access to your machine.

But don’t feel bad about it. You’re not stupid for telling your poor aging parents to pick long impossible-to-remember passwords and then never write them down. That’s what many experts said for years. This stuff is hard. It’s worth discussing, exploring, and finding the appropriate balance of security and convenience for the application at hand. The answer won’t be the same for everyone and everything.

Google Chrome passwords

Yes, it’s true, you can, in a few seconds, view in cleartext all the passwords saved within a Google Chrome browser. But did you know you can do it in Firefox and Safari, too? With just about the same number of clicks? Are you having second thoughts about your immediate gut reaction of pure disgust at Chrome’s apparent sloppiness?

There are good reasons why you might legitimately want to read your passwords out of your browser. Most of the time, if you give your computer to someone you don’t trust, you’re kind of screwed anyway. But it’s subtle. It’s not quite the same thing to have access to your computer for a few minutes and to actually have your password. In the first case, someone can mess with your Facebook profile for a few minutes. In the second, they can get your password and log in as you on a different machine, wreaking havoc on your life for an extended period of time. So maybe it’s worth a discussion, maybe you can’t just play the security-reductionism card. Maybe the UI to view your passwords shouldn’t exist.

Would that then be security theater, since, as Adrienne Felt points out, you can install an extension that opens up a bunch of tabs and lets the password manager auto-fill them all, then steals the actual passwords? Maybe. It’s worth a discussion. In fact I like the discussion Adrienne, Joe, and I are having: it’s respectful and balanced, though limited by Twitter.

Is this fixed by Firefox’s Master Password? Sort of, if you believe that addressing the problem for a tiny percentage of the population is a “solution,” and if you assume those users will know to quit their browser every time they leave their computer unattended. Still, it’s worth pointing out the Master Password solution and evaluating its real-world efficacy.

Disabling JavaScript in Firefox

As of version 23, Firefox has removed the user interface that lets a user turn off JavaScript, and some folks call that lame. Why is Firefox removing user choice?

OK, so let’s consider the average Web user. Do they know what “disabling JavaScript” does? If they do, is it much harder for them to use an add-on like NoScript? If they don’t, what is the benefit of offering that option, knowing that too many options is always a bad thing? Some people believe JavaScript is so integral to the modern Web that disabling it is as sensible as disabling images, iframes, or the audio tag. Others believe the Web should always gracefully degrade and be fully functional without JavaScript.

This is a very reasonable discussion to have. The answer isn’t obvious. My opinion is that JavaScript is part of the modern Web, giving users a blunt “disable JavaScript” button is practically useless, and add-ons are a fine path if you want to surf the Web with one hand tied behind your back. I have no beef with anyone who disagrees with me. I do have a beef with people who call this decision obviously stupid and see only downsides.

The Web is not that simple. Security is not that simple. And people, most importantly, are not that simple.

Let’s build a better way to discuss security. Never disrespectful, always curious. That’s how we improve security for everyone.

Identity Systems: white labeling is a no-go

There’s a new blog post with some criticism of Mozilla Persona, the easy and secure web login solution that my team works on. The great thing about working in the open at Mozilla is that we get this kind of criticism openly, and we respond to it openly, too.

The author’s central complaint is that the Persona brand is visible to the user:

It [Persona] needs white-labeling. I know that branding drives adoption, but showing the Persona name on the login box at all is too much; it needs to be transparent for the user. Most of the visits to any website are first-time visits, which means the user is seeing your site/brand for the first time. Introducing another brand at the sign-up point is a confusing distraction to the user.

The author compares Persona to Stripe, the payment company with a super-easy-to-use JavaScript API, which lets a web site display a payment form with no trace of the Stripe brand, and all the hard credit-card processing work is left to the Stripe service.

This is an interesting point, but unfortunately it is wrong for an Identity solution. Consider if Persona were fully white-labeled, integrated into the web site’s own pages, with no trace of the Persona system visible to the user. What happens then? Two possibilities:

  1. no user state is shared between sites: users create a new account on every site that uses Persona. The site doesn’t have to do the hard work of password storage; it can let Persona handle this. There’s no benefit to the user: every web site looks independent from the others, with its own account and password. And while this is incrementally better than having web sites store passwords themselves, that increment is quite small: web sites tend to adopt federated authentication solutions because they lower the friction of signing up. If users still have to create accounts everywhere, friction stays high, and the benefit to the web site is small.
  2. user state is shared between sites: users don’t have to create new accounts at every web site, they can use their existing single Persona account, but now they have no branding whatsoever to indicate this. So, are users supposed to type in the same Persona password on every site they see? Are they supposed to feel good about seeing their list of identities embedded within a brand new site they’ve never seen before, with no indication of why that data is already there? This is a recipe for disastrous phishing and a deeply jarring user experience.

So what about Stripe? With Stripe, the user retypes their credit-card number at every web site they visit. That makes sense because the hard part of payment processing for web sites isn’t so much the prompting for a credit card, it’s the actual payment processing in the backend. And, frankly, it would be quite jarring if you saw your credit card number just show up on a brand new web site you’ve never visited before.

But identity is different. The hard part is not the backend processing, it’s getting the user to sign up in the first place, and for that you really want the user to not have to create yet another account. Plus, if you’re going to surface the user’s identity across sites, then you *have* to give them an indication of the system that’s helping them do that so they know what password to type in and why their data is already there. And that’s Persona. Built to provide clear benefits to users and sites.

By the way, though we need some consistent Persona branding to make a successful user experience, we don’t need the Persona brand to be overbearing. Already, with Persona, web sites can add a prominent logo of their choosing to the Persona login screen. And we’re working on new approaches that would give sites even more control over the branding, while giving users just the hint they need to understand that this is the same login system they trust everywhere else. Check it out.

connect on your terms

I want to talk about what we, the Identity Team at Mozilla, are working on.

Mozilla makes Firefox, the 2nd most popular browser in the world, and the only major browser built by a non-profit. Mozilla’s mission is to build a better Web that answers to no one but you, the user. It’s hard to overstate how important this is in 2012, when the Web answers less and less to individual users, more and more to powerful data silos whose interests are not always aligned with those of users.

To fulfill the Mozilla mission, the browser remains critical, but is no longer enough. Think of the Web’s hardware and software stack. The browser sits in the middle [1], hardware and operating system below it, cloud services above it. And the browser is getting squeezed: mobile devices, which outnumber desktop computers and are poised to dominate within a couple of years, run operating systems that limit, through technical means or bundling deals, which browser you can use and how you can customize their behavior. Meanwhile, browsers are fast becoming passive funnels of user data into cloud services that offer too little user control and too much lock-in.

Mozilla is moving quickly to address the first issue with Boot2Gecko, a free, open, and Web-based mobile operating system due to launch next year. This is an incredibly important project that aims to establish true user choice in the mobile stack and to power-charge the Open Web by giving HTML5 Apps new capabilities, including camera access, dialing, etc.

The Mozilla Identity Team is working on the top of the stack: we want users to control their transactions with cloud services, whether those transactions involve money or data. We want you to connect to the Web on your terms. To do that, we’re building services and corresponding browser features.

We’re starting with Persona, our simple distributed login system, which you can integrate into your web site in a couple of hours — a good bit more easily than our competitors. Persona is unique because it deeply respects users: the only data exchanged is what users wish to provide. For example, when you use Persona to sign into web sites, there is no central authority that learns about all of your activity.

From Persona, we’ll move to services connected to your identity. We’ll help you manage your data, connect the services that matter to you, all under your full control. We want to take user agency, a role typically reserved for the browser sitting on your device, into the cloud. And because we are Mozilla, and all of our code and protocols are open, you know the services we build will always be on your side.

All that said, we know that users pick products based on quality features, not grand visions. Our vision is our compass, but we work on products that fulfill specific user and developer needs today. We will work towards our vision one compelling and pragmatic product at a time.

The lines between client, server, operating system, browser, and apps are blurring. The Web, far more than a set of technologies, is now a rapidly evolving ecosystem of connections between people and services. The Mozilla Identity Team wants to make sure you, the user, are truly in control of your connections. We want to help you connect on your terms. Follow us, join us.


[1] David Ascher spoke about this in his post about the new Mozilla a few months ago.

encryption is (mostly) not magic

A few months ago, Sony’s PlayStation Network got hacked. Millions of accounts were breached, leaking physical addresses and passwords. Sony admitted that their data was “not encrypted.”

Around the same time, researchers discovered that Dropbox stores user files “unencrypted.” Dozens (hundreds?) of users closed their accounts in protest. They’re my confidential files, they cried, why couldn’t you at least encrypt them?

Many, including some quite tech-savvy folks, were quick to indicate that it would have been so easy to encrypt the data. Not encrypting the data proved Sony and Dropbox’s incompetence, they said.

In my opinion, it’s not quite that simple.

Encryption is easy, it’s true. You can download code that implements military-grade encryption in any programming language in a matter of seconds. So why can’t companies just encrypt the data they host and protect us from hackers?

The core problem is that, to be consumable by human users, data has to be decrypted. So the decryption key has to live somewhere between the data-store and the user’s eyeballs. For security purposes, you’d like the decryption key to be very far from the data-store and very close to the user’s eyeballs. Heck, you’d like the decryption key to be *inside* the user’s brain. That’s not (yet) possible. And, in fact, in most cases, it isn’t even practical to have the key all that far from the data-store.

encryption relocates the problem

Sony needs to be able to charge your credit card, which requires your billing address. They probably need to do that whether or not you’re online, since you’re not likely to appreciate being involved in your monthly renewal, each and every month. So, even if they encrypt your credit card number and address, they also need to store the decryption key somewhere on their servers. And since they probably want to serve you an “update your account” page with address pre-filled, that decryption key has to be available to decrypt the data as soon as you click “update my account.” So, if Sony’s web servers need to be able to decrypt your data, and hackers break into Sony’s servers, there’s only so much protection encryption provides.

Meanwhile, Dropbox wants to give you access to your files everywhere. Maybe they could keep your files encrypted on their servers, with encryption keys stored only on your desktop machine? Yes… until you want to access your files over the Web using a friend’s computer. And what if you want to share a file with a friend while they’re not online? Somehow you have to send them the decryption key. Dropbox must now ask its users to manage the sharing of these decryption keys (good luck explaining that to them), or must hold on to the decryption key and manage who gets the key…. which means storing the decryption keys on their servers. If you walk down the usability path far enough – in fact not all that far – it becomes clear that Dropbox probably needs to store the decryption key not too far from the encrypted files themselves. Encryption can’t protect you once you actually mean to decrypt the data.

The features users need often dictate where the decryption key is stored. The more useful the product, the closer the decryption key has to be to the encrypted data. Don’t think of encryption as a magic shield that miraculously distinguishes between good and bad guys. Instead, think of encryption as a mechanism for shrinking the size of the secret (one small encryption key can secure gigabytes of data), thus allowing the easy relocation of the secret to another location. That’s still quite useful, but it’s not nearly as magical as many imply it to be.
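
To make the “shrinking the secret” point concrete, here is a minimal sketch in Python. It assumes the third-party cryptography package (my choice for illustration, not something the services above necessarily use): one small key protects an arbitrarily large blob, so the question becomes where that key lives, not where the data lives.

    from cryptography.fernet import Fernet

    # The small secret: a single 32-byte key (base64-encoded by Fernet).
    key = Fernet.generate_key()

    # The big secret it protects: imagine gigabytes of user records.
    big_blob = b"name, address, credit card, ..." * 100000
    ciphertext = Fernet(key).encrypt(big_blob)   # safe to store anywhere

    # But to render an "update your account" page, the server needs the key,
    # so in practice the key ends up living near the data it protects.
    plaintext = Fernet(key).decrypt(ciphertext)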

what about Firefox Sync, Apple Time Machine, SpiderOak, Helios, etc.

But but but, you might be thinking, there are systems that store encrypted data and don’t store the decryption key. Firefox Sync. Apple’s Time Machine backup system. The SpiderOak online backup system. Heck, even my own Helios Voting System encrypts user votes in the browser with no decryption keys stored anywhere except the trustees’ own machines.

It’s true, in some very specific cases, you can build systems where the decryption key is stored only on a user’s desktop machine. Sometimes, you can even build a system where the key is stored nowhere durably; instead it is derived on the fly from the user’s password, used to encrypt/decrypt, then forgotten.
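
Here is a hedged sketch of that second pattern, again assuming the Python cryptography package and purely illustrative parameters: the key is derived from the user’s password when needed, used, and never stored.

    import base64, os
    from cryptography.fernet import Fernet
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

    def key_from_password(password: bytes, salt: bytes) -> bytes:
        # Derive a 32-byte key from the password; the salt is stored, the key is not.
        kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                         salt=salt, iterations=200000)
        return base64.urlsafe_b64encode(kdf.derive(password))

    salt = os.urandom(16)                      # not secret; stored next to the ciphertext
    key = key_from_password(b"the user's passphrase", salt)
    ciphertext = Fernet(key).encrypt(b"bookmarks, history, form data")
    del key                                    # nothing durable remains but salt + ciphertext
    # Forget the passphrase and the data is unrecoverable -- which is exactly the point.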

But all of these systems have significant usability downsides (yes, even my voting system). If you only have one machine connected to Firefox Sync, and you lose it, you cannot get your bookmarks and web history back. If you forget your Time Machine or SpiderOak password, and your main hard drive crashes, you cannot recover your data from backup. If you lose your Helios Voting decryption key, you cannot tally your election.

And when I say “you cannot get your data back,” I mean you would need a mathematical breakthrough of significant proportions to get your data back. It’s not happening. Your data is lost. Keep in mind: that’s the whole point of not storing the decryption key. It’s not a bug, it’s a feature.

and then there’s sharing

I alluded to this issue in the Dropbox description above: what happens when users want to share data with others? If the servers don’t have the decryption key, that means users have to pass the decryption key to one another. Maybe you’re thinking you can use public-key encryption, where each user has a keypair, publishes the public encryption key, and keeps secret the private decryption key? Now we’re back to “you can’t get your data back” if the user loses their private key.
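
For concreteness, here is roughly what that keypair-based sharing looks like, sketched with the Python cryptography package (the key size and padding are just illustrative). The catch is the last line: only the friend’s private key, sitting on the friend’s machine, can unwrap the shared secret, and no server can recover it for them if that key is lost.

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    # The friend generates a keypair locally and publishes only the public half.
    friend_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    friend_public = friend_private.public_key()

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    # To share a file, encrypt its per-file key to the friend's public key.
    file_key = b"0123456789abcdef0123456789abcdef"
    wrapped = friend_public.encrypt(file_key, oaep)    # safe to relay through any server

    # Only the friend's private key can unwrap it.
    assert friend_private.decrypt(wrapped, oaep) == file_key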

And what about features like Facebook’s newsfeed, where servers process, massage, aggregate, and filter data for users before they even see it? If the server can’t decrypt the data, then how can it help you process the data before you see it?

To be clear: if your web site has social features, it’s very unlikely you can successfully push the decryption keys down to the user. You’re going to need to read the data on your servers. And if your servers need to read the data, then a hacker who breaks into the servers can read the data, too.

so the cryptographer is telling me that encryption is useless?

No, far from it. I’m only saying that encryption with end-user-controlled keys has far fewer applications than most people think. Those applications need to be well-scoped, and they have to be accompanied by big bad disclaimers about what happens when you lose your key.

That said, encryption as a means of partitioning power and access on the server side remains a very powerful tool. If you have to store credit card numbers, it’s best to build a subsystem whose entire role is to store credit-card numbers encrypted and to process transaction requests from other parts of your system. If your entire system is compromised, then you’re no better off than if you hadn’t taken those precautions. But, if only part of your system is compromised, encryption may well stop an attacker from gaining access to the most sensitive parts of the system.
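
Here is a toy sketch of that partitioning idea in Python, with hypothetical names and the cryptography package standing in for real key management: the vault is the only component that ever touches the key, and the rest of the application stores and passes around opaque tokens, so a compromise of the web tier alone doesn’t expose card numbers.

    import uuid
    from cryptography.fernet import Fernet   # assumed third-party package

    class CardVault:
        """Runs as its own isolated service; only this subsystem ever sees the key."""
        def __init__(self):
            self._key = Fernet(Fernet.generate_key())   # ideally loaded from an HSM or KMS
            self._store = {}                            # token -> encrypted card number

        def tokenize(self, card_number: str) -> str:
            token = str(uuid.uuid4())
            self._store[token] = self._key.encrypt(card_number.encode())
            return token

        def charge(self, token: str, amount_cents: int) -> None:
            card_number = self._key.decrypt(self._store[token]).decode()
            # ...call the payment processor here; the number never leaves the vault.

    # The web tier keeps only the token next to the user record.
    vault = CardVault()
    user = {"name": "Alice", "card_token": vault.tokenize("4111111111111111")}
    vault.charge(user["card_token"], 999)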

You can take this encryption-as-access-control idea very far. An MIT team just published CryptDB, a modified relational database that uses interesting encryption techniques to strongly enforce access control. Note that, if you have the password to log into the database, this encryption isn’t going to hide the data from you: the decryption key is on the server. Still, it’s a very good defense-in-depth approach.

what about this fully homomorphic encryption thing?

OK, so I lied a little bit when I talked about pre-processing data. There is a kind of encryption, called homomorphic encryption, that lets you perform operations on data while it remains encrypted. The last few years have seen epic progress in this field, and it’s quite exciting…. for a cryptographer. These techniques remain extremely impractical for most use cases today, with an overhead factor in the trillions, both for storage and computation time. And, even when they do become more practical, the central decryption key problem remains: forcing users to manage decryption keys is, for the most part, a usability nightmare.

That said, I must admit: homomorphic encryption is actually almost like magic.

the special case of passwords

Passwords are special because, once stored, you never need to read them back out, you only need to check if a password typed by a user matches the one stored on the server. That’s very different than a credit-card number, which does need to be read after it’s stored so the card can be charged every month. So for passwords, we have special techniques. It’s not encryption, because encryption is reversible, and the whole point is that we’d like the system to strongly disallow extraction of user passwords from the data-store. The special tool is a one-way function, such as bcrypt. Take the password, process it using the one-way function, and store only the output. The one-way function is built to be difficult to reverse: you have to try a password to see if it matches. That’s pretty cool stuff, but really it only applies to passwords.

So, if you’re storing passwords, you should absolutely be passing them through a one-way function. You could say you’re “hashing” them, that’s close enough. In fact you probably want to say you’re salting and hashing them. But whatever you do, you’re not “encrypting” your passwords. That’s just silly.
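
A minimal sketch of that salt-and-hash pattern, assuming the Python bcrypt package: you store only the output of the one-way function, and at login you run the function again and compare.

    import bcrypt   # assumed third-party package

    # At signup: hash with a fresh random salt and store only the result.
    stored = bcrypt.hashpw(b"correct horse battery staple", bcrypt.gensalt())

    # At login: re-run the one-way function against the stored value.
    assert bcrypt.checkpw(b"correct horse battery staple", stored)
    assert not bcrypt.checkpw(b"password123", stored)

    # There is no decrypt step anywhere: the stored value can only be tested, not reversed.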

encryption is not a magic bullet

For the most part, encryption isn’t magic. Encryption lets you manage secrets more securely, but if users are involved in the key management, that almost certainly comes at the expense of usability and features. Web services should strongly consider encryption where possible to more strictly manage their internal access controls. But think carefully before embarking on a design that forces users to manage their keys. In many cases, users simply don’t understand that losing the key means losing the data. As my colleague Umesh Shankar says, if you design a car lock so secure that locking yourself out means crushing the car and buying a new one, you’re probably doing it wrong.

Online Voting is Terrifying and Inevitable

Voting online for public office is a terrifying proposition to most security experts. The paths to subversion or failure are many:

  1. the server could get overwhelmed by attackers, preventing voting altogether
  2. the server could get hacked and the votes changed surreptitiously
  3. the users’ machines could get compromised by a virus, which would then flip votes as it chooses with little or no trace
  4. even if somehow we secure the entire digital channel, there’s still the issue of your spouse looking over your shoulder, strongly suggesting you vote a certain way

So, terrifying. And yet, I’m now pretty sure it is inevitable.

What human activity isn’t on the Internet?

Today, we bank online, deposit checks and even pay vendors with our smart phones. We can change our mailing address with the postal service and pay parking tickets with our local governments online. We can shop online, socialize online, and debate with our Presidential candidates online. Newt Gingrich announced his Presidential campaign on Twitter.

Just about everyone now carries an Internet-connected personal device. The Internet is everywhere you want it, and just about everywhere you don’t. People are starting to experience the world through augmented reality, using online maps and satellite overlays matched with their current location. The Internet is only going to become more omnipresent, faster. Within a few years, it will be hard to imagine any human activity that doesn’t involve the Internet.

And yet, somehow, we expect people to still be voting in person, on paper? We can’t even get users to take SSL certificate warnings seriously, but we’re going to convince them that voting is so special it has to be done in person? I don’t think so.

Don’t grab your pitchfork yet

I’m not arguing that this is how it should be. I’m definitely not saying that we can secure online voting just like we can secure online banking. In fact I’ve made many of the original arguments, in my dissertation and on this blog, shooting down the bogus arguments that go something like “hey, we can secure online banking, surely we can secure online voting!” No, we don’t know how to do that.

What I’m saying is that, regardless of the state of online voting security, I think it’s a losing battle to expect voting to remain the only activity we still do in person and on paper. With the Oscars moving to online voting, the Federal Voting Assistance Program making $15M available in grants for activities related to online voting (even if it supposedly doesn’t involve online vote casting), parts of Canada moving to online voting, France considering online voting for its 2M+ expats (more than the margin of victory in the last Presidential election), what you’re hearing is the sound of inevitability.

Enforced Privacy is Dead

There’s another interesting issue, when you think about problem (4): even if we keep voting on paper in person, voting requires enforced privacy: we have to make sure it’s just you in the voting booth, not you plus a coercer. That’s great. Now, how many ballots do you think we’re going to see next year published on Instagram?

We have a deeper problem here due to the now omnipresent Internet. Voluntary privacy is not dead, since users can choose to isolate themselves. But enforced privacy, privacy imposed on the voter, the kind needed to prevent coercion, that’s quite dead. I’m very concerned about what that means for democracy. But again, this is inevitable.

Doing the Best We Can

So, if it’s inevitable, maybe the best we can do is make online voting as secure as possible. We’ll probably have a few disasters, maybe even a few thrown elections. So we’d better start now on the problems we have.

I think we can solve Problem (2) with open-audit, end-to-end voting systems like Helios (but not only Helios, there are others.) I think we can minimize the risk of Problem (1) by moving to a longer voting period (1 week instead of 1 day). I suspect we have to eventually give up on some aspects of (4), whether or not we do online voting, though some technical tricks might make voter coercion a good bit more difficult (it’s never completely impossible). The hardest problem is (3): we have no way of ensuring that people are using trustworthy software that captures their intent properly.

Again, I’m not endorsing online voting for public office. I’m saying it’s inevitable, and it’s time to face that inevitability.

Importance of the User Agent and why I joined Mozilla

This issue of trustworthy user software is a much larger problem than voting. As human activity increasingly moves online, the central question is: what software is truly on the side of the user? How does the user know for sure that the software they’re using is their true agent? There’s only one piece of Internet architecture today that can be the user’s true agent, and that’s the Web browser (which technologists call the User Agent, unsurprisingly.) And, among the web browsers, there’s one that particularly stands out as the ultimate user agent, backed by a company whose mission is focused on the user and only the user.

That’s why I joined Mozilla. Because for voting and beyond, everything people do is online or soon to be online, and users better have an agent on their side. The best agent users can get today is Firefox, and I hope to contribute to making it an even better user agent in the next few years.

[It’s worth noting that Mozilla has no intention of getting into the voting business, that’s just my personal interest.]

OK, you may now get out your pitchfork.

(your) information wants to be free

A couple of weeks ago, Epsilon, an email marketing firm, was breached. If you are a customer of Tivo, Best Buy, Target, The College Board, Walgreens, etc., that means your name and email address were accessed by some attacker. You probably received a warning to watch out for phishing attacks (assuming it wasn’t caught in your spam filter).

Yesterday, the Sony PlayStation Network, with its 75 million gamers, was compromised. Names, addresses, and possibly credit cards were accessed by attackers. This may well be the largest data breach in history.

And a few days ago, it was discovered that iPhones keep track of your location over extended periods of time and copy that data to backups, even if you explicitly tell your iPhone not to track your location. There are believable claims that law enforcement has already used this information without a court order. Apple now says this was a bug and they’re fixing it.

In 1984, Stewart Brand famously said that information wants to be free. John Perry Barlow reiterated it in the early 90s, and added “Information Replicates into the Cracks of Possibility.” When this idea was applied to online music sharing, it was cool in a “fight the man!” kind of way. Unfortunately, information replication doesn’t discriminate: your personal data, credit cards and medical problems alike, also want to be free. Keeping it secret is really, really hard.

I get the sense that many think Epsilon and Sony were stupidly incompetent, and Apple was evil. This fails to capture the nature of digital data. It’s just incredibly hard to secure data when one failure outweighs thousands of successes. In the normal course of development, data gets copied all over the place. It takes a concerted effort to enumerate the places where data ends up, to design defensively against data leakage, and to audit the code after the fact to ensure no mistakes were made. One mistake negates all successes.

Here’s one way to get an intuitive feel for it: when building a skyscraper, workers are constantly fighting gravity. One moment of inattention, and a steel beam can fall from the 50th floor, turning a small oversight into a tragedy. The same goes for software systems and data breaches. The natural state of data is to be copied, logged, transmitted, stored, and stored again. It takes constant fighting and vigilance to prevent that breach. It takes privacy and security engineering.

The kicker is that, while you’re unlikely to get into the business of building skyscrapers by accident, it’s incredibly easy to find yourself storing user data long before you’ve laid out decent privacy and security practices: Sony built game consoles, and then one day they were suddenly storing user data. It’s also far too common for great software engineers to deceive themselves into thinking that securing user data is not so hard, because hey, they would never be as stupid as those Sony engineers.

So, am I excusing Epsilon, Sony, and Apple? Not at all. But if we keep thinking that they were just stupid/evil, then we are far from understanding and fixing the problem.

I’ve just finished reading Atul Gawande’s The Checklist Manifesto, which I strongly recommend. As industries mature (flying airplanes, practicing medicine, building complex software systems,…), they must build in processes to counteract inevitable human weaknesses. There’s bound to be resistance from experienced practitioners who see the introduction of process as insulting to their craft. Programmers are, in this sense, a lot like doctors. But it’s time to stop being heroes and start being professionals. Storing user data safely is easy until it’s not.

We are constantly fighting nature to meet our stated goals: we don’t want buildings to fall, disease to kill us, or private information to leak. For a little while, it’s okay to fail catastrophically and act surprised. But eventually, these failures are no longer surprising, they’re just negligent. That time of transition for software architects is now. Every company that dabbles in user data should assign a dedicated security and privacy team whose sole responsibility is to protect user data. We will not eliminate all failures, but we can do much, much better.

intelligently designing trust

For the past week, every security expert’s been talking about Comodo-Gate. I find it fascinating: Comodo-Gate goes to the core of how we handle trust and how web architecture evolves. And in the end, this crisis provides a rare opportunity.

warning signs

Last year, Chris Soghoian and Sid Stamm published a paper, Certified Lies [PDF], which identified the very issue that is at the center of this week’s crisis. Matt Blaze provided, as usual, a fantastic explanation:

A decade ago, I observed that commercial certificate authorities protect you from anyone from whom they are unwilling to take money. That turns out to be wrong; they don’t even do that much.

A Certificate Authority is a company that your web browser trusts to tell it who is who on the Internet. When you go to https://facebook.com, a Certificate Authority is vouching that, yes, this is indeed Facebook you’re talking to directly over a secure channel.

What Chris and Sid highlighted is an interesting detail of how web browsers have chosen to handle trust: any Certificate Authority can certify any web site. That design decision was reasonable in 1994, when there were only two Certificate Authorities and the world was in a rush to secure web transactions. But it’s not so great now, where a Certificate Authority in Italy can delegate its authority to a small reseller, who can then, in turn, certify any web site, including Facebook and Gmail, using more or less the level of assurance the small reseller sees fit.
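
You can see the shape of the problem from a few lines of Python (a sketch using only the standard library; the hostname is just an example): the issuer field names whichever trusted authority happened to sign the certificate, and the client would have accepted any of the hundreds of others just the same.

    import socket, ssl

    hostname = "www.facebook.com"
    ctx = ssl.create_default_context()   # trusts the entire bundled list of CAs

    with socket.create_connection((hostname, 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()

    # Whichever CA signed this certificate, the connection succeeds;
    # nothing binds facebook.com to one particular authority.
    print(cert["issuer"])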

what happened

It looks like someone from Iran hacked into one of the small resellers three degrees of delegation away from Comodo to issue to some unknown entity (the Iranian government?) certificates for major web sites, including Google and Microsoft. This gave that entity the power to impersonate those web sites, even over secure connections indicated by your browser padlock icon. It’s important to understand that this is not Google or Microsoft’s fault. They couldn’t do anything about it, nor could they detect this kind of attack. When Comodo discovered the situation, they revoked those certificates… but that didn’t do much good because the revocation protocol does not fail safely: if your web browser can’t contact the revocation server, it assumes the certificate is valid.

a detour via Dawkins, Evolution, and the Giraffe

Richard Dawkins, the world-famous evolutionary biologist, illustrates the truly contrived effects of evolution on a giraffe. The laryngeal nerve, which runs from the brain to the larynx, takes a detour around the heart. In the giraffe, it’s a ludicrous detour: down the animal’s enormous neck, around the heart, and back up the neck again to the larynx, right near where the nerve started to begin with!

If you haven’t seen this before, you really need to spend the 4 minutes to watch it.

In Dawkins’s words:

Over millions of generations, this nerve gradually lengthened, each small step simpler than a major rewiring to a more direct route.

and we’re back

This evolution is, in my opinion, exactly what happened with certificate authorities. At first, with only two certificate authorities, it made sense to keep certificate issuance as simple as possible. With each added certificate authority, it still made no sense to revamp the whole certification process; it made more sense each time to just add a certificate authority to the list. And now we have a giraffe-scale oddity: hundreds of certificate authorities and all of their delegates can certify anyone, and it makes for a very weak system.

This isn’t, in my mind, a failure of software design. It’s just the natural course of evolution, be it biology or software systems. We can and should try to predict how certain designs will evolve, so that we can steer clear of obvious problems. But it’s very unlikely we can predict even a reasonable fraction of these odd evolutions.

the opportunity

So now that we’ve had a crisis, we have an opportunity to do something that Nature simply cannot do: we can explore radically redesigned mechanisms. We can intelligently design trust. But let’s not be surprised, in 15 years, when the wonderful design we outline today has evolved once again into something barely viable.

taking a further example from nature?

Nature deals with this problem of evolutionary dead-ends in an interesting way: there isn’t just one type of animal. There are thousands. All different, all evolving under slightly different selection pressures, all interacting with one another. Some go extinct, others take over.

Should we apply this approach to software system design? I think so. Having a rich ecosystem of different components is better than a monoculture. We shouldn’t all use the same web browser. We shouldn’t all use the same trust model. We should allow for niches of feature evolution in this grand ecosystem we call the Web, because we simply don’t know how the ecosystem will evolve. How do we design software systems and standards that way? Now that’s an interesting question…

the difference between privacy and security

Facebook today rolled out new security features, both of which are awesome: SSL everywhere, and social re-authentication. True, SSL everywhere should probably be a default, even though I continue to believe that the cost is significantly underestimated by many privacy advocates. Regardless, this announcement is great news.

The only nitpick I have, and I point it out because I think it’s significant in Facebook’s case, is that the announcement confuses privacy and security. The first paragraph mentions Data Privacy Day, then the general concept of controlling your data, then transitions to the new security features. But those are quite different.

Security is about stopping the bad guys from stealing your data. Privacy is about controlling the good guys’ handling of your data. (Ron Rivest is said to have phrased this most eloquently, but I can’t find his quotation.)

So, SSL and social re-authentication provide security because they prevent bad guys from seeing your network traffic at the coffee shop or stealing your login. That’s fantastic, but it has little to do with privacy. If Facebook wanted to celebrate Data Privacy Day specifically, they might consider giving users more control over their data on Facebook. Maybe letting users control who gets to tag them in photos (e.g., not my stalker). Or letting users indicate fields by which advertisers cannot target them (e.g., sexual orientation). Those would be privacy features.

I don’t mean to knock Facebook’s announcement: it’s great. But it’s about security, not privacy.

Crisis in the Java Community… could they have used a secret-ballot election?

There is a bit of a crisis in the Java community: the Apache Foundation just resigned its seat on the Java Executive Committee, as did two individual members, Doug Lea and Tim Peierls. From what I understand, the central issue appears to be that Oracle, the new Java “owner” since they acquired Sun Microsystems, is paying lip service to the Java Community while taking the language and, more importantly, its licensing, in the direction they prefer, which doesn’t appear to be very open-source friendly.

That said, I’m not a Java Community expert, so I won’t comment much more on this conflict, other than to say, wait a minute, what’s this from Tim Peierls’s resignation note?

Several of the other EC members expressed their own disappointment while voting Yes. I’m reasonably certain that the bulk of the Yes votes were due to contractual obligations rather than strongly-held principles.

Wait a minute, the Executive Committee votes by public ballot? They’re influenced by contractual obligations? That’s fascinating, and that’s hardly democratic! It means that, even where standards bodies are concerned, the secret ballot might be a very interesting tool.

There are arguments against the secret ballot in this case, of course: maybe the Executive Committee members are representative of the Java Community, and as such they should serve their constituents? Much like legislators, their votes should be public so the community can decide whether or not to reelect them? In that case, contractual obligations to vote a certain way should be strictly disallowed or required to be published along with the vote… To whom are these Executive Committee members accountable? To themselves as well-intentioned guides of the Java community? To the people who elected them? It’s difficult to have it both ways, since one requires a secret ballot, and the other a public ballot.

Maybe the right solution is to publish all comments, but keep the ballots secret? There’s always a chance that a truly hypocritical member would consistently vote differently than their publicly stated opinions, but I’m not sure that risk is worse than the problems the Java Community just faced with what appears to be anything but a democratic vote. In a tough spot like this one, it seems to me that Executive Committee members should be able to vote their conscience without fear of retribution.

(Oh, and if the Java community is looking for a secure voting system, I might have a suggestion.)

OK, let’s work to make SSL easier for everyone

So in the wake of the Firesheep situation, which I described yesterday, the tech world is filled with people talking past each other on one important topic: should we just switch everything over to SSL?

As I stated yesterday, I don’t think that’s going to happen anytime soon. I would love to be wrong, because certainly if we could switch to SSL for everything, the Web would be significantly more secure. I just don’t think it’s going to be that easy. But let’s explore this a bit, because I think most people agree that there would be tremendous benefits.

A number of folks are saying “SSL is too expensive.” Others are saying “Google did it, they say it’s 1% overhead, you’re lying.” The main reference people are using for that latter claim is a fascinating presentation by Adam Langley of Google entitled Overclocking SSL. The gist of it is that, using only software, Google gets the overhead of SSL to be 1% CPU and 2% network. That sounds pretty cheap. That said, I’m skeptical. I’m far from an SSL configuration expert, of course, but I don’t think Adam Langley’s presentation paints a complete picture of the situation:

  1. per-request overhead, or per-user-visit overhead? When Google says “1% of CPU and 2% of network,” do they take into account the significantly increased number of requests due to reduced browser caching? Specifically, when going over SSL, browsers tend not to cache things like JavaScript files, images, etc… So when you click from page to page, your browser re-downloads a whole bunch of additional files on each click that it would not download if the same site were visited over HTTP. The server has no control over this. So, I suspect Google is looking at each request they get, and saying the SSL portion accounts for 1% CPU and 2% network. But, they’re probably not telling us how many extra requests overall they’re getting by user visit. I suspect it’s quite a bit higher, on the order of 300-400% of the total number of requests per user visit, simply because those additional files don’t get cached. And what’s worse, those un-cached requests are typically large files, like graphics.
  2. fancy protocol tweaks. Google is doing all sorts of fancy things to reduce the complexity of the SSL negotiation. That’s awesome. But it looks like I need to upgrade to an experimental version of Apache to get all those tweaks. Also, some of the recommendations in Adam’s presentation, e.g. “don’t make SSL_write calls with small amounts of data”, are very difficult for typical web developers to address, since they usually don’t control their web pipeline that well. Finally, it looks like Google has patched OpenSSL to be more efficient. Awesome. Can we see that patch? I’m sure Google has done a fantastic job on all of these protocol, algorithmic, and implementation optimizations. But these are not within the reach of most developers, even good developers.

Now, I’m not an SSL-naysayer! I would love to see SSL deployed everywhere. I just think we need to look at the hard data regarding the overhead this will create for companies and for consumers (no caching = increased bandwidth requirements). There’s one way forward I’d love to see happen: Hey Google, how about open-sourcing all of those tweaks in one super awesome SSL proxy that we can all install blindly in front of our HTTP-only sites? This proxy should implement the latest protocol tweaks, buffer the content in appropriately sized chunks, optimize the algorithm negotiation depending on the underlying hardware, etc. Then we can all experiment with this software, see how it affects performance, and make truly informed decisions about switching to SSL everywhere.
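
To make the “proxy in front of an HTTP-only site” idea concrete, here is a toy TLS-terminating pass-through using Python’s asyncio (the certificate paths, ports, and backend address are placeholders). The real value of the proxy I’m asking for would be in the protocol, buffering, and session-caching tweaks layered on top, which this sketch doesn’t attempt.

    import asyncio, ssl

    async def pipe(reader, writer):
        # Copy bytes in one direction until the peer closes.
        try:
            while data := await reader.read(65536):
                writer.write(data)
                await writer.drain()
        finally:
            writer.close()

    async def handle(client_reader, client_writer):
        # TLS is already terminated here; forward plaintext to the HTTP-only backend.
        backend_reader, backend_writer = await asyncio.open_connection("127.0.0.1", 8080)
        await asyncio.gather(pipe(client_reader, backend_writer),
                             pipe(backend_reader, client_writer))

    async def main():
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        ctx.load_cert_chain("cert.pem", "key.pem")      # placeholder certificate files
        server = await asyncio.start_server(handle, "0.0.0.0", 443, ssl=ctx)
        async with server:
            await server.serve_forever()

    asyncio.run(main())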

As a side note, I wonder whether one reason Google switched its main search UI to AJAX is that it gets around the issue of re-downloading static files over SSL, since JavaScript and graphics stay in place while only the raw results are updated… That is certainly one useful way to keep things snappy over SSL!