I’ve always thought that the JSON hack was a truly weird happenstance. For those who don’t quite know it, it goes something like this. A web page you download can run limited code inside your browser. For example, it can animate certain transitions when you click, it can sum up the price of your 3 movie tickets without checking the server, and it can request more data from the server it came from. Importantly, it cannot read data from other servers, because that would open up a slew of security issues (I won’t get into them now).
However, there are a few exceptions. You can load an image from a different server, but that’s considered okay because the browser renders the image directly to the user, and the HTML page that loaded the image doesn’t get access to the image data itself. For example, theoretically, I could write a web page that includes an image from your local machine (more or less), if I know exactly where it’s stored, but I won’t be able to send that image back to my server. It’s a bit like walking a tightrope, isn’t it? Sounds really close to evil, but you can’t quite figure out anything really wrong with it.
There’s another exception: JavaScript code. A page loaded from one server can load JavaScript code (the stuff that does animations, etc.) from another server, but it only gets to execute that code, not actually see its source text. This is another one of those tightrope walks: if a server is distributing files as JavaScript, then we’re assuming it’s saying “okay, this can be run by a third-party web page.” This is a very cool feature for web applications, because they can now “mash up” data from different sources right in your browser, as long as those sources are distributing their data as JavaScript code. In fact, the JavaScript “code” is often just a data structure, i.e. an array of dictionaries of arrays, and so on. In that case, it’s called JSON (JavaScript Object Notation).
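To make the “code that is just a data structure” idea concrete, here is a minimal sketch that runs under Node.js. The response body and its field names are made up for illustration, and Node’s `vm` module stands in for a browser page that executes a script loaded from another server.

```javascript
const vm = require("node:vm");

// Hypothetical response body from a third-party server: an array of
// dictionaries of arrays, written as plain JSON text.
const responseBody =
  '[{"title": "Movie A", "showtimes": ["18:00", "20:30"]},' +
  ' {"title": "Movie B", "showtimes": ["19:15"]}]';

// Route 1: treat the text as data and parse it.
const asData = JSON.parse(responseBody);

// Route 2: treat the very same text as JavaScript and execute it,
// which is roughly what happens when a page loads it as a script.
const asCode = vm.runInNewContext(responseBody);

console.log(asData[0].showtimes[1]);
console.log(asCode[1].title);
```

Both routes produce the same structure. The difference is that the second one runs as code inside whatever JavaScript environment the loading page has set up.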
So Joe Walker makes an interesting point: maybe JSON isn’t all that safe, because the JavaScript code you’re running depends on the JavaScript API, which may have been maliciously edited by the page that loads the code. In particular, Joe mentions that, if you redefine the Array constructor, then you can hijack the construction of the JSON data structure. Indeed! This is very interesting, and, even though it’s been shown before, it’s good to point this out again.
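Here is a rough sketch of that idea, runnable under Node.js, with the `vm` module standing in for a hostile page. It is not Joe’s actual exploit. Current engines build `[...]` literals without consulting a redefined `Array`, so this sketch assumes a hypothetical response that calls `new Array(...)` explicitly, which shows the same dependence on the page’s environment. The function name `runInHostilePage` and the email addresses are invented for the example.

```javascript
const vm = require("node:vm");

// Hypothetical response body from a server that trusts its caller.
// (Assumption: it calls the Array constructor explicitly; see above.)
const responseBody = 'new Array("alice@example.com", "bob@example.com");';

// Simulates a third-party page that tampers with the environment
// before loading the remote script.
function runInHostilePage(scriptText) {
  const stolen = [];
  const context = vm.createContext({ stolen });

  // The hostile page redefines Array so that every construction
  // leaks its arguments into a list the page controls.
  vm.runInContext(
    "Array = function (...items) {" +
      "  for (const item of items) stolen.push(item);" +
      "};",
    context
  );

  // The remote script now runs against the tampered environment.
  vm.runInContext(scriptText, context);
  return stolen;
}

console.log(runInHostilePage(responseBody));
```

The server’s script never handed the data to the page on purpose. The page obtained it by controlling the environment the script ran in.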
But it’s also important not to go overboard. I don’t agree with Joe’s final recommendation, that JSON should only be used for public data. He’s worried about the use of this kind of attack in combination with Cross-Site Request Forgery, which is a bit of an odd description since JSON is meant specifically for cross-site requests.
No, I think the point is this: if you happen to output anything from a web server that is valid JavaScript, then you had better assume that it can be run in a maliciously modified JavaScript environment by a third party. So, if you want to prevent cross-site requests, you need the typical CSRF protection: a URL parameter that matches a cookie value. In addition, if you want to allow cross-site private data requests, you should probably return only the JSON data structure that you expect the cross-site requester to get in the end anyway. That way, overriding the Array constructor doesn’t really help the attacker much.
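As a minimal sketch of that “URL parameter matches a cookie value” check, assuming Node.js: the parameter and cookie name `csrf_token` and the function names are placeholders I picked, not part of any standard. A third-party page can cause the request to be sent, cookie included, but it cannot read the cookie, so it cannot put the matching value into the URL.

```javascript
const crypto = require("node:crypto");

// Parse a raw Cookie header ("a=1; b=2") into a Map of name -> value.
function parseCookies(cookieHeader) {
  const cookies = new Map();
  for (const part of (cookieHeader || "").split(";")) {
    const index = part.indexOf("=");
    if (index === -1) continue;
    cookies.set(part.slice(0, index).trim(), part.slice(index + 1).trim());
  }
  return cookies;
}

// Compare two tokens without leaking where they differ through timing.
function tokensMatch(a, b) {
  const bufA = Buffer.from(a, "utf8");
  const bufB = Buffer.from(b, "utf8");
  return bufA.length === bufB.length && crypto.timingSafeEqual(bufA, bufB);
}

// Returns true only if the URL carries a token equal to the cookie's token.
// "csrf_token" is a hypothetical name used for both the parameter and cookie.
function isRequestAllowed(requestUrl, cookieHeader) {
  // The base URL only exists so that relative paths can be parsed.
  const url = new URL(requestUrl, "https://example.invalid");
  const urlToken = url.searchParams.get("csrf_token");
  const cookieToken = parseCookies(cookieHeader).get("csrf_token");
  if (!urlToken || !cookieToken) return false;
  return tokensMatch(urlToken, cookieToken);
}

console.log(isRequestAllowed("/data.json?csrf_token=abc123", "session=xyz; csrf_token=abc123"));
console.log(isRequestAllowed("/data.json", "session=xyz; csrf_token=abc123"));
```

A server would run a check like this before emitting any private response that happens to be valid JavaScript, and refuse the request when it fails.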
So JSON is weird, but not for quite the reason that Joe mentions, I think. JSON is weird because it puts the onus on web server maintainers to make sure that they’re not accidentally outputting legal JavaScript when they don’t mean to enable cross-site requests. That means anyone putting together a web application has to be aware of this weird loophole in the same-origin policy, which is not great. We should probably move to the model Doug Crockford recommends: JavaScript/JSON should be delivered with a special MIME type if it’s to be cross-site evaluated.
That said, private-data JSON is still safe, if used with full awareness of how JSON is actually evaluated.