On Friday, I attended Social Network Security 2009 at Stanford. This was a fantastic get-together, with some very interesting info from Facebook, Google, Yahoo, Loopt, and the research front. I have some notes, mostly from the first half of the day, at which point my laptop battery ran out. Time to upgrade to the 7-hour battery, I think.
When I walked in (late, sorry), Tao Stein from Facebook was expressing some frustration with social network privacy research as he sarcastically quipped “yawn, people get lost in the fashion. Privacy is boring.” Facebook wants to give people control. They’re concerned with finding the right compromise between fine-grained and coarse-grained control, between privacy and simplicity. I found it interesting that he didn’t touch on the potential issue of Facebook itself having all of this data. This isn’t specific to Facebook, of course; Google does the same thing: they talk about protecting you from other users, but not so much about protecting you from the site itself.
Facebook Connect, their single sign-on solution, is in use at 15,000 web sites. Wow. OK, so it’s more than SSO, it’s also about bringing your social network with you wherever you go on the web. I suspect they’re going to be increasingly successful with this.
Tao dug into the security issues they face, given that Facebook is “the most viral mechanism that humanity has ever developed.” He talked about the “efficient abuse frontier”: the number of users affected by a compromise and the cost of each compromise end up inversely proportional, because the fight between attackers and Facebook makes it so. In other words, if an attacker found a costly, widespread attack, Facebook would devote all of its strength to fighting it, so that attack doesn’t last very long.
Phishing is Facebook’s #1 problem, and it leads to scams from within your social network. Attackers start with a small number of compromised accounts, then, through Facebook-based phishing, grow their compromised network until they reach a point where they can efficiently run scams.
Facebook’s API is being used for abuse: build a simple addictive app, get a lot of installs, spam everyone, take the app down, repeat. Tao says there’s an important and tricky research problem: how to keep an open API but deal with bad apps like this. I agree, I think this is an area that needs much more work.
Facebook’s nemesis is Koobface, a polymorphic trojan that compromises Facebook accounts, infects users’ computers, and does all sorts of bad things. Koobface is actively developed: when Facebook defends against it, Koobface responds, often within minutes. Because Koobface embeds Facebook CAPTCHAs in Windows dialogs to trick users into solving them, someone in the audience suggested watermarked CAPTCHAs, so that users will notice that the CAPTCHA is a Facebook-based image and wonder what’s going on.
So, this got me thinking, and I had an idea along this line: how about animated-GIF CAPTCHAs? The animation would cycle between an instruction screen and the CAPTCHA itself. The instructions would be written graphically, in such a way that it’s difficult for a machine to tell the instructions from the CAPTCHA word. Thus, if such an animated CAPTCHA were displayed out of context by an automated attacker, the human would always see the instruction screen, which would say “if you see this CAPTCHA anywhere other than Facebook, you’re being tricked; please report this to Facebook.com/security.” I’m guessing this might be tricky to do, but it seems worth trying: attackers are using images to get around automated filters, so let’s use more advanced images to get around automated attackers.
Tao teased us academics a bit more, specifically pointing out that we often focus on one objective when Facebook has to balance many. He wondered if complex passwords were inherently bad, because users then reuse one complex password everywhere, which means one compromise = many compromises. An interesting point.
Tao wants researchers to provide: better user education, better CAPTCHAs, better phishing protection (especially around system messages), ways to handle Facebook spam that is sent outside the system, methods to enforce data lifetimes (e.g. Vanish?), and methods to improve user authentication. All good areas of research, I agree. Especially user auth.
Kevin Bingham from DoD talked briefly about how they’re trying to wrap their brains around the explosion in social network use, given how much social networks are used by troops for morale, family communication, etc. He talked about the technical threat of URL shortening services (interesting!). He also talked about “cloud services creep,” where DoD members start to use commercial cloud services which DoD then has trouble integrating and securing.
Arvind from Stanford (check out his blog, 33bits) talked about Privacy by Design, and how it’s not just about access control and technology (absolutely!). He made an interesting point about how we always expect services to go from centralized to decentralized, but that doesn’t always happen (e.g. payments, IM, web search), so let’s not necessarily expect it from social networks. He also took down the rampant idea that we can somehow “layer” privacy on top of existing social networks. Thank you for that, Arvind; I think too much less-than-useful research time has been spent in that direction.
Arvind made some really interesting points about LiveJournal, which has made privacy its central selling point, and for which users pay money. I need to take another look at LiveJournal to better understand this.
YouTube and Yahoo
Palash Nandy from YouTube and Kun Liu from Yahoo both said some interesting things, but I had a few work issues to handle before my laptop battery ran out, so I tuned out for a bit.
Sam from Loopt blew my mind. I have been ignoring location-based privacy, and I now realize what a huge oversight that was. Loopt can uniquely identify people usually from 24 hours of location data, almost always from 1 week, and always from 2 weeks. As a result, they keep only 24 hours of raw location data, instead building a probabilistic model of who you are based on past data. They have significant internal controls to prevent employees from tracking users. Yet they have enough data to detect when a bar is holding a happy hour, when a city wins a sporting event, etc.
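To see why a single day of location points is so identifying, here’s a toy sketch of my own (this is not Loopt’s actual model; the users, places, and `matching_users` helper are all made up): each user’s day is a set of (hour, place) points, and we count how many users are consistent with a few observed points.

```python
# Toy illustration of location-trace uniqueness (hypothetical data).
# Each user's day is a set of (hour, cell_id) observations.

def matching_users(traces, observed):
    """Return the users whose trace contains every observed point."""
    return [user for user, points in traces.items() if observed <= points]

traces = {
    "alice": {(8, "home-A"), (9, "cafe-1"), (12, "office-X"), (18, "gym-3")},
    "bob":   {(8, "home-B"), (9, "cafe-1"), (12, "office-X"), (18, "bar-7")},
    "carol": {(8, "home-A"), (9, "cafe-2"), (12, "office-Y"), (18, "gym-3")},
}

# One shared point (the cafe at 9am) still matches two people...
print(matching_users(traces, {(9, "cafe-1")}))                  # alice and bob
# ...but just two points already single out one user.
print(matching_users(traces, {(9, "cafe-1"), (18, "gym-3")}))   # only alice
```

With realistic data the effect is even stronger: home and work locations alone narrow most people down to one candidate, which is presumably why Loopt discards raw traces so aggressively.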
They’ve had to build significant privacy controls, including the ability to fake your location for individuals with abusive spouses. They’re beginning to think about how to open up an API to other applications… but again how to prevent those applications from abusing that data access? They want a technical solution but… well I don’t see how that’s possible without DRM-like technology. In any case, I wholeheartedly agree that there is an incredibly important problem here.
But yeah, Loopt blew my mind.
Gerome, whom I’d met once at UMass a few years ago, went into the details of how to enable social science research while respecting privacy, specifically how to apply the ideas of differential privacy to graph queries. Fun stuff.
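For readers unfamiliar with the area, here is a minimal sketch of the basic building block, the Laplace mechanism, applied to a simple graph query (this is my own illustration of the general idea, not Gerome’s specific algorithms; the graph and function names are made up):

```python
import math
import random

def noisy_edge_count(graph, epsilon):
    """Release a graph's edge count with (edge) differential privacy.

    Adding or removing one edge changes the true count by at most 1,
    so the query's sensitivity is 1 and Laplace noise of scale
    1/epsilon is enough to mask any single edge's presence.
    """
    true_count = sum(len(nbrs) for nbrs in graph.values()) // 2
    scale = 1.0 / epsilon
    # Sample Laplace(0, scale) by inverse-CDF from a uniform draw.
    u = random.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# A toy undirected graph as adjacency sets: 3 edges (a-b, a-c, c-d).
g = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "d"}, "d": {"c"}}
print(noisy_edge_count(g, epsilon=1.0))  # roughly 3, plus/minus noise
```

Smaller epsilon means more noise and stronger privacy; the hard part, which is where the research lives, is doing this for richer graph statistics (degree distributions, subgraph counts) where a single person can influence the answer much more than one edge does.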
All in all, a really great day. Thanks Elie for organizing!
UPDATE: I’ve edited this post to clarify the portion about Facebook. My initial stream-of-consciousness write-up did not properly set the context for Tao’s comments, in particular his subtle sarcasm and the fact that Tao wasn’t railing against academics so much as gently prodding us to be more productive. I greatly enjoyed Tao’s talk and his honest, productive criticism, but apparently my write-up gave a very different impression, so … now the text is more in line with my actual impression.