<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Benlog &#187; data</title>
	<atom:link href="http://benlog.com/articles/category/data/feed/" rel="self" type="application/rss+xml" />
	<link>http://benlog.com</link>
	<description>security, privacy, transparency.</description>
	<lastBuildDate>Thu, 17 May 2012 17:09:34 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>(your) information wants to be free</title>
		<link>http://benlog.com/articles/2011/04/28/your-information-wants-to-be-free/</link>
		<comments>http://benlog.com/articles/2011/04/28/your-information-wants-to-be-free/#comments</comments>
		<pubDate>Thu, 28 Apr 2011 05:46:42 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=1631</guid>
		<description><![CDATA[A couple of weeks ago, Epsilon, an email marketing firm, was breached. If you are a customer of Tivo, Best Buy, Target, The College Board, Walgreens, etc., that means your name and email address were accessed by some attacker. You &#8230; <a href="http://benlog.com/articles/2011/04/28/your-information-wants-to-be-free/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A couple of weeks ago, Epsilon, an email marketing firm, was breached. If you are a customer of Tivo, Best Buy, Target, The College Board, Walgreens, etc., that means your name and email address were accessed by some attacker. You probably received a warning to watch out for phishing attacks (assuming it wasn&#8217;t caught in your spam filter).</p>
<p>Yesterday, the Sony PlayStation Network, with its 75 million gamers, <a href="http://www.reuters.com/article/2011/04/28/us-sony-idUSTRE73R0Q320110428">was compromised</a>. Names, addresses, and possibly credit cards were accessed by attackers. This may well be the largest data breach in history.</p>
<p>And a few days ago, it was discovered that <a href="http://radar.oreilly.com/2011/04/apple-location-tracking.html">iPhones keep track of your location</a> over extended periods of time and copy that data to backups, even if you explicitly tell your iPhone not to track your location. There are believable claims that <a href="http://www.pcmag.com/article2/0,2817,2383945,00.asp">law enforcement has already used this information</a> without a court order. Apple now says this was <a href="http://www.apple.com/pr/library/2011/04/27location_qa.html">a bug and they&#8217;re fixing it</a>.</p>
<p>In 1984, Stewart Brand famously said that <a href="http://en.wikipedia.org/wiki/Information_wants_to_be_free">information wants to be free</a>. John Perry Barlow reiterated it in the early 90s, and added &#8220;<a href="http://www.wired.com/wired/archive/2.03/economy.ideas_pr.html">Information Replicates into the Cracks of Possibility</a>.&#8221; When this idea was applied to online music sharing, it was cool in a &#8220;fight the man!&#8221; kind of way. Unfortunately, information replication doesn&#8217;t discriminate: your <em>personal data</em>, credit cards and medical problems alike, also want to be free. Keeping it secret is really, really hard.</p>
<p>I get the sense that many think Epsilon and Sony were stupidly incompetent, and Apple was evil. This fails to capture the nature of digital data. It&#8217;s just incredibly hard to secure data when one failure outweighs thousands of successes. In the normal course of development, data gets copied all over the place. It takes a concerted effort to enumerate the places where data ends up, to design defensively against data leakage, and to audit the code after the fact to ensure no mistakes were made. One mistake negates all successes.</p>
<p>Here&#8217;s one way to get an intuitive feel for it: when building a skyscraper, workers are constantly fighting gravity. One moment of inattention, and a steel beam can fall from the 50th floor, turning a small oversight into a tragedy. The same goes for software systems and data breaches. The natural state of data is to be copied, logged, transmitted, stored, and stored again. It takes constant fighting and vigilance to prevent that breach. It takes <em>privacy and security engineering</em>.</p>
<p>The kicker is that, while you&#8217;re unlikely to get into the business of building skyscrapers by accident, it&#8217;s incredibly easy to find yourself storing user data long before you&#8217;ve laid out decent privacy and security practices: Sony built game consoles, and then one day they were suddenly storing user data. It&#8217;s also far too common for great software engineers to deceive themselves into thinking that securing user data is not so hard, because hey, they would never be as stupid as those Sony engineers.</p>
<p>So, am I excusing Epsilon, Sony, and Apple? Not at all. But if we keep thinking that they were just stupid/evil, then we are far from understanding and fixing the problem.</p>
<p>I&#8217;ve just finished reading <a href="http://gawande.com/the-checklist-manifesto">Atul Gawande&#8217;s The Checklist Manifesto</a>, which I strongly recommend. As industries mature (flying airplanes, practicing medicine, building complex software systems,&#8230;), they must build in processes to counteract inevitable human weaknesses. There&#8217;s bound to be resistance from experienced practitioners who see the introduction of process as insulting to their craft. Programmers are, in this sense, a lot like doctors. But it&#8217;s time to stop being heroes and start being professionals. Storing user data safely is easy until it&#8217;s not.</p>
<p>We are constantly fighting nature to meet our stated goals: we don&#8217;t want buildings to fall, disease to kill us, or private information to leak. For a little while, it&#8217;s okay to fail catastrophically and act surprised. But eventually, these failures are no longer surprising; they&#8217;re just negligent. That time of transition for software architects is now. Every company that dabbles in user data should assign a dedicated security and privacy team whose sole responsibility is to protect user data. We will not eliminate all failures, but we can do much, much better.</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2011/04/28/your-information-wants-to-be-free/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>grab the pitchforks!&#8230; again</title>
		<link>http://benlog.com/articles/2011/04/19/grab-the-pitchforks-again/</link>
		<comments>http://benlog.com/articles/2011/04/19/grab-the-pitchforks-again/#comments</comments>
		<pubDate>Tue, 19 Apr 2011 17:49:30 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[crypto]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=1618</guid>
		<description><![CDATA[I&#8217;m fascinated with how quickly people have reached for the pitchforks recently when the slightest whiff of a privacy/security violation occurs. Last week, a few interesting security tidbits came to light regarding Dropbox, the increasingly popular cloud-based file storage and &#8230; <a href="http://benlog.com/articles/2011/04/19/grab-the-pitchforks-again/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m fascinated with how quickly people have reached for the pitchforks recently when the slightest whiff of a privacy/security violation occurs.</p>
<p>Last week, a few interesting security tidbits came to light regarding <a href="http://dropbox.com">Dropbox</a>, the increasingly popular cloud-based file storage and synchronization service. There&#8217;s some interesting discussion of de-duplication techniques which might lead to oracle attacks, etc., but the most important issue is that, suddenly, everyone&#8217;s realizing that Dropbox <em>could</em>, if needed, access your files. Miguel de Icaza wonders if Dropbox is pitching <a href="http://tirania.org/blog/archive/2011/Apr-19.html">snake oil</a>.</p>
<p>Yes, Dropbox staff can, if needed, access your files. I don&#8217;t mean to harp on my fellow technologists but&#8230; this has been obvious since day 1, because Dropbox offers a web-based interface to download your files, and even with the latest HTML5 technology, you&#8217;d be very hard-pressed to do in-browser file decryption. Let&#8217;s say you still don&#8217;t buy that, you still think that Dropbox might find a way to encrypt files and decrypt them in your browser. Dropbox also offers a password recovery mechanism, which means they can fully simulate you, the user, including, of course, getting at your files.</p>
<p>In other words, unless you&#8217;re ready to lose the convenience of password resets and web-based UI, Dropbox inherently has access to your files. Just like Facebook has access to your entire account, and Google to all of your docs, spreadsheets, etc. The only question is what kinds of internal safeguards these companies have to prevent abuse by employees. Unless you&#8217;ve worked there, it&#8217;s hard to know. You could ask Dropbox to do third-party auditing, like Miguel proposes, but in my experience that provides little real security, since you have little way to know what that third party actually did as part of their auditing (was it just &#8220;logic and accuracy&#8221; testing?)</p>
<p>The other thing we could ask is for the law to finally recognize that my files stored on Dropbox are no different than my files stored on a hard drive in my basement, from a legal perspective. They&#8217;re my property. And accessing them should require the same level of judicial oversight as a warrant to my home. That&#8217;s what a group of young MIT techies (myself included) and Harvard lawyers <a href="http://groups.csail.mit.edu/mac/classes/6.805/student-papers/fall98-papers/trespass/final.html">proposed in 1998</a>.</p>
<p>But back to Dropbox. Did they do something wrong? Yes, they did. They exaggerated their security and privacy claims. <em>Just like almost every other cloud data host today</em>. I wish, instead of picking on whichever startup suddenly succeeds, we picked on the industry as a whole. Stop talking about encryption in transit and encryption at rest in the same breath, as if they were the same thing. Stop using &#8220;encryption&#8221; as a synonym for &#8220;secure.&#8221; Stop saying &#8220;military-grade security.&#8221; Start being honest about who can access what.</p>
<p>And we, technologists, should stop with the drama, and not fall prey to the inflated expectations that marketing-heavy security policies have set. The Dropbox weaknesses should have been obvious to technologists from day one. The problem is that <em>all</em> privacy policies and security statements make exaggerated claims using reassuring keywords. Let&#8217;s harp on that.</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2011/04/19/grab-the-pitchforks-again/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>The Health IT report is very good; some opinionated suggestions</title>
		<link>http://benlog.com/articles/2010/12/08/the-health-it-report-is-very-good-some-opinionated-suggestions/</link>
		<comments>http://benlog.com/articles/2010/12/08/the-health-it-report-is-very-good-some-opinionated-suggestions/#comments</comments>
		<pubDate>Wed, 08 Dec 2010 20:24:13 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[health]]></category>
		<category><![CDATA[privacy]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=1440</guid>
		<description><![CDATA[&#8220;Oy,&#8221; I thought, when I received a copy of &#8220;REPORT TO THE PRESIDENT REALIZING THE FULL POTENTIAL OF HEALTH INFORMATION TECHNOLOGY TO IMPROVE HEALTHCARE FOR AMERICANS: THE PATH FORWARD&#8221; [PDF]. I worried this would be a lot of vague, easy-to-agree-with &#8230; <a href="http://benlog.com/articles/2010/12/08/the-health-it-report-is-very-good-some-opinionated-suggestions/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>&#8220;Oy,&#8221; I thought, when I received a copy of &#8220;REPORT TO THE PRESIDENT REALIZING THE FULL POTENTIAL OF HEALTH INFORMATION TECHNOLOGY TO IMPROVE HEALTHCARE FOR AMERICANS: THE PATH FORWARD&#8221; [<a href="http://www.whitehouse.gov/sites/default/files/microsites/ostp/pcast-health-it-report.pdf">PDF</a>]. I worried this would be a lot of vague, easy-to-agree-with advice with little actionable material. I was wrong. Hats off to the team that wrote this!</p>
<h4>Problem Analysis is right on</h4>
<p>Some nuggets of the problem analysis, all from the executive summary (a quick and useful read):</p>
<blockquote><p>
First, most current health IT systems are proprietary applications that are not easily adopted into the workflow of a clinician’s day, and whose proprietary data formats are not directly exchangeable from one system to another.
</p></blockquote>
<p>Yes! Proprietary systems and data formats are the number one problem, as I&#8217;ve <a href="http://benlog.com/articles/2010/02/18/taxing-human-transactions-%E2%80%93-part-1/">complained about before</a>.</p>
<blockquote><p>
Second, most healthcare organizations that utilize electronic health records (EHRs) view them as purely internal resources, and have little incentive for investment in secondary or external uses, such as making them accessible in appropriate form to patients, to a patient’s healthcare providers at other organizations, and in de-identified or aggregated form to public health agencies and researchers.
</p></blockquote>
<p>Yes! I&#8217;ve heard someone phrase this as &#8220;there&#8217;s no billing code for sharing data with a patient or another hospital.&#8221;</p>
<blockquote><p>
Third, legitimate patient concerns about privacy and security make patients uneasy about participating in health IT systems or granting consent for their information to be used in research. Fourth, health IT has historically been oriented toward administrative functions, not better care. This is in part because, under the current fee-for-service payment model, the economic benefits of investing in health IT can rarely be realized by the provider or organization that makes the investment.
</p></blockquote>
<p>OK, so the privacy-and-security point is a little bit vague, but the point about administrative functions vs. care is right on. Overall, this is one of the clearest explanations of the problems with Health IT today that I&#8217;ve seen.</p>
<h4>Patient Involvement: too timid</h4>
<blockquote><p>
2. [...] achievement of the President’s goals requires significantly accelerated progress toward the robust exchange of health information.
</p></blockquote>
<p>Effectively, the meaningful use rules are too modest in their interoperability demands. I agree. That said, the report doesn&#8217;t sufficiently emphasize the role patients could play. They certainly talk about giving patients their data, but not about how <em>giving patients their data is the natural path to secure health-data exchange between providers</em>. No one can argue that the patient doesn&#8217;t have the right to see their own data, and if the patient is the messenger, then consent is inherently part of the mix. Instead, for some reason, Health IT folks are obsessed with solving every problem on the backend, fuzzy matching master patient indexes, etc. I continue to believe that the patient (as in, you, me, every user of the healthcare system) is far better positioned to manage their health data than anyone else. Giving patients their data is not just an end-goal, it&#8217;s a means to accomplishing other health data exchanges.</p>
<p>So when the report says:</p>
<blockquote><p>
The best way to give clinicians a unified, patient-centric record tailored for each medical encounter is to store, maintain, update, and exchange the data as small, distributed, metadata-tagged elements.
</p></blockquote>
<p>I disagree. The solution isn&#8217;t a specific technology used in expressing the data, it&#8217;s about how/where the data flows. The best way to give clinicians a unified patient-centric record is to <em>give patients all of their data</em> and the means to share it easily with the doctors of their choice.</p>
<h4>Data Exchange and Interoperability: needs more work</h4>
<p>The report then spends a good bit of time talking about a <em>universal exchange language</em>, and an <em>infrastructure for locating and assembling [...] a patient&#8217;s record</em>. These are interesting points, but the devil is in the details.</p>
<p>On the universal exchange language: it&#8217;s a lot harder than it sounds. The report does mention extensibility as a key feature to allow for private industry to build on top of this universal language, and that&#8217;s a very good thing. The report also briefly says the language must be &#8220;open.&#8221; That&#8217;s very, very good, assuming we have the same definition of open: anyone can use it and redistribute it, without asking for permission from anyone. </p>
<p>The problem is that existing standards organizations in health IT believe existing solutions to be &#8220;extensible&#8221; and &#8220;open.&#8221; SNOMED is open! RxNorm is open! HL7 is open! CCR is open! No, they&#8217;re not. Most are free of cost, but they&#8217;re not free of obstacles: you still need to sign up to get a &#8220;free license,&#8221; which means any organization must check that every other organization it deals with has signed those many &#8220;free licenses&#8221; to those vocabularies before data can flow. We need truly free medical vocabularies and formats, and we don&#8217;t have them yet. I wish the report emphasized this need more, since only the Federal government can make this happen.</p>
<p>And then there&#8217;s the &#8220;extensible&#8221; issue. This is, in my opinion, the worst part of the report:</p>
<blockquote><p>
We believe that the natural syntax for such a universal exchange language will be some kind of extensible markup language (an XML variant, for example) capable of exchanging data from an unspecified number of (<em><b>not necessarily harmonized</b></em>) semantic realms.
</p></blockquote>
<p>The emphasis is mine. Not necessarily harmonized? Then what is the point? We already have dozens of syntaxes for expressing medical data, and they <em>all punt on semantic interoperability</em>. That is insanity. We need semantic interoperability in this universal exchange language, lest we once again define the envelope without ever standardizing what goes inside the envelope. This is, sadly, how most health IT standards have defined interoperability: you can put <em>anything</em> in the envelope, so it&#8217;s extensible, right? No. Think about the implementer: where do they start on parsing that completely unspecified payload?</p>
<p>Let&#8217;s make this universal exchange language truly open and truly extensible. For best practices on openness, we can look to Open Source and <a href="http://creativecommons.org">Creative Commons</a> for advice. For best practices on interoperability/extensibility, we can look to the <a href="http://w3.org">World Wide Web Consortium</a>&#8217;s standards on <a href="http://linkeddata.org/">Linked Open Data</a> and RDF. These problems have been solved before; let&#8217;s not recreate half-baked solutions for health IT. And in the name of all that is holy, enough with the payload-agnostic interoperability standards. Payload agnosticism is anti-interoperable.</p>
<h4>Privacy and Security: some good, some bad</h4>
<p>The report recommends data provenance and privacy preferences tied to the data. That is <em>a very good idea</em>. It&#8217;s hard to do in practice, though, so if we go down this path, let&#8217;s really invest the time and money needed to figure out how to do provenance and privacy-preferences right. This is a big endeavor.</p>
<p>The report also recommends giving patients more control over the flow of their data. This is great, but it&#8217;s almost impossible for the average patient to understand what the various flows mean and how they&#8217;re used. This is yet another reason for giving patients their data so they can choose who gets to see it directly, rather than having to make decisions about transitive trust: do you allow doctor A, who spoke to doctor B yesterday, to speak to doctor C? Good luck with that.</p>
<p>Then the report goes into far too much detail about security and cryptography. It talks about keys, and digital signatures, and shared keys, and never storing the key on the same machine as the encrypted data (really? It&#8217;s only going to come together in RAM? How&#8217;s that going to work for gigabytes of MRI data?) There&#8217;s even discussion of how Data Entity Access Services can run authorization checks before delivering the encryption keys, and other descriptions of crypto protocol details. This is bad. It&#8217;s overly prescriptive and thus cannot take into account recent innovation, such as secure access delegation via attribute-based encryption. Instead, the report should have focused on general principles and requirements, such as:</p>
<ul>
<li> data should be secure even if someone eavesdrops on the network</li>
<li> data should be secure even if someone obtains the hard drives from a decommissioned server</li>
<li> authorized physicians, as certified by the AMA, should have access to any record via a break-the-glass approach, as long as there is an audit trail</li>
</ul>
<p>Focus on requirements, and don&#8217;t mandate specific crypto concepts, as if all crypto innovation had stopped in the 80s. Let the technologists find/discover/invent the specific solution that meets the requirements.</p>
<h4>Conclusions</h4>
<p>I&#8217;m harsh on a few pieces of this report because, overall, I think it&#8217;s quite good. (And I thought this even before I realized, just now, that my colleague Ken Mandl was on the advisory committee for this report.) There are a few points that need more emphasis, however:</p>
<ol>
<li> the government should NOT be building a concrete infrastructure for health data exchange. We don&#8217;t know the best way to exchange data nationwide, and we shouldn&#8217;t have a central, expensive top-down effort to achieve this. The market can figure this out&#8230; with some help (see next points.)</li>
<p></p>
<li> the government can and should define a truly interoperable syntax and semantics for medical record exchange. This may involve one-time purchases of proprietary coding systems, but at the end of the day, the output should be a <em>truly free</em> set of codes for all major medical concepts, an abstract model for representing them, and one default syntax for serializing this data.</li>
<p></p>
<li> the government should mandate that any healthcare provider make available, to any patient that wants it, all of their data in digital form using the universal syntax and semantics just mentioned, using a standard, open protocol. This should include transfer of said data to any Personal Health Record provider that complies with proper privacy protections and with the standard protocols, data format, and data semantics. No more one-off deals between one hospital and one PHR.</li>
<p></p>
<li> the government should mandate that any healthcare provider be capable of receiving, from a compliant PHR, a patient&#8217;s record, using the standard protocol, data format, and data semantics.</li>
<p></p>
<li> the government should sponsor Personal Health Records for all Medicare/Medicaid patients, at, say, $5/month. Providers of these PHRs should fit conformance criteria for data portability and exchange, but otherwise should provide a relatively complete PHR service that the government can subsidize. The choice of the specific PHR should remain the patient&#8217;s, not any large hospital&#8217;s or other organization&#8217;s. We need market forces at work improving the PHR space, but we can use Medicare to kickstart this market.</li>
</ol>
<p>The standard protocol for exchanging data between two endpoints could be NHIN Direct / the Direct Project, assuming they continue to simplify their approach.</p>
<p>There is an opportunity for the government to significantly improve health IT, mostly by greasing the wheels of data exchange and interoperability. There are some tough decisions to make, however. We cannot punt on the medical data payload and semantics. And we must involve the patient at the crux of the architecture. Once the government has established fertile ground for innovation, by standardizing the data exchanges, privacy standards, and otherwise removing silly licensing obstacles, the market can do what it does best: find the set of optimal solutions within those principled constraints.</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2010/12/08/the-health-it-report-is-very-good-some-opinionated-suggestions/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>devices, payload data, and why Kim is (in part) right.</title>
		<link>http://benlog.com/articles/2010/06/01/devices-payload-data-and-why-kim-is-in-part-right/</link>
		<comments>http://benlog.com/articles/2010/06/01/devices-payload-data-and-why-kim-is-in-part-right/#comments</comments>
		<pubDate>Wed, 02 Jun 2010 01:19:51 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[policy]]></category>
		<category><![CDATA[privacy]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=1230</guid>
		<description><![CDATA[A few days ago, I wrote about privacy advocacy theater and lamented how some folks, including EPIC and Kim Cameron, are attacking Google in a needlessly harsh way for what was an accidental collection of data. Kim Cameron responded, and &#8230; <a href="http://benlog.com/articles/2010/06/01/devices-payload-data-and-why-kim-is-in-part-right/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A few days ago, I wrote about <a href="http://benlog.com/articles/2010/05/27/privacy-advocacy-theater/">privacy advocacy theater</a> and lamented how some folks, including EPIC and Kim Cameron, are attacking Google in a needlessly harsh way for what was an accidental collection of data. Kim Cameron <a href="http://www.identityblog.com/?p=1102">responded</a>, and he is right to point out that my argument, in the Google case, missed an important issue.</p>
<p>Kim points out that two issues got confused in the flurry of press activity: the <em>accidental</em> collection of <em>payload data</em>, i.e. the URLs and web content you browsed on unsecured wifi at the moment the Google Street View car was driving by, and the <em>intentional</em> collection of <em>device identifiers</em>, i.e. the network hardware identifiers and network names of public wifi access points. Kim thinks the network identifiers are inherently more problematic than the payload, because they last for quite a bit of time, while payload data, collected for a few randomly chosen milliseconds, are quite ephemeral and unlikely to be problematic.</p>
<p>Kim&#8217;s right on both points. Discussion of device identifiers, which I missed in my first post, is necessary, because the data collection, in this case, was intentional, and apparently <em>was not disclosed</em>, as documented in <a href="http://epic.org/2010/05/epic-urges-federal-communicati-1.html">EPIC&#8217;s letter to the FCC</a>. If Google is collecting public wifi data, they should <em>at least</em> disclose it. In their <a href="http://googleblog.blogspot.com/2010/05/wifi-data-collection-update.html">blog post on this topic</a>, Google does not clarify that issue.</p>
<p>So, Google, please tell us how long you&#8217;ve been collecting network identifiers, and how long you failed to disclose it. It may have been an oversight, but, given how much other data you&#8217;re collecting, it would really improve the public&#8217;s trust in you to be very precise here.</p>
<p>Now, two points:</p>
<ol>
<li> taking a second look at <a href="http://epic.org/2010/05/epic-urges-federal-communicati-1.html">EPIC&#8217;s letter</a> and Kim&#8217;s <a href="http://www.identityblog.com/?p=1100">original post</a>, it still seems to me that there&#8217;s some confusion of the device identifier and payload data issues: the uproar materialized <em>after</em> Google revealed they had mistakenly collected payload data, and EPIC&#8217;s letter and Kim&#8217;s original post seem to weave back and forth between both issues, never really mentioning intent. Is this because the payload data story is juicier in headlines, and so bundling the two issues helps make the more important point? Maybe, but still, I think we should be more precise and careful when we attack on privacy grounds.</li>
<p></p>
<li> I agree that device privacy can be a big deal, especially when many people are walking around with RFIDs in their passports and pants, and with Bluetooth headsets. But, <em>in this particular case</em>, is it a problem? If Google really only did collect the SSIDs of <em>open, public networks</em> that effectively invite anyone to connect to them and thus discover network name and device identifier, is that a violation of privacy, or of the Laws of Identity? I&#8217;m having trouble seeing the harm or the questionable act. Once again, these are public/open wifi networks. For the most part, these are static access points. Given Google&#8217;s stated interests in providing geolocation services, it would be detrimental to them if they catalogued roving access points. So, what&#8217;s the worst-case scenario here? Is it that, when I move to a new apartment, Google will know?</li>
</ol>
<p>None of this excuses Google&#8217;s lack of disclosure. This was intentional data collection, it should be disclosed, period.</p>
<p>And it&#8217;s worth asking the questions that Kim asks, raising awareness of device privacy. I&#8217;m not sure I&#8217;m as worried as Kim is on this particular issue, but the questions are certainly legitimate.</p>
<p>So, in the end, the privacy advocacy theater is coming first and foremost from the EU privacy folks, who did get enraged about payload data more than anything else. There&#8217;s still some coming from EPIC and, to remain blunt, a little bit from Kim&#8217;s first post. But his second post brings up very legitimate questions, and Google should take some additional action here, at least to let us know what they were collecting, when, and whether they properly disclosed it.</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2010/06/01/devices-payload-data-and-why-kim-is-in-part-right/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>distributed innovation</title>
		<link>http://benlog.com/articles/2010/04/21/distributed-innovation/</link>
		<comments>http://benlog.com/articles/2010/04/21/distributed-innovation/#comments</comments>
		<pubDate>Wed, 21 Apr 2010 21:58:57 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=1189</guid>
		<description><![CDATA[A few years ago, a small group of folks (Mark Birbeck, Steven Pemberton, Ralph Swick, Shane McCarron, me, and more recently Ivan Herman, Manu Sporny, and a lot of great new folks) started with the simple idea that, if web &#8230; <a href="http://benlog.com/articles/2010/04/21/distributed-innovation/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A few years ago, a small group of folks (Mark Birbeck, Steven Pemberton, Ralph Swick, Shane McCarron, me, and more recently Ivan Herman, Manu Sporny, and a lot of great <a href="http://www.w3.org/2010/02/rdfa/#who">new folks</a>) started with the simple idea that, if web pages contained a bit of structured data in addition to their haphazard content, we could improve the Web a little bit. We could mark up titles, people&#8217;s contact information, geolocation data, copyright licensing information, etc. Tools could be built, including browser plugins and search engines, to help users extract this structured data and make sense of it. There were others there before us, in particular the microformats effort. But we had, from the start, one major design difference: we felt strongly that anyone should be able to extend the core features without getting approval. The technology we came up with is <a href="http://rdfa.info">RDFa</a>. A few years later, Yahoo adopted it with SearchMonkey, so if you add bits of RDFa to your page, Yahoo search prominently displays those tidbits in its search results. A little bit later than that, Google adopted it with Rich Snippets, same story as Yahoo. And today, <a href="http://opengraphprotocol.org/">Facebook just adopted RDFa</a>, which will help it connect the items you share/like/annotate on the Web more precisely.</p>
<p>We weren&#8217;t the only folks proposing this kind of markup, and there remain healthy competing technologies. But because RDFa was architected with minimal centralization, anyone can create a vocabulary for it, anyone can use it <em>and extend it</em> without central approval, and that&#8217;s exactly what Yahoo, Google, and now Facebook did. They didn&#8217;t consult with the RDFa team. They didn&#8217;t have to. I consider that a great success: distributed innovation at work.</p>
<p>There will be work to do to reconcile the Yahoo, Google, and Facebook vocabularies. But that&#8217;s okay. RDFa lets you add as many vocabularies as you want, so you can easily combine the three vocabs for now to be maximally compatible. Over time, the tremendous power of the linked-data toolchain that forms the underpinning of RDFa will be brought to bear to progressively make the vocabularies compatible.</p>
<p>Exciting stuff for the structured-data Web!</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2010/04/21/distributed-innovation/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Taxing Human Transactions – Part 1</title>
		<link>http://benlog.com/articles/2010/02/18/taxing-human-transactions-%e2%80%93-part-1/</link>
		<comments>http://benlog.com/articles/2010/02/18/taxing-human-transactions-%e2%80%93-part-1/#comments</comments>
		<pubDate>Thu, 18 Feb 2010 19:53:27 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[health]]></category>
		<category><![CDATA[policy]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=1090</guid>
		<description><![CDATA[The worst part of my job is dealing with the mess of document formats and coding systems in healthcare. The acronym soup is insane: HL7, CCD, CCR, CDA, Green CDA (which I just heard about from John Halamka&#8217;s blog but&#8230; &#8230; <a href="http://benlog.com/articles/2010/02/18/taxing-human-transactions-%e2%80%93-part-1/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The worst part of my job is dealing with the mess of document formats and coding systems in healthcare. The acronym soup is insane: HL7, CCD, CCR, CDA, Green CDA (which I just heard about from <a href="http://geekdoctor.blogspot.com/2010/02/introducing-green-cda.html">John Halamka&#8217;s blog</a> but&#8230; no link!), and that&#8217;s just the document formats. Then there are coding systems like LOINC, SNOMED, SNOMED-CT, UMLS, ICD9, ICD10, RxNorm, &#8230; Interestingly enough, the issue is not how many there are. The issue is how they&#8217;re licensed. Here&#8217;s a screenshot from <a href="http://www.hl7.org/">the HL7 website</a> that should tickle your funny bone:</p>
<p><a href="http://benlog.com/wp-content/uploads/2010/02/Screen-shot-2010-02-18-at-10.25.50-AM.png"><img src="http://benlog.com/wp-content/uploads/2010/02/Screen-shot-2010-02-18-at-10.25.50-AM.png" alt="" title="Screen shot 2010-02-18 at 10.25.50 AM" width="564" height="261" class="aligncenter size-full wp-image-1091" /></a></p>
<p>So, HL7 is <b>unlocking</b> the power of health information, and to do that they&#8217;re going to <b>sell</b> you a standard.</p>
<p>Meanwhile, the National Library of Medicine has toiled for years on the Unified Medical Language System (UMLS), which attempts to codify <em>everything</em> in medicine, from anatomy to viruses. It&#8217;s a pretty impressive piece of work. Conveniently, they provide a &#8220;meta-thesaurus&#8221; that maps other coding systems, like SNOMED, to UMLS. Brilliant! Awesome! Except&#8230; to use UMLS, you have to register. And you have to fill out a yearly survey. And you&#8217;re not allowed to redistribute the UMLS codes. Oh, and you have to sign a 10-page licensing agreement that explains how you can use UMLS, but you can only use SNOMED under these conditions, and this other coding system you can only use under these other conditions, and if you don&#8217;t have three lawyers and a few weeks on your hands, good luck answering this simple question: &#8220;can I use this in my open-source library and release it freely to the world?&#8221;</p>
<p>Imagine, for a second, if we had a similar situation without computers. Doctors would have to pay a fee to speak official medical terms when discussing your health. You would have to pay a fee to have those terms translated into plain English. Canon would have to pay a licensing fee before making fax machines able to send medical documents from one doctor to another. In short, every time a health transaction occurs using standardized language, there would be a tax.</p>
<p>This is insane. Folks in the health IT world are focused on much harder problems while ignoring this blatant ball-and-chain on innovation.</p>
<p>I submit that the quickest path to health-IT reform is the complete and unconditional freeing of these medical vocabularies and data formats. And I mean <b>complete</b>. No access fees, no yearly surveys, no constraints on redistribution, country of origin, or commercial versus non-commercial use. Free. Like HTTP and HTML. Like English. Like a patient-doctor conversation.</p>
<p>Take a precise example: my group at Children&#8217;s Hospital Boston just released <a href="http://indivohealth.org">Indivo X</a>, the latest version of our Personally Controlled Health Record. It&#8217;s great, but there&#8217;s one key feature we had to strip out before shipping this free, open-source tool built using federal grant money: SNOMED codes. Sure, we&#8217;re a hospital with a license, we can use them internally. But we can&#8217;t redistribute them. So now, to install Indivo, instead of a 30-minute process, you need to go get a UMLS ID, wait 3 days for approval, then download the files, extract the codes we think are useful, and load them into the database. No exaggeration, you&#8217;ve now multiplied your time-to-working-install by 100. </p>
<p>This must change. Either the existing formats must be opened up, or new formats must emerge that do to the existing formats what HTTP and HTML did to Gopher: kill them with freedom. Taxing human interactions, simply because they&#8217;ve been digitized, is an unacceptable brake on innovation, and in a complex field like Health IT, it&#8217;s the last thing we need and the first thing we need to eliminate.</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2010/02/18/taxing-human-transactions-%e2%80%93-part-1/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Apple fanboy delusions, the Palm Pre is looking mighty tasty</title>
		<link>http://benlog.com/articles/2009/10/07/apple-fanboy-delusions-the-palm-pre-is-looking-mighty-tasty/</link>
		<comments>http://benlog.com/articles/2009/10/07/apple-fanboy-delusions-the-palm-pre-is-looking-mighty-tasty/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 22:58:27 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[policy]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=871</guid>
		<description><![CDATA[On many issues, I&#8217;m an Apple fanboy. On the issue of the iPhone, less and less. Here&#8217;s the short version of the story: Apple produces iTunes, which manages all of your music and videos, and syncs them to your iPod/iPhone. &#8230; <a href="http://benlog.com/articles/2009/10/07/apple-fanboy-delusions-the-palm-pre-is-looking-mighty-tasty/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>On many issues, I&#8217;m an Apple fanboy. On the issue of the iPhone, less and less.</p>
<p>Here&#8217;s the short version of the story: Apple produces iTunes, which manages all of your music and videos, and syncs them to your iPod/iPhone. Very cool software, magnificently built, great experience overall. I&#8217;ve been using this setup for 6+ years.</p>
<p>Along comes Palm with the Pre, a phone with functionality similar to the iPhone. Obviously, Palm wants to let its users sync their music and photo library with the Pre. Seems fair, right? Here&#8217;s how the story unfolds:</p>
<p><em>iTunes 8.0</em>: I will only sync with devices called &#8216;iPod&#8217;<br />
<em>Pre 1.0</em>: Hmmm, ok I&#8217;m an iPod! sold by Palm and manufactured by Palm<br />
<em>iTunes 8.1</em>: Oh I see how you want to play it, I will only sync with devices sold by Apple<br />
<em>Pre 1.1</em>: OK, then I&#8217;m an iPod sold by Apple, but manufactured by Palm<br />
<em>iTunes 9.0</em>: dammit, two can play that game, I will only sync with devices manufactured by Apple!<br />
<em>Pre 1.2</em>: OK, I&#8217;m an iPod, sold by Apple and manufactured by Apple</p>
<p>Now, some Apple fanboys, including <a href="http://daringfireball.net/linked/2009/10/04/hunter-webos-itunes">John Gruber</a> and <a href="http://hunter.pairsite.com/blogs/20091004/">Craig Hunter</a>, are calling Palm out:</p>
<blockquote><p>
Whatever hype and capital Palm built up around the launch of the Pre has been squandered on a pointless and trivial cat and mouse game with Apple over iTunes sync. The saddest part is that this was totally unnecessary, though Palm wants you to think otherwise.</p>
<p>You see, Palm doesn&#8217;t need the iTunes app to sync the Pre. They don&#8217;t need to draw Apple&#8217;s ire, or play yo-yo with their customers over this important capability. They can sync the Pre to a customer&#8217;s iTunes music library with a public, open, and documented approach that has been used by third-party developers and device makers for years. This capability was created by none other than Apple itself.
</p></blockquote>
<p>Funny that Palm is the only one blamed in this cat-and-mouse game. Why is it okay for Apple to <em>purposely reduce user functionality</em> for no other reason than to stick it to Palm? At least Palm is trying to provide features to its users!</p>
<p>Also, if you&#8217;re going to start picking apart Palm&#8217;s design, then maybe it&#8217;s time to send the coding police after Apple, too: why not sync with any device that offers itself up as meeting the iPod API? Why force Blackberry and others to build their own sync apps? Maybe it&#8217;s okay for Apple not to go out of its way to help Palm, but then why actively spend resources shutting them out repeatedly? Do Gruber and Hunter <em>actually</em> believe that this is meant to protect users?</p>
<p>I&#8217;m 99% sure I know what&#8217;s going on. And Gruber and Hunter probably do, too. Blackberry syncing works great for non-DRM&#8217;ed songs, but <a href="http://na.blackberry.com/eng/services/media/mediasync.jsp">read the fine print</a>:</p>
<blockquote><p>
Certain music files may not be supported by the media player, including incompatible file types and files that contain digital rights management technologies.
</p></blockquote>
<p>There is no way for the Palm Pre to provide the full feature set that its users are entitled to without acting like an iPod, because <em>Apple specifically built their system to prevent synchronization of DRM&#8217;ed media with devices other than the iPod</em>. Once again, the legitimate user who bought his songs, bought his TV shows, and bought his Palm Pre, gets screwed.</p>
<p>I&#8217;m an Apple fanboy on many issues, but that&#8217;s not okay. And Gruber and Hunter should know better than to chastise the party that&#8217;s trying to do right by its users.</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2009/10/07/apple-fanboy-delusions-the-palm-pre-is-looking-mighty-tasty/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Stefano thinks I&#8217;m a purist&#8230;</title>
		<link>http://benlog.com/articles/2009/09/25/stefano-thinks-im-a-purist/</link>
		<comments>http://benlog.com/articles/2009/09/25/stefano-thinks-im-a-purist/#comments</comments>
		<pubDate>Fri, 25 Sep 2009 18:16:17 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=853</guid>
		<description><![CDATA[Stefano Mazzocchi is awesome and his thinking on Web-based data is incredibly nuanced and pragmatic, so it&#8217;s not often that I want to publicly disagree with him. But in his latest post, I think he&#8217;s off the mark. Stefano argues: &#8230; <a href="http://benlog.com/articles/2009/09/25/stefano-thinks-im-a-purist/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.betaversion.org/~stefano/">Stefano Mazzocchi</a> is awesome and his thinking on Web-based data is incredibly nuanced and pragmatic, so it&#8217;s not often that I want to publicly disagree with him. But in his <a href="http://www.betaversion.org/~stefano/linotype/news/325/">latest post</a>, I think he&#8217;s off the mark.</p>
<p>Stefano argues:</p>
<blockquote><p>
The difference between RDFa and Microdata (syntactic differences aside) is basically the fact that the proponents of the first  believe that once everybody naturally starts reusing existing ID schemes and ontologies a densely connected web of semantically reconciled information will come together naturally. The second just want to focus on immediate values and avoid speculating on what’s going to happen next.
</p></blockquote>
<p>In the same vein, he adds:</p>
<blockquote><p>
The RDFa camp see it as a vector to promote the growth of the web of data, while the Microdata camp focuses on solving practical problems of embedding richer machine-processable information in web pages
</p></blockquote>
<p>That&#8217;s not true. <a href="http://rdfa.info">RDFa</a> is 100% focused on solving practical problems, e.g. for <a href="http://creativecommons.org">Creative Commons</a> search: Google and Yahoo now support Creative-Commons image search based on RDFa (check out <a href="http://www.youtube.com/watch?v=quyhasVn2jw">the Google video on how to add RDFa to your images</a>). RDFa builds on existing technology, RDF, not because it&#8217;s the gospel but because it has some nice properties: you can reuse someone&#8217;s vocabulary <em>if you want</em>, or you can invent your own if you prefer. If you invent your own, there&#8217;s a little bit of overhead to prevent stepping on toes and to enable others to reuse your vocabulary <em>if they choose</em>. We don&#8217;t expect everyone to reuse all the time. We expect duplicate vocabularies to arise. But we do think it&#8217;s a good idea to make reuse possible, easy and scalable to the Web.</p>
<p>We absolutely <em>do not</em> expect a &#8220;densely connected web of semantically reconciled information&#8221; to &#8220;come together naturally.&#8221; But we <em>do</em> think that, <em>when users want to build more densely connected semantically reconciled graphs</em>, they should be able to.</p>
<p>This isn&#8217;t just a theory, it&#8217;s actually happening: Google is <a href="http://googlewebmastercentral.blogspot.com/2009/09/supporting-facebook-share-and-rdfa-for.html">reusing Yahoo&#8217;s RDFa vocabulary intermixed with other vocabularies</a>. They didn&#8217;t have to. Other groups at Google are making up their own vocabularies. And that&#8217;s okay: both approaches are part of a healthy Web-data ecosystem.</p>
<p>So, are we speculating too much on what&#8217;s going to happen next? I don&#8217;t think so. In fact, I think it&#8217;s quite the opposite: RDFa is giving users a choice, while other technologies are purposefully reducing choice. I call it <em><b>overly opinionated software</b></em>: being so certain that even slight future-proofing is pointless that you actually make it deliberately harder for your users. The criticism that Stefano offers actually applies the other way: solutions that de-emphasize Web-scale identifiers are <em>reducing options</em> by deciding that <em>there shall be no meaningful, scalable reuse or distributed innovation</em>. With RDFa, you have a choice. With a number of other technologies, you can&#8217;t choose to reuse / mash-up easily.</p>
<h4>You said something about reconciliation</h4>
<p>Now, one of many areas where I&#8217;ve learned quite a bit from Stefano, one where everyone in the Linked Data community should stop and listen, is reconciliation. RDF promises reconciliation, where one day Google and Yahoo will realize that <tt>google:author</tt> and <tt>yahoo:creator</tt> are actually the same thing, they&#8217;ll come together around a campfire somewhere between Mountain View and Santa Clara and sing Kumbaya by mapping each URL to the other. And Stefano is absolutely right to point out that this won&#8217;t be trivial when the data is not identically sampled, when strings contain much embedded structure that hasn&#8217;t been normalized. He summarizes this as:</p>
<blockquote><p>
I find it frankly disheartening that purists still believe that the secret to a useful web of data is already there in the guts of the architecture of the web and that by simply turning a URI into a URL will cause enough social pressure to solve the other issues.
</p></blockquote>
<p>I agree. The Linked Data community shouldn&#8217;t overpromise what same-as mappings will accomplish. </p>
<p>But take a step back: what is the other option? Having even <em>less</em> information about the vocabularies we use? <em>Not</em> having the ability to map vocabulary terms to one another? Once again, it&#8217;s an issue of choice: RDFa and RDF give you the ability to map concepts to one another. You don&#8217;t have to. You can ignore those features if you want. But isn&#8217;t it still a good idea to let users map concepts when they choose and when they can?</p>
<p>RDF and RDFa shouldn&#8217;t overpromise what reconciliation can deliver. At the same time, critics shouldn&#8217;t use the argument that, because RDF doesn&#8217;t solve <em>all</em> problems, then it solves none. Especially when the alternative solutions provide zero functionality in that department.</p>
<h4>Who you callin&#8217; Purist?</h4>
<p>So here&#8217;s my take. The folks preventing reuse are the ones over-speculating. They are placing deliberate obstacles to vocabulary reuse, not because they are being cautious about the future, but because they think they know the future exactly, and they think that future doesn&#8217;t include vocabulary reuse. Meanwhile, contrary to popular belief, there are no RDFa cops watching over your shoulder, smacking you upside the head when you reinvent a term that FOAF or Dublin Core already specified. But there are RDFa genies, waiting to be invoked to help you reuse and augment FOAF and Dublin Core, if you want to.</p>
<p>With this added context, who&#8217;s the purist, really?</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2009/09/25/stefano-thinks-im-a-purist/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pot, Kettle, meet Zuckerberg</title>
		<link>http://benlog.com/articles/2009/06/23/pot-kettle-meet-zuckerberg/</link>
		<comments>http://benlog.com/articles/2009/06/23/pot-kettle-meet-zuckerberg/#comments</comments>
		<pubDate>Tue, 23 Jun 2009 23:35:47 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=639</guid>
		<description><![CDATA[Facebook is an impressive company, they&#8217;ve done and continue to do some very amazing things. And I admit I certainly didn&#8217;t see them coming 4 years ago. But okay, come on: &#8220;No one wants to live in a surveillance society,&#8221; &#8230; <a href="http://benlog.com/articles/2009/06/23/pot-kettle-meet-zuckerberg/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Facebook is an impressive company, they&#8217;ve done and continue to do some very amazing things. And I admit I certainly didn&#8217;t see them coming 4 years ago. But okay, <a href="http://www.wired.com/techbiz/it/magazine/17-07/ff_facebookwall">come on</a>:</p>
<blockquote><p>
&#8220;No one wants to live in a surveillance society,&#8221; Zuckerberg adds, &#8220;which, if you take that to its extreme, could be where Google is going.&#8221;
</p></blockquote>
<p>Umm, seriously? I mean, sure, Google might be pushing us towards a surveillance society, but then, isn&#8217;t Facebook doing exactly the same thing? At least Google promises to remove your records after a certain period of time, whereas <a href="http://benlog.com/articles/2009/02/16/facebook-were-keeping-your-data-for-your-friends-sake/">Facebook wants to keep your data forever for your friends&#8217; sake</a>. Interestingly, Zuckerberg repeatedly depicts other companies as potentially evil entities, while Facebook is just the air you breathe, it could never be evil. So when the article points out that:</p>
<blockquote><p>
[Zuckerberg] has described Facebook as a once-in-a-century communications revolution, implying that he is right up there with Gutenberg and Marconi.
</p></blockquote>
<p>You have to stop and wonder&#8230; what if Gutenberg had said &#8220;here&#8217;s the printing press, but books can only be printed by me.&#8221; Or if Marconi had said &#8220;here&#8217;s the radio, all transmissions go through my central station, and I will relay them.&#8221;</p>
<p>Facebook is impressive, but in their model, <b>all data goes through them</b>. At least with Google, though they may keep my data for a while, I can switch search engines at any time. Yahoo is pushing the envelope every day. And Bing isn&#8217;t bad. But on Facebook, I can only do what Facebook is willing to let me do. So, if Google is a surveillance society, then Facebook is a surveillance society with a shock collar.</p>
<p><b>UPDATE</b>: my metaphor at the end was slightly incomplete, so I tweaked it.</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2009/06/23/pot-kettle-meet-zuckerberg/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Open Licensing in Health IT</title>
		<link>http://benlog.com/articles/2009/06/23/open-licensing-in-health-it/</link>
		<comments>http://benlog.com/articles/2009/06/23/open-licensing-in-health-it/#comments</comments>
		<pubDate>Tue, 23 Jun 2009 15:57:29 +0000</pubDate>
		<dc:creator>ben</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[health]]></category>
		<category><![CDATA[policy]]></category>

		<guid isPermaLink="false">http://benlog.com/?p=633</guid>
		<description><![CDATA[John Halamka, renowned CIO of the Beth Israel Deaconess Medical Center (BIDMC), is a blogger, and he just added a Creative Commons license after making the following remarks: I want my blog to be used for education, training, and research. &#8230; <a href="http://benlog.com/articles/2009/06/23/open-licensing-in-health-it/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>John Halamka, renowned CIO of the Beth Israel Deaconess Medical Center (BIDMC), is a <a href="http://geekdoctor.blogspot.com/">blogger</a>, and he just <a href="http://geekdoctor.blogspot.com/2009/06/copyleft-all-rights-reversed.html">added a Creative Commons license</a> after making the following remarks:</p>
<blockquote><p>
I want my blog to be used for education, training, and research. I hope that its contents appear in derivative works such as other blogs, websites, and wikis. I&#8217;d prefer that these derivative works be openly shared.</p>
<p>I would also ask that any material that is repurposed has attribution to me as the author.</p>
<p>Content from my blog should not be sold. Charging for access to that which I make freely available seems wrong.</p>
<p>How do I express these preferences legally?
</p></blockquote>
<p>Exactly! Now, if only health IT interchange specifications followed the same path. For example, <a href="http://hitsp.org">HITSP</a>, which aims to &#8220;enable healthcare interoperability&#8221;, has the following copyright statement:</p>
<blockquote><p>
No portion of the ANSI Sites may be reproduced in any form, electronic or otherwise, for any purpose other than personal use, without prior written permission of ANSI. To the extent that ANSI is not the copyright owner of some portion of an ANSI Site, ANSI has received permission to include such material in such ANSI Site.
</p></blockquote>
<p>The individual specifications <em>might</em> be a bit more lenient, but it&#8217;s not clear because they refer to other specifications&#8217; individual copyright licensing terms, so you have to follow your nose to every sub-specification and figure out how to reconcile the terms from these disparate sources. Yeah.</p>
<p>Meanwhile, you have to <a href="http://www.astm.org/Standards/E2369.htm">pay to even see the Continuity of Care Record (CCR) standard</a>. And <a href="http://www.HL7.com.au/FAQ.htm#Licensing">HL7 licensing</a> handwaves about how &#8220;strict copyright&#8221; is the only way they can maintain the integrity of their standard (untrue and likely ineffective), with a relatively amusing comparison of their per-download fee to &#8220;Apache and Linux distributions&#8221;, even though of course you can download Linux and Apache at no cost whatsoever, <em>and</em> you can redistribute them if you wish.</p>
<p>Just this week, there&#8217;s a new effort to <a href="http://www.boston.com/news/nation/washington/articles/2009/06/23/health_data_rights_declaration_gets_push/">give individuals control over their health data</a>. I think it&#8217;s a great effort, but one of the necessary conditions to get there is to have truly open Health IT standards. No usage fees, no download fees, open licensing that enables others to innovate on top of the standards for novel medical applications, and probably a trademark approach to encouraging interoperability (i.e. you can&#8217;t call it &#8220;HL7&#8221; unless you pass the HL7 test suite). There will not be widespread patient-controlled flow of health data until there are truly open Health IT standards.</p>
<p>John, if you&#8217;re listening, let&#8217;s bring that Creative Commons attitude to the Health Standards groups ASAP!</p>
]]></content:encoded>
			<wfw:commentRss>http://benlog.com/articles/2009/06/23/open-licensing-in-health-it/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

