Posts

Open Data in the US: Navigating the Privacy Minefield

Open Data and Privacy Policies: North Carolina. How can we be rigorous and thorough and at the same time protect citizen privacy? This is not about Raleigh's open data portal, currently under construction. This is about the fractured and ad hoc privacy policies of open data initiatives in the United States. Privacy is complicated. In my conversations with policy folks at the Open Knowledge Foundation, several topics around privacy came up. While none of them really solves the privacy issue for open data efforts in the US, perhaps there can be a workaround. We as open data advocates in the US can begin to develop a set of principles. We can police our initiatives in the hope that privacy will one day catch up. Open data initiatives in the United States are not, for the most part, governed by federal law. In a few cases, federal law does have remedies for certain kinds of data (think HIPAA). Most law affecting privacy happens at the state level, which makes a coherent national model of

When Open Data gets Personal

Patrick and motorcycles. We had a good time riding bikes together: him on his trusty old Honda and me on my trusty old BMW. New Memorial Fund Site Setup: Giveforward has been designated as the foundation site for The Patrick Arbogast Memorial Scholarship Fund. So, an update on a special and personal type of #opendata website dedicated to the Patrick Arbogast Scholarship Fund: I posted this originally to my Facebook account, which is strictly for my non-professional contacts. I am not sure how many of my friends are on Google+ and not on Facebook, so excuse the cross-post. About Patrick Arbogast: He was not in the media and he was not facing legal problems. He was facing his own internal issues that he was not quick to talk about. Patrick Arbogast was my friend for over 30 years and a rising star in the biostatistics world with an emphasis on holistic medicine. He left his academic job (research professor) at Vanderbilt to take a job as a senior researcher at Kaiser Permanente before takin

The Future of the Open Data Catalog

This is a wireframe from the World Bank showing a search interface for their highly respected data catalog. Searchable Data Sets? Check out the #wireframe for a fully searchable data catalog. This is where #opendata needs to head. Citizens do not browse data like kids in a candy store. I have asked about engagement and watched them interact with open data sets, and not one non-data geek ever found them interesting. For methodology, I chose sets of 3-5 people and just asked them questions and sent them links. These people came from my Facebook groups of friends and do not have a professional connection with me. When I asked them about their browsing patterns in general, most people used Google Search to find specific information or started with a list of 10 or fewer "go to" sites. Data sets need to be searchable: queries need to come from Google, and search needs to both index the data sets themselves and interpret natural language queries, before open data will be mainstream and of i

Optically scanning PDFs into Open Data

Optically scanning PDF data and turning it into machine-readable data would be quite an accomplishment. This is happening right around the corner from the City of Raleigh, using minutes from our council meetings. I would like to see what the CSV files look like. This could have a large impact on open data by making it possible to reuse PDFs and perhaps other proprietary data formats. I am only now starting to imagine all of the potential uses for something like this. Turning scanned documents into structured data at reporterslab.org: A new open-source program under development at Raleigh Public Record aims to pull structured data from scanned-in public records. And ahead of its release at the 2013 Computer-Assisted Reporting Conference, developers are... See the full article here.

Open Data, Open Government and the Guise of Transparency

Technology and Transparency: Data Needs a Cultural Context. Per a conversation with Alex Howard on Google+, I have read "Beyond Technology for Transparency". The author makes several points: The Yu and Robinson article "The New Ambiguity of 'Open Government'" was something I read a few weeks ago. I disagree that the edge of open government is going away. Yes, there are data sets being published that have nothing to do with accountability. Yes, there are open data initiatives that stand up a few data sets and call it "open". This does not mean that all or most open data professionals do this. This whole line of "government versus the people" is one of the reasons PSAs have trouble getting open data initiatives launched in the first place. David Sasaki makes good points at the end of his blog post. Open data should strive to reduce poverty, corruption and operational inefficiencies. This will not happen with just transpar

Open Data Use and Why it Matters

Screenshot of the Gun Map. Programmers explain how to turn data into journalism & why that matters | Poynter. Transparency does have limits. That is a policy issue that a PSA must work through. In my particular open data initiative, we have laid out several principles regarding security and privacy. I brought this up in a LinkedIn post a few weeks ago, and this was part of my question to +Alex Howard about #opendata and #journalism. The now infamous "gun map" that identified registered gun owners by name is a good example of how not to use open data. I know it was not a PSA that published this data. It was a newspaper. However, I feel open data and big data evangelists need to start having a discussion about the right to privacy and our ethical boundaries. The gun map, as an example, would have been just as effective as a heat map of gun ownership density. The article "Programmers explain how to turn data into journalism & why that matters" by Jeff Sond

Big Data still means Good Science

"There is nothing new under the sun." Nate Silver points to this quote as being indicative of man's lack of progress until the industrial revolution. In the vernacular, it means that the more things change, the more they remain the same. Big data, open data and the new vast piles of noise we sit on mean nothing without analysis. Acknowledging that all researchers have bias, and that there is no such thing as unbiased information or knowledge, is a good thing. Big Data is still data, which we use to support hypotheses and against which we measure and test theories. No type of big data, open data or government open data is a silver bullet. Data has to be transformed into information and knowledge through contextual inquiry. When, on a summer Sunday morning in 1987, three hundred thousand people crammed onto the central span of San Francisco’s Golden Gate Bridge, they came perilously close to participating in... read more below: Are We All Being Fooled by Big Data?