Posts

Don’t Just Make Data Open, Make Open Data Useful!

By Dennis D. McDonald, Ph.D. Introduction Earlier this week, in USAID’s Evolving Open Data Culture, I applauded that U.S. government agency’s efforts to make its open data useful. In this post I dive a little deeper into the topic of “open data usefulness.” Background I came to my interest in open data by way of a career that has mostly involved technology-related projects and consulting. Major goals have been to make or support products or services that are useful to somebody. For private sector clients this usually involved impacting cost or revenue targets. For government agencies or nonprofits, work has focused on objectives that have both quantitative and qualitative aspects. In either case, “usefulness” has meant that the actions people take as a result of using the product or service are viewed by them in a beneficial or positive light because the product or service helps them accomplish their objectives. As a result of this perspective, I’ve thought it would be shortsighted or incomplete not to

OPEN DATA PORTALS SHOULD BE API [FIRST]

[Figure: monthly API call volumes for the Open Raleigh Portal]
As you can see from the figure, September and November are not far below October. Some readers may wonder why there is a surge in API calls starting in May of 2014. May through October was spent building open source service architectures on Red Hat JBoss SwitchYard that could mine and automatically append data sets within the Open Raleigh Portal. Open Raleigh uses a responsive web design that is friendly to most handheld devices, but the API needs a little help to push data into the portal. The portal itself exposes every data set as an API endpoint; that API, however, is read-only. By writing some code, we can have the Socrata portal allow us to append data sets. Socrata is not alone in the Web/Mobile [First] category. ESRI, CKAN, and to some extent Junar are architected on the same principles. This is not a direct criticism or endorsement of any particular platform. THE CONSEQUENCES OF GETTING IT WRONG Discussing multi-nodal approaches and espousing an API [First] st
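As a rough illustration of what "writing some code" to append rows can look like, here is a minimal sketch using Socrata's SODA API, which accepts an authenticated POST of JSON rows against a dataset's resource endpoint. The domain, dataset ID, credentials, and field names below are placeholders, not the actual Open Raleigh configuration.

```python
# Sketch: appending (upserting) rows to a Socrata dataset via the SODA API.
# All identifiers and credentials here are hypothetical placeholders.
import json
import requests

DOMAIN = "data.example.gov"          # hypothetical portal domain
DATASET_ID = "abcd-1234"             # hypothetical dataset identifier
APP_TOKEN = "YOUR_APP_TOKEN"         # application token issued by the portal
USERNAME = "publisher@example.org"   # account with write permission on the dataset
PASSWORD = "YOUR_PASSWORD"

# Rows to append; field names must match the dataset's columns.
new_rows = [
    {"station": "Station 12", "reading": 42.7, "observed_at": "2014-10-01T08:00:00"},
]

resp = requests.post(
    f"https://{DOMAIN}/resource/{DATASET_ID}.json",
    headers={"X-App-Token": APP_TOKEN, "Content-Type": "application/json"},
    auth=(USERNAME, PASSWORD),       # write operations require authentication
    data=json.dumps(new_rows),
)
resp.raise_for_status()
print(resp.json())                   # the portal reports rows created, updated, and any errors
```

The point of the sketch is simply that the public endpoint stays read-only for anonymous callers, while a credentialed publisher account can push new rows through the same endpoint.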

Morrisville Councilman Steve Rao: Town Hall Meeting

Councilman Steve Rao and family. Today, Monday, December 8th, at 4:00 PM, Morrisville Councilman Steve Rao is hosting his sixth Virtual Town Hall of the year. Councilman Rao has used these innovative events throughout the year to keep in touch with citizens in Morrisville and throughout North Carolina and to provide updates and take questions on a variety of topics. The last Virtual Town Hall was on the potential for Open Data to revolutionize government and featured special guest Ian Henshaw of the Open Data Institute; it was attended by people from throughout North Carolina as well as from the United Kingdom and India. Today’s event will review what has happened in the Triangle area in the past year and look ahead to 2015. These events are part of a larger initiative by Councilman Rao to spur innovation in the state by applying the latest technologies to solve our public policy problems. To attend, you must RSVP at the link below and have a computer equipped to log into a Google Han

The Three Phases of Open Data Quality Control

By Dennis D. McDonald, Ph.D., dmcdonald@balefireglobal.com Introduction In my previous post about open data quality, the suggested solutions relate not just to adhering to standards but also to making sure that the processes by which open data are published and maintained are efficiently and effectively managed. In this post I drill down a bit more on that point about the management processes. Three Phases When discussing open data, it helps to look at open data projects with tasks divided into at least three related phases: (1) assessment and planning, (2) data preparation and publishing, and (3) ongoing maintenance and support. Different tools and processes are relevant to each phase, and each can have an impact on the quality of the data as well as its perceived quality. Phase 1. Assessment and planning Critical to data quality at this first phase of an open data project is an understanding of the "who, where, how, how much, and why" of the data. If the goals of the project include making data f

How Important Is Open Data Quality?

By Dennis D. McDonald, Ph.D. Email: dmcdonald@balefireglobal.com At risk? Martin Doyle's Is Open Data at Risk from Poor Data Quality is a thoughtful piece, but it doesn’t address this question: should the data quality standards observed in open data programs be any different from the data quality standards observed in any other programs that produce or consume data? My first response is to answer with a definite “No!”, but I think the question is worth discussing. Data quality is a complex issue that people have been wrestling with for a long time. I remember, way back in graduate school, doing a class project on measuring “error rates” in how metadata were assigned to technical documentation originating from multiple sources. Just defining what we meant by “error” was an intellectually challenging exercise that introduced me to the complexities of defining quality, as well as the impacts quality variations can have on information system cost and performance. Reading Doyle’s article rem

ODI DATA CERTIFICATES ARE A BIG DEAL

This morning something happened that will gradually impact the way we interact with data: OpenDataSoft (ODS) embedded the Open Data Institute's Data Set Certificates into each and every data set page. Will other open data platforms follow? Maybe. Embedding certificates is something I have been advocating since the idea was just an idea at the Open Data Institute (ODI). No one until ODS ever took me up on my offer. These data certificates show a willingness on the part of the data steward to consider the following: the impact on individual privacy; API and format documentation to ensure a greater chance of data re-use; metadata on where the data originates and how often it is refreshed; and RDF description tags and identifiers that allow f
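To make the "metadata on where the data originates and how often it is refreshed" point concrete, here is a minimal sketch of that kind of provenance and refresh metadata expressed as RDF using the DCAT and Dublin Core vocabularies with rdflib. This is not the ODI certificate format itself; the dataset URI, publisher, and values are illustrative placeholders.

```python
# Sketch: describing a dataset's origin and refresh cadence as RDF metadata.
# Vocabulary: DCAT + Dublin Core Terms; all URIs and literal values are placeholders.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, DCTERMS

DCAT = Namespace("http://www.w3.org/ns/dcat#")

g = Graph()
g.bind("dcat", DCAT)
g.bind("dct", DCTERMS)

# Hypothetical dataset URI on a hypothetical portal.
dataset = URIRef("https://data.example.org/dataset/street-trees")

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("Street Trees Inventory")))
g.add((dataset, DCTERMS.publisher, Literal("City Parks Department")))   # where the data originates
g.add((dataset, DCTERMS.accrualPeriodicity, Literal("monthly")))        # how often it is refreshed
g.add((dataset, DCTERMS.license, URIRef("https://creativecommons.org/licenses/by/4.0/")))

# Serialize the description so it can be embedded or published alongside the data set page.
print(g.serialize(format="turtle"))
```

Machine-readable descriptions like this are what make it possible for a certificate, a portal, or a harvester to answer "whose data is this, under what license, and how fresh is it?" without a human reading the page.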

How Cost Impacts Open Data Program Planning - and Vice Versa

By Dennis D. McDonald, Ph.D. dmcdonald@balefireglobal.com Introduction How important are costs when you are planning an open data program? Are they, as suggested by Rebecca Merrett in Addressing cost and privacy issues with open data in government, the "… elephant in the room," especially when data anonymization costs are being considered? Or are such costs just a normal consideration when planning any project where quantities of different types of data have to be manipulated and delivered? It's difficult to generalize about this. Open data program costs can fall along at least three general dimensions: (1) controlled versus uncontrolled, (2) known versus unknown, and (3) startup versus ongoing. 1. Controlled versus uncontrolled Why worry about what you can’t control? The answer is that uncontrolled costs can impact your program whether you control them or not. Examples of uncontrolled costs might be: taxes, licensing, insurance, and registration fees; staff salaries that can'