Posts

Showing posts from September, 2014

How Important Is Open Data Quality?

By Dennis D. McDonald, Ph.D. Email: dmcdonald@balefireglobal.com

At risk? Martin Doyle's "Is Open Data at Risk from Poor Data Quality" is a thoughtful piece, but it doesn't address this question: should the data quality standards observed in open data programs be any different from those observed in any other programs that produce or consume data?

My first response is a definite "No!", but I think the question is worth discussing. Data quality is a complex issue that people have been wrestling with for a long time. I remember, way back in graduate school, doing a class project on measuring "error rates" in how metadata were assigned to technical documentation originating from multiple sources. Just defining what we meant by "error" was an intellectually challenging exercise that introduced me to the complexities of defining quality, as well as the impacts quality variations can have on information system cost and performance. Reading Doyle's article rem…

ODI Data Certificates Are a Big Deal

This morning something happened that will gradually change the way we interact with data: OpenDataSoft (ODS) embedded the Open Data Institute's Data Set Certificates into each and every data set page. Will other open data platforms follow? Maybe. Embedding certificates is something I have been advocating since the idea was just an idea at the Open Data Institute (ODI); no one until ODS ever took me up on my offer. These data certificates show a willingness on the part of the data steward to consider the following:

- The impact on individual privacy
- API and format documentation to ensure a greater chance of data re-use
- Metadata on where the data originates and how often it is refreshed
- RDF description tags and identifiers that allow f…

How Cost Impacts Open Data Program Planning - and Vice Versa

By Dennis D. McDonald, Ph.D. dmcdonald@balefireglobal.com

Introduction

How important are costs when you are planning an open data program? Are they, as Rebecca Merrett suggests in "Addressing cost and privacy issues with open data in government," the "… elephant in the room," especially when data anonymization costs are being considered? Or are such costs just a normal consideration when planning any project where quantities of different types of data have to be manipulated and delivered?

It's difficult to generalize about this. Open data program costs can fall along at least three general dimensions:

1. Controlled versus uncontrolled
2. Known versus unknown
3. Startup versus ongoing

1. Controlled versus uncontrolled

Why worry about costs you can't control? Because they can impact your program whether you control them or not. Examples of uncontrolled costs might be:

- Taxes, licensing, insurance, and registration fees.
- Staff salaries that can'…

Three Things about Open Data Programs That Make Them Special

By Dennis D. McDonald, Ph.D., Balefire Global, dmcdonald@balefireglobal.com

During the brainstorming session at the inaugural meeting of the Open Data Enthusiasts meetup last week in Washington, DC, attendee David Luria commented that we need to do a better job of understanding, defining, and communicating the objectives of open data programs if we want them to be successful. I couldn't agree more. Program objectives need to be clearly defined and shared with stakeholders and program participants so that everyone is marching in the same direction. If we don't understand and agree on our objectives, how can we establish requirements and metrics to measure what we're trying to accomplish?

Admittedly, the above principle is straight out of Project Management 101 and describes the initial steps you need to take in planning and documenting any project, not just those involving open data. Still, what I have noticed after involvement with many data-related projects is that there a…