The Future of Open Data


The Future is Here

You are the future of open data. Last year I was the keynote speaker at the first open data day in Cary. Last year we had a third of the audience that is here today. Now Triangle Open Data Day does not need open data personalities. This event happened because city and county leaders came together with you. You represent the thinkers, the doers, the “why not?” hackers and the storytellers who will take the data unlocked by people like me. You get to decide the future of open data. You will get to create the products and collaborate with government in delivering value through open data. But more of this story will be presented later. I would like to introduce Gavin Starks, CEO of the Open Data Institute. Sir Tim Berners-Lee is the President of the Open Data Institute and the Director of the World Wide Web Consortium, as well as the first thought leader to imagine open linked data on the web, back in 1989. Listen to Gavin talk about the “what’s next” in open data.

How We Got Here: The Rise of Standards

The story of open data is similar to the story of the World Wide Web. Not only do the World Wide Web and the open data movement share a thought leader, Sir Tim, they also share a story about the necessity of standards and design patterns as part of their maturity models.

In 2012 I was finishing a career as a website services manager that stretched back to 1997. In that time I had seen the World Wide Web rise from a small academic set of linked nodes into a tool that has become as indispensable as a phone or electricity.

Tim Berners-Lee wanted to communicate data to his colleagues when he was a researcher at CERN in the late 1980s. Markup languages had been around for some time by then, but he wanted a light, flexible language with which to create linked data.

In 1992 I got to see the first browser while I was a teaching assistant at Washington State University. By 1999 I was two years into a career as a dotcom developer, and by 2000 I had been laid off four times as the WWW economy imploded. I went through five years of self-employment and start-ups, and for a time I had doubts as to the viability of the web. Regulations and standards were slow to be created and even slower to be adopted within browser communities. With the rise of Google-enhanced findability and the acculturation of usability and design patterns came the rise of standards. Even Microsoft had to comply with W3C standards in the end, out of the need for interoperability, starting with IE9.

In 2005 I began my career in government, managing web and accountability reporting on the web for Durham Public Schools. I never forgot the academic roots of the WWW, but for me the web had become a business tool. For Tim Berners-Lee, however, the WWW was always about linked open data. For Sir Tim, the web should be semantic and should be machine readable.

In early 2012 I blogged about Raleigh’s Open Data Resolution; by the fall I was tasked with creating Open Raleigh. Today I manage one of the most successful open data programs for a city of our size. I am part of the Open Data Institute, and I am on the ground floor of a movement that had its origins way back in a CERN computing lab.

I also had my doubts about open data. In 2007 there was a meeting in Sebastopol, California concerning the principles of open data. These principles are found more or less intact within the OKFN and the ODI. They are the principles by which Open Raleigh is governed.

Like all things shiny and new in the IT world, Gartner’s Technology Hype Cycle applies to open data. In 2010 there was hope and optimism surrounding Data.gov and the transparency movement at the federal level. By 2011 it was apparent that the White House had failed to acculturate the idea of “open” at the agency level. This failure was rooted in a lack of understanding of how data is traded internally within government. The 2011 budget for open data, and for Data.gov in particular, was slashed, and open data seemed to be going the way of the Semantic Web.

The Gartner Technology Hype Cycle

In spite of these setbacks the open data movement began to rumble from the ground up through local open data resolutions, meetups, hackathons and municipal open data programs in larger cities such as Seattle, San Francisco, Chicago and New York. Open data refuses to die. In the end, open data is a “bottom up” movement that is decentralized and growing in power every quarter. If 2011 was the “Trough of Disillusionment” then 2012 through 2014 can arguably be called the “Slope of Enlightenment” era. In October 2012 I attended a Red Hat event to celebrate its new Raleigh headquarters. While I was there I was wearing my name badge with my job title: Open Data Program Manager, City of Raleigh. I was asked several times, “What is open data?” I don’t get asked that question much anymore. In 2013, at the first US Regional Data Jam in Raleigh, White House CTO Todd Park announced that open data is indeed more successful using a “bottom up” model based on municipal open data.

Where are we Going?

There are standards all around us and common tools we have all adopted for the sake of convenience. Standards usually start in the private sector and slowly saturate the market until either a new standard is adopted or the ecosystem for a standard is so overwhelming that other technologies arise to conform to it. There are a few exceptions, including GPS. The current slope of enlightenment is made up of early adopters experimenting and learning together.

Technology Adoption Maturity Model
Everyone in this room or reading this on my blog is an early adopter. You will see that the chasm of disillusionment fits nicely with early adopters paving the way for the “early majority” on the road to the nirvana of the “plateau of productivity”.

How do we get there? What is the force that drives open data, and how will open data evolve to be as indispensable as our smartphones and web tools?

The Future is About Standards…Sexy Standards

Companies introduce software products that create artifacts such as MP3s, Word and PDF docs. These artifacts are then shared with colleagues, business associates and friends, who then upgrade to the latest piece of software to consume these artifacts.

Remember Flash? At one time it was the standard for online video as well as animation. YouTube started with Flash-driven video content. Other standards arose that were more efficient. Products such as the iPhone and iPad refused to support Flash.

Some standards evolve rather than die. Microsoft Office formats are proprietary and not open source. Though open standards exist, Microsoft continues to evolve these proprietary standards, and tools often co-evolve to include them. Socrata, our open data platform, accepts .xls and .xlsx formatted spreadsheets and converts them into CSV. There is no central governance around these standards or who should adopt them. Certainly there is no national policy regarding document formats such as PDF, .doc, .xls and .odt. Even without a policy, one continues to find data stored within these formats throughout government.

So, one could argue that standards within technology often arise from adopting a software product. The standards arise when the product offers enough features and has a large enough audience to make not owning that product an inconvenience.

Herein lies the future of open data. Products will drive open data adoption. There are early attempts already with the Microsoft and Socrata collaboration. Socrata has released a data tool into the Windows 8 app marketplace. Android and Apple both offer an open source Open Data Toolkit app. This won’t happen with products alone. Again we can use Apple products as an example. The iPhone and iPad are platforms. These platforms are arguably on par with other platforms. There is no real qualitative difference between one communications platform and another. What makes the iPhone and iPad so powerful are the applications that one can download to access…wait for it…data. Data comes in gaming applications as well as business applications.

What is next in open data? The rise of standards through reuse is the next step. Recently Vanderbilt University launched the POISE National Science Foundation Project. Raleigh was selected to be a participant. POISE is a shared data platform used by six cities to create a much larger data ecosystem. In short, POISE represents the emerging trend of regionalism and the power of the cultural versus the jurisdictional data ecosystem. Imagine one day there is not just Open Raleigh. Imagine all of the municipalities in the Triangle interconnected and sharing data. The barrier to entry for creating apps in an open government data marketplace would drop dramatically. Suddenly we would see a larger number of software applications across a variety of platforms that could give rise to an economic surge through technology products powered by large, interoperable open data markets.
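
To make the idea of interoperable regional data a little more concrete, here is a minimal Python sketch of what a shared standard buys an app developer. It is only an illustration under stated assumptions: the sample permit records, the column names and the shared schema are all invented for this example, and it does not describe how POISE or any real city portal works. Two cities publish the same kind of data with different column names, and a small field map lets one piece of code treat them as a single regional dataset.

```python
import csv
import io

# Hypothetical sample exports from two cities. The records and the
# column names are invented for illustration only.
RALEIGH_CSV = """permit_id,issued,work_type
R-1,2014-01-15,Roofing
R-2,2014-02-03,Electrical
"""

DURHAM_CSV = """PermitNumber,IssueDate,Category
D-9,2014-01-20,Plumbing
"""

# Assumed shared schema: shared field name -> each city's own column name.
FIELD_MAPS = {
    "Raleigh": {"id": "permit_id", "issued": "issued", "type": "work_type"},
    "Durham": {"id": "PermitNumber", "issued": "IssueDate", "type": "Category"},
}


def normalize(city, csv_text):
    """Rename one city's columns to the shared schema and tag each row."""
    mapping = FIELD_MAPS[city]
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        {"city": city, **{shared: row[local] for shared, local in mapping.items()}}
        for row in rows
    ]


def combine(sources):
    """Merge every city's normalized rows, ordered by ISO issue date."""
    combined = []
    for city, text in sources.items():
        combined.extend(normalize(city, text))
    return sorted(combined, key=lambda record: record["issued"])


regional = combine({"Raleigh": RALEIGH_CSV, "Durham": DURHAM_CSV})
for record in regional:
    print(record)
```

If every municipality already published in the shared schema, the field map would disappear and the same app would work across the whole region with no extra effort, which is the drop in the barrier to entry described above.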

What is next in open data? You tell me. You are the next stage in open data. Take our data. Use our data as infrastructure. Treat government data as a strategic asset paid for by the people and made available for the people. Be the person, team or company that builds what is next in open data.

