USAID DataJam Notes from the Field


Todd Park at USAID DataJam December 10th, 2012

I was honored to participate in the Global Development Data Jam at the Eisenhower Executive Office Building at the White House. The group in attendance represented several NGOs, private sector organizations and federal agencies associated with USAID.

The theme of this Data Jam was using open data to make program spending more effective and to bring transparency to USAID's foreign spending.

USAID DataJam Themes included:
The future of the data economy driven by open data and big data
Open data as infrastructure
The correlation between lack of data access and poverty
Open data as the next big economic multiplier
Crowdsourcing data cleansing activities
Open data is not automatically usable data
Social media for situational awareness and crisis management
Speaker Notes
Todd Park: U.S. Chief Technology Officer, The White House
What is the next big thing? GPS was released in the early nineties, which led to billions of dollars flowing into the economy from mapping technologies and mapping data.
Think about the data sets that will lead to economic development.
Demographic and health surveys
Maura O’Neill: Chief Innovation Officer, USAID; Nate Manning: White House Innovation Fellow
Data by design:
Data is information capital
Data is infrastructure
Data by Design means data in machine-readable format
Openness = economic multiplier effect:
iHUB in Kenya enables entrepreneurs and “doers” to create wealth through the use of data and applications
Open data = collaborative development
Sharing data is “empathy in practice”
Project level data can be collected via cellphone
Tertiary data such as health surveys
Example use of data for economic benefit:
Tea industry measuring climatic risk using NOAA data sets.
Using NOAA data to predict frost and avoid crop loss.
Stephanie Grosser, Communications Specialist, USAID, and Shadrock Roberts, GIS Analyst, USAID’s GeoCenter
Engaging the crowd toward open development: crowdsourcing

Crowdsourcing for data cleansing at

300 Volunteers
10k records
16 hours
85% accuracy. This accuracy level was higher than that of the data cleansing algorithm developed to automate the job.
Cleaning data is the biggest issue in any data initiative.

Patrick Meier, Director of Social Innovation at QCRI
Ways to deal with crisis management using crowdsourcing and social media data.

The Typhoon Pablo UN map was the first map made entirely of social media data. Collects and filters all social media content related to an event or geo-location. Can be mapped to help with crisis management.
Twitter heat-map
Crowd map
Collecting eyewitness accounts
USAID is using crowdsourcing and social media mining to create maps to manage crises.

Data to create these crisis maps is readily available.

Policies should be driven by real-time data.

Emmanuel Kala, Ushahidi
GRIT: Try, Fail, Learn, Succeed

Ushahidi is open source software for information sharing, mapping and visualization

SwiftRiver: crowdsourcing the filter.

SwiftRiver is a product from Ushahidi
SwiftRiver solves the “burning house” problem: how to find the single point of data in a firestorm of data?
Umati:

Monitors hateful speech.
The Umati product is used to control election violence in Kenya.
USAID Issues
From Todd Park: "What is the next GPS of development? What are the vital datasets that should be broadly available to enable innovative solutions?"

Examples using the GPS analogy include:

Location-enabled services

Data are essential infrastructure for development. Making data broadly available will speed up an evidence-based process of planning, implementing, measuring, and adjusting. Engaging the crowd to clean or digitize datasets, map infrastructure, and do other related tasks can be very successful, and all the tools needed are available.

Examples of crowdsourcing data cleanup:

USAID cleaned 10,000 records in 16 hours with 300 volunteers at 85% accuracy
Ushahidi's SwiftRiver enables users to let the crowd filter and verify data and organize and present the results.

Existing social media and mobile phone usage data can be mined for early detection, real-time feedback in disaster assessment and prediction of health trends (flu trends, food prices).

UN Global Pulse's Robert Kirkpatrick showed a number of great examples, including many from developing countries.

100 million mobile users in Nigeria
100,000 new Facebook users per month in Senegal
Jakarta is one of the world's "tweetiest" cities
24% of residents in Mogadishu check into Facebook at least once a month

More organizations and platforms provide comprehensive access to their data. Examples include:

UNDP (last month)
Millennium Challenge Corporation
Foreignassistance.gov

Open data can be done anywhere. Development Seed's Eric Gundersen featured an open data platform for election data in Afghanistan.

Decisions
Development funding needs more coordination.

AidData and partners geocoded all 550 current development projects in Malawi, with a total volume of $5.6bn.
The World Bank is mapping and sharing data for its project portfolio (Mapping for Results). More countries and donor agencies should do the same.

Without data scientists, you can "share data until the cows come home" without results, as DataKind founder Jake Porway put it.

Lack of data scientists is a key issue in development that is only partly mitigated by organizations like DataKind.
We need more training in statistics for civil society groups, journalists, and others.
USAID Administrator Rajiv Shah’s summary: The single biggest thing we can do to eradicate poverty is to provide data access and data analysis. Open data are turning into essential infrastructure for development. Events like the Global Development Data Jam help connect people, organizations, and fields within the development arena.

