On Saturday 6th March 2021, the eleventh Open Data Day took place with people around the world organising hundreds of events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.
This event received an Open Knowledge Foundation mini-grant thanks to support from Microsoft.
At Code for Pakistan, we gathered 10 volunteers to clean and publish data related to the Billion Trees Afforestation Project (Phase II). We chose this project because it is one of the biggest initiatives launched by the current government to tackle climate change, yet it has also come under scrutiny due to corruption allegations.
The event started with the team evaluating the data available in the third-party monitoring report published by WWF. After understanding the datasets and how they relate to each other, the team came up with a plan to share a consolidated dataset containing the location of each nursery and the total number of tree saplings sown in that area. With this subset of data, the team was only able to map around 12.6% (roughly 126 million) of the 1 billion trees planted, as the plantation figures for the rest were not published in the report. The motivation for working on this subset was to publish something meaningful for developers and to plot the available data on a map so that it is understandable for the public.
The volunteers were divided into three teams. The first team was responsible for separating and cleaning the datasets. The second team combined the data from different tables, processed it, and converted it into a machine-readable format. The third team worked on a public-facing dashboard where the coordinates were to be mapped.
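For readers who want to try a similar workflow, the cleaning and combining steps could look roughly like the sketch below. It is only an illustration, not the code the team wrote: the column names (`nursery_id`, `district`, `latitude`, `longitude`, `saplings`) and every row are hypothetical placeholders rather than values from the WWF report, and it assumes the tables have already been extracted from the PDF into rows of text.

```python
import csv
import io

# Hypothetical placeholder rows; these are NOT values from the WWF report.
nursery_locations = [
    {"nursery_id": "N-001", "district": "District A", "latitude": " 34.01 ", "longitude": "71.52"},
    {"nursery_id": "N-002", "district": "District B", "latitude": "34.75", "longitude": "72.36"},
    {"nursery_id": "N-003", "district": "District C", "latitude": "", "longitude": ""},
]
sapling_counts = [
    {"nursery_id": "N-001", "saplings": "25,000"},
    {"nursery_id": "N-002", "saplings": "40,000"},
]

FIELDS = ["nursery_id", "district", "latitude", "longitude", "saplings"]


def clean_number(text):
    """Strip whitespace and thousands separators; return None for blank cells."""
    text = text.strip().replace(",", "")
    return float(text) if text else None


def consolidate(locations, counts):
    """Join both tables on nursery_id, keeping rows with coordinates and a count."""
    count_by_id = {row["nursery_id"]: clean_number(row["saplings"]) for row in counts}
    merged = []
    for row in locations:
        lat = clean_number(row["latitude"])
        lon = clean_number(row["longitude"])
        saplings = count_by_id.get(row["nursery_id"])
        if lat is None or lon is None or saplings is None:
            continue  # skip rows that cannot be placed on a map
        merged.append({
            "nursery_id": row["nursery_id"],
            "district": row["district"].strip(),
            "latitude": lat,
            "longitude": lon,
            "saplings": int(saplings),
        })
    return merged


def to_csv(rows):
    """Serialise the consolidated rows as CSV text, a machine-readable format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = consolidate(nursery_locations, sapling_counts)
csv_text = to_csv(rows)
print(csv_text)
```

Dropping rows without coordinates or counts is a design choice made here for simplicity; a real project may prefer to keep them and flag the gaps, since missing figures are themselves a transparency finding.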
At the end of the event, the team was able to publish the dataset (https://www.kaggle.com/aliirz/billion-tree-tsunami) as well as the map of all the coordinates through ArcGIS (https://arcg.is/1jmbjm). This proof of concept will now be shared with relevant government agencies working on the climate and environment to open further datasets that can make this initiative more transparent and accountable.
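One common way to prepare point data like this for a web map is to convert it to GeoJSON, a format that ArcGIS Online and most other mapping tools can import. The sketch below is an assumption about how that step might be done, not a description of how the team built its map; the records and field names are hypothetical placeholders.

```python
import json

# Hypothetical placeholder records; these are NOT values from the published dataset.
records = [
    {"nursery_id": "N-001", "latitude": 34.01, "longitude": 71.52, "saplings": 25000},
    {"nursery_id": "N-002", "latitude": 34.75, "longitude": 72.36, "saplings": 40000},
]


def to_geojson(rows):
    """Build a GeoJSON FeatureCollection with one Point feature per nursery."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            # GeoJSON stores positions as [longitude, latitude], in that order.
            "geometry": {
                "type": "Point",
                "coordinates": [row["longitude"], row["latitude"]],
            },
            "properties": {
                "nursery_id": row["nursery_id"],
                "saplings": row["saplings"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


geojson = to_geojson(records)
geojson_text = json.dumps(geojson, indent=2)
print(geojson_text)
```

The longitude-first ordering is the detail most likely to trip people up: swapping the two values silently places points in the wrong part of the world.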
Volunteers from @CodeforPakistan working on opening up environment data on this #OpenDataDay. More details coming up soon! #OpenDataDay2021 #ODD2021 pic.twitter.com/Rg2GlysDmR
— Code for Pakistan (@CodeforPakistan) March 6, 2021
Play smart: To get started with an open data project, there is already a great deal of data available online in legacy formats that can be used. This data is often in the form of PDF reports published either by the agencies themselves or by third parties.
Gain trust: This raw data can be used to develop a proof of concept and can then be shared with the relevant agencies to get them excited. The proof of concept can also be handed over to the agencies to explore other possibilities.
Open data: Once these agencies see value in opening up the data and how it can be used to achieve their organisational goals, they will come on board and provide access to the required data. That is where the real magic happens!
It is always a pleasure to see so many like-minded people coming together to show the importance of open data and how it can be used to fight the biggest challenges faced by the public sector across the globe. The sense of having a community working towards the same goal as yours is priceless.
Source of Data: https://wwfasia.awsassets.panda.org/downloads/btap_monitoring_report_phase_ii.pdf
Map: https://arcg.is/1jmbjm
Datasets: https://www.kaggle.com/aliirz/billion-tree-tsunami
Code: https://github.com/codeforpakistan/Open-Data-Day-2021