Twenty people gathered at a good work room provided by Odd-e on Connaught Road West. Very nice of Odd-e to donate the space. Four of us pitched 4 issues and we quickly formed groups. Take a look at the hackpad created before, during and after the Open Data Day.
- List of vacant primary and secondary schools in Hong Kong: how many are there, and how could the public more easily find out what is going on with a vacant school? (3 people)
- Cops / Bad Cops – reporting to the police / reporting by the police (3-4 people)
- Impact of Environmental Policy on actual air quality improvements (6-7 people)
- A crowdsourcing database of missing people in Hong Kong (4-5 people)
- Pay-walls and how they enable misrepresentation and possible fraud (submitted after the day)
Sammy Fung, the day’s organizer, was a floater during the day. We are all very grateful for Sammy’s efforts to make the day possible.
We were well supplied with coffee, drinks and snacks, and we worked throughout the day. There were some exchanges between the groups on ‘how to do x or y’. Generally, though, the conversations stayed within the groups until the end-of-day presentations. Good progress was made by 6pm. Below are my observations and short summaries of what was accomplished during the day. These are drafts. I’ll very likely change them once I have feedback. There are photos and other details on the Open Data Day Hong Kong 2016 Facebook page.
The theme connecting the 4 issues is ‘lack of trust’. There was no plan or prior discussion on the issues to be used for the Open Data Day. It’s an indication of how one group of HK citizens feels. Who are these people? Young, middle-aged and frankly old. Male and female. All have some sort of technical, statistical or analytical experience or inclination. Twenty people’s views cannot be extrapolated out to 7 million. How much trust do these 20 HK citizens have in the HK Government? Not much is the answer. Based on the tentative data gathered and analysed around the 4 issues, the HK public should listen carefully and ask questions when the HK Government claims it’s doing its best to be open, transparent and truthful.
(1) Take a look at the Public Accounts Committee of the 5th Legislative Council, P.A.C. Report no. 65, part 8, chapter 3, page 135, clause 25. The EDB admits that, in answering a Legco member’s question, it misrepresented the total number of vacant schools. The EDB responded there were 108 vacant schools when there were 234 vacant schools at that time, 2015-2016. How many vacant schools are there, and can we verify school vacancy without relying solely on data supplied by HK government departments?
Ans: The Education Bureau has a website of ‘all schools’ with 11,077 entries and a dataset of ‘all schools’ with 3,507 entries. Why such a huge difference? The answer turns on when a school is a school. All 11,077 are schools. All 3,507 are schools. The EDB’s definition of ‘school’ is both vague and precise. The discrepancy arises because the bigger list includes all of the tutor, cram or learning centre schools and other special schools, whereas the shorter list covers only kindergartens, primary and secondary schools. The numbers never quite match up. The shorter list provides the longitude and latitude of each ‘school’, and these can be plotted on Google Maps. Text analysis of the descriptions from Google Maps may reveal whether each site is currently a ‘school’. Automating the process is the key challenge. After a few cycles the number of vacant schools should become clearer. We don’t know for sure, but it’s likely many more than the 29 the Education Bureau reported on 17 February 2016. More will follow …
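The list comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the group's actual code: the field names `name` and `category` and every row below are made-up placeholders, and the real EDB files will need their own column names and a more careful matching key.

```python
from collections import Counter

def reconcile(website_rows, dataset_rows, key="name"):
    """Split the website list into rows that also appear in the dataset and rows that do not."""
    # Normalise the key so stray spaces and capitalisation do not block a match.
    dataset_keys = {row[key].strip().lower() for row in dataset_rows}
    matched, unmatched = [], []
    for row in website_rows:
        if row[key].strip().lower() in dataset_keys:
            matched.append(row)
        else:
            unmatched.append(row)
    return matched, unmatched

def count_by_category(rows, field="category"):
    """Count rows per category to see what kind of 'school' explains the gap."""
    return Counter(row.get(field, "unknown") for row in rows)

# Placeholder rows for illustration only (not real EDB records).
website_rows = [
    {"name": "Example Primary School", "category": "primary"},
    {"name": "Example Tutorial Centre", "category": "tutorial"},
    {"name": "Example Kindergarten", "category": "kindergarten"},
    {"name": "Example Language Centre", "category": "tutorial"},
]
dataset_rows = [
    {"name": "example primary school ", "category": "primary"},
    {"name": "Example Kindergarten", "category": "kindergarten"},
]

matched, unmatched = reconcile(website_rows, dataset_rows)
print(len(matched), "matched;", len(unmatched), "only on the website")
print(count_by_category(unmatched))
```

Matching on a normalised name is the weakest part of this sketch. If both sources carry a school registration number, that would be a safer key.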
(2) The HK Police release the glowing letters of praise they receive. The HK Police seem less inclined to release the letters of complaint they receive. However, some letters of complaint are released. What does text analysis of the praising and complaining letters released to the HK public reveal?
Ans. The team managed to completely automate the collection and the analysis process. This is impressive for a day’s work. The interpretation is coming. More will follow …
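The team's pipeline is not described here, so the following is only a minimal sketch of one simple form of text analysis: finding the words that are relatively more frequent in one set of letters than in the other. The stopword list, the function names and the four sample letters are all invented for illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption; a real analysis would use a fuller one).
STOPWORDS = {"the", "a", "an", "and", "to", "of", "for", "was", "i", "my", "in", "is", "very"}

def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def relative_freq(letters):
    """Return each word's share of all words across a group of letters."""
    counts = Counter(w for letter in letters for w in tokenize(letter))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def distinctive_words(group_a, group_b, top=3):
    """Words whose relative frequency is highest in group_a compared with group_b."""
    fa, fb = relative_freq(group_a), relative_freq(group_b)
    diffs = {w: fa.get(w, 0.0) - fb.get(w, 0.0) for w in set(fa) | set(fb)}
    # Sort by largest difference first, then alphabetically so ties are stable.
    return sorted(diffs, key=lambda w: (-diffs[w], w))[:top]

# Invented sample letters, not real correspondence.
praise = ["The officer was helpful and polite.", "Thank you for the helpful officer."]
complaint = ["The officer was rude and slow.", "My report was ignored."]

print("More typical of praise:", distinctive_words(praise, complaint))
print("More typical of complaints:", distinctive_words(complaint, praise))
```

With only the released letters to work from, any result says as much about which letters get released as about what the public writes.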
(3) We are told by various government groups and NGOs that the air quality has improved over the past 10 years. Does collecting and analysing the daily air quality data released to the public reveal a statistically significant improvement in the air quality over a period of time?
Ans. The data was collected and analysed. There is measurable change. Is it statistically significant change or simply random? More will follow …
(4) Hong Kong is a city with millions of people living and passing through every year. Some people go missing. Could using crowd-sourcing along with the HK Police missing persons site help family and friends find the missing people?
Ans. The group mixed and matched data scraped from public websites. The HK Police claimed ‘personal privacy’ concerns when a member of this group enquired a few days before Open Data Day. What will be the reaction to scraped data combined with details supplied by family and friends? More will follow …
(5) David Webb, an investor activist in Hong Kong, provided ‘Deception behind the Companies Registry paywall’ for the Open Data Day. David writes on his useful, free, open and transparent Webb-site Reports,
“On International Open Data Day, we reveal a network of knock-off companies using the CIBC, Credit Suisse and BNP brands, based in HK with subsidiaries in the UK and New Zealand. If those registries were not free and open, the deception would remain undiscovered. We call on HK Registrar Ada Chung to tear down this paywall.”
Hong Kong, with its massive government reserves, collects small amounts of money from its citizens through this paywall. Why go to such trouble for unneeded revenue?
The Hong Kong Government’s Office of the Chief Information Officer hosts the Data 1 site, DATA.GOV.HK. The webpages do look much better than they did a few years ago. The datasets seem about the same but a careful comparison may reveal improvements. I note there are 15 applications ‘showcased’ as examples of:
“creative web and mobile applications and solutions developed by the Government and community with DATA.GOV.HK datasets. These examples will demonstrate the potentials of the public sector information provided in digital formats.” (from Applications)
All are interesting applications of open source public data. However, fifteen applications seem rather paltry. Why can’t the OGCIO provide a more comprehensive list of the applications that use these datasets? Let’s hope for improvement by Open Data Day Hong Kong 2017.