With our open government narrative, we’ve been pointing out flaws in the ways our public records laws are put into practice, and discussing how an open data policy would allow San Diegans to extract useful information from public records.
Open government reforms could take a while to crystallize here, but we want to provide a glimpse into the future by unlocking data from documents that aren’t easy to analyze in their current form.
Let’s start with a report on ethics hotline complaints that the San Diego Unified school board’s audit and finance committee released last month in a form that is far from open.
The hotline, which has been live since 2006, is the place where school district employees and the public can confidentially report allegations of wasteful spending, theft of district funds and misconduct.
Problem 1: The document is not searchable
Public records can be lengthy and dense. And sometimes all you want to see is one specific thing you’re interested in. In this case, one of those interesting things is fraud.
If San Diego Unified had made this document searchable before posting it online, I would be able to use a simple keyboard command to find the word “fraud.”
But I can’t do that with this document in its current form.
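To show how little it takes once a document's text is actually available, here is a minimal sketch in Python of the kind of search a keyboard command performs. It assumes the text has already been extracted page by page. The `find_word` helper and the sample pages are hypothetical, invented for illustration, and are not taken from the district's report.

```python
import re


def find_word(pages, word):
    """Return (page_number, line) pairs for every line that contains the word.

    The match is case-insensitive and limited to whole words.
    """
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    hits = []
    for page_number, text in enumerate(pages, start=1):
        for line in text.splitlines():
            if pattern.search(line):
                hits.append((page_number, line.strip()))
    return hits


# Hypothetical page text, invented for illustration; not from the report.
sample_pages = [
    "Hotline summary\nCategory: Theft of time",
    "Category: Fraud\nCategory: Policy issues",
]

hits = find_word(sample_pages, "fraud")
for page_number, line in hits:
    print(f"page {page_number}: {line}")
```

None of this works on a scanned image, which is the point: the search is trivial, but only if the text is there to be searched.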
Problem 2: It’s still not data
There are ways to make documents searchable after they’ve been published, but the process can be time-consuming and difficult.
In this case, the fix was easy enough. I told my document reader to “recognize” the text. But I couldn’t easily copy the information I wanted into a spreadsheet program that would enable me to see how different chunks of information relate to one another.
The hotline complaint categories were ordered alphabetically. But maybe I want to arrange the number of complaints in each category from largest to smallest so I can see what the biggest problem areas are. Or maybe I want to view the complaints as proportionate slices of a pie.
There are programs that will do that for me, but again, the costs and the amount of time it takes to make that happen vary. And that’s why open data advocates are calling on public agencies to save you the headache by releasing information in a common, easy-to-use form.
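If the complaint counts were released as plain data, the sorting and pie-slice math described above would take only a few lines. The sketch below, in Python, is one way to do it. The category names come from the report, but the counts are hypothetical placeholders rather than the district's actual figures, and `rank_with_shares` is a name made up for this example.

```python
# Hypothetical counts, invented for illustration; NOT the figures in the report.
complaints = {
    "Fraud": 4,
    "Other": 10,
    "Policy issues": 6,
    "Theft of time": 5,
}


def rank_with_shares(counts):
    """Sort categories from largest to smallest and add each one's share of the total."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [
        (category, count, round(100 * count / total, 1))
        for category, count in ranked
    ]


ranked = rank_with_shares(complaints)
for category, count, share in ranked:
    print(f"{category:<15}{count:>4}{share:>7}%")
```

The percentages are the same numbers a charting program would use to size the slices of a pie.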
Problem 3: It’s not clear what the categories mean
Nothing in the tables that San Diego Unified released tells us precisely what the categories mean. We can easily guess what they mean for designations like “theft of time.” But what about “policy issues” and “other”? And what kinds of fraud are they talking about?
Since they’re referring to abstract concepts here, a glossary of terms would help. To their credit, the district’s auditors give a few examples of fraud on their website, but the hotline report doesn’t direct the reader there.
Solution: Sometimes you have to break the data open manually
There wasn’t a whole lot of information in the ethics hotline report that San Diego Unified released, so I put the statistics into a spreadsheet program, and then used a free program to make some interactive graphics.
Here’s what I did with the 32 cases that San Diego Unified School District’s Office of Audits and Investigations completed from July 2012 to June 2013:
And here’s what I did with the 123 cases that remained open as of Sept. 4, 2013:
We don’t have a complete picture of what’s happening with San Diego Unified’s ethics hotline here, but we’re further along than we would have been otherwise.