Open Government, Open Data & Cybersecurity | An Interview with San Mateo's Open Data Manager
Today, Garrett Dunwoody brings us insight into open data and security at the local level for National Cybersecurity Awareness Month. Garrett spent a decade involved in the public sector geospatial space before taking on the role of Open Data Manager for San Mateo County, California.
Garrett is passionate about open government, still a relatively new idea for most public sector organizations. Therefore, I thought defining it would be a good place to start.
DLT: Let’s begin with basics: What are open data and open government?
Dunwoody: Open data is data that is accessible, re-publishable, and transparent to everyone; it’s one component of open government.
The open government philosophy is set to redefine the citizen/government interaction so that government can be more transparent and accountable about taxpayer money, more participatory by engaging with citizens to add collective value to government, and more collaborative through the use of innovative tools and methods to bind all levels of government with nonprofit organizations, businesses, and individuals.
Open government is the umbrella and open data is a spoke, though an important one.
DLT: How does an open data manager fit into government?
Dunwoody: That depends on the organization. For us, my role is as a thought leader, a strategist, and an implementer. I work across departments to gather buy-in and collect data. I also analyze costs, looking for a way to reach net-cost-zero with projects and promote the value of open data to various owners. I also oversee the creation of the tools and platforms that distribute the data. I am the facilitator of our open data program.
DLT: Your agency collects and produces a ton of data. How much emphasis do you place on cybersecurity?
Dunwoody: There are two parts to your question.
First, the data we collect is not security-sensitive because the data is public anyway. It’s data that if a citizen requests it, I have to give it to him or her. That doesn’t mean we don’t protect it though.
Which leads me to point two: We have a strict policy regarding which cybersecurity vendors we choose. Each company we vet has to fill out an eight-page cybersecurity questionnaire customized to our unique environment. These questions deal with redundancy, security, third-party audits, disaster recovery, and maintenance. You’ll hear me say this often, but there’s no one-size-fits-all solution to most IT.
We disperse data through various platforms. It’s important that we protect users using those platforms and our own internal data – things like performance metrics. We also have an IT Security Officer to ensure we remain secure.
DLT: I know your platforms are cloud-based. Do you have any cloud security concerns?
Dunwoody: We certainly keep that in mind. We vet any service vendor we use through the same cybersecurity process I mentioned above. It’s important that we meet consistent, strict guidelines across our assets.
DLT: Do you use open source? Are there any security concerns there?
Dunwoody: We do use open source software. Our vendor uses open APIs that use Java.
We, however, don’t publish any open source tools. Eventually, I’d like to see us do that and, in the spirit of open, we’d share that code through a service like GitHub. Open is not a set of boxes you check off; it’s a culture. Sharing what we’ve learned and developed is part of that mindset.
This is a bit of an aside, but as I see it, the evolution of government open data is in three phases.
Phase one is what I’ll call “DWYC” (Do What You Can). This is exactly as it sounds. For example, three years ago someone requested every single GIS resource we had. That’s a lot of data and it’s a lot more when you consider that this person could ask for it every day. So I created a download site and wrote a script to take that information and automatically publish it to that site. That’s a DWYC approach; it’s all in-house.
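To make the DWYC idea concrete, here is a minimal sketch of what such a publishing script might look like. It is not the county's actual script: the function name, the folder layout, the list of file extensions, and the plain-text index are all assumptions made for illustration.

```python
import shutil
from pathlib import Path

# Hypothetical set of file extensions treated as GIS resources.
GIS_SUFFIXES = {".shp", ".geojson", ".kml", ".zip"}


def publish_gis_files(source_dir, publish_dir):
    """Copy GIS files from source_dir into publish_dir and write a simple index."""
    source = Path(source_dir)
    target = Path(publish_dir)
    target.mkdir(parents=True, exist_ok=True)

    published = []
    for path in sorted(source.iterdir()):
        # Only regular files with a recognized GIS extension are published.
        if path.is_file() and path.suffix.lower() in GIS_SUFFIXES:
            shutil.copy2(path, target / path.name)
            published.append(path.name)

    # A plain-text listing so visitors can see what is available.
    (target / "index.txt").write_text("\n".join(published) + "\n")
    return published
```

Run on a schedule (for example, nightly), a script along these lines would keep the download site current without anyone having to answer the same request by hand.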
The next phase is “bought tech.” An organization recognizes the importance and benefits of open data so they go out and buy technology to help facilitate it.
Finally, an organization really recognizes the importance of open data and identifies that it needs customized solutions, so it enters the “venture” phase. It either creates its own apps or partners with companies to create unique software. This demands a bigger team and a bigger budget than the other two phases.
For instance, San Francisco worked with Yelp to integrate their health inspections with restaurant reviews. Instead of just reading what people think of the food, a person can now find out what the Department of Public Health thinks about the restaurant. New York City has taken it a step further and is dedicating tens of millions of dollars to build their own tools that combine multiple data sets across the entire city.
Within a year, we’ll begin creating our own tools.
DLT: Have you seen an increase in cyber attacks since moving to open data?
Dunwoody: Yes, but we only began releasing these platforms at the beginning of the year so we don’t have exact trends to report – yet. What we did see was an immediate spike in attacks, but over the months, the level of attacks evened out.
It’s critical that an organization recognizes that with a larger data landscape, you will attract more attacks. With the help of our customized questionnaire and vetting process for vendors, we’ve never had a problem or breach.
DLT: Are you addressing any security concerns with citizens?
Dunwoody: We did see early risk aversion from not only citizens, but from internal parties too, people from various departments. But that’s to be expected.
We address potential concerns by first asking ourselves questions like “Would this project expose us?” and “Is the data relevant?” By doing this, we hopefully anticipate what anxieties people will bring up. That makes implementation easier. We also explain the benefits and note the safety measures we’ve taken. In an open government and open data system, open communication is key.
Again, the data we currently release is mostly low risk so the risks for agencies and citizens are low. I recommend this method of slowly introducing the idea of open data, rather than immediately tackling high-risk data sets that could lead to citizen complaints.
The evolution of open data, I believe, should go as follows: city/community based data first. This kind of data is what you would find in the census; they are snapshots. Then we’d get personal, providing customized data. Finally, we’d begin collecting and creating new types of data ourselves. We’re currently somewhere between phase one and two.
DLT: You mentioned that there is some data that you currently collect that may be private. Can you give an example of that and how you keep that data secure?
Dunwoody: Sure. We developed what we call Open Checkbook. The platform allows us to share all county expenditures over $5,000. The benefit to citizens is government transparency. Now they know exactly where their tax dollars go and to what vendors. We even take screenshots of the exact check we write.
However, there are certain privacy concerns we have to keep in mind. For instance, we would not want to publish the name of a minor. We protect information like that through machine processing. We have scripts that run through the data and scrub it for privacy concerns. Then we have a second level of security called redaction. This step takes the information the first phase identifies as sensitive and removes those elements.
To extend our original example, we still take a screenshot of the check; however, the second phase removes the minor’s name.
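The two-phase process described here (first identify what is sensitive, then redact it) can be sketched roughly as follows. This is an illustration under stated assumptions, not San Mateo's implementation: the `Payment` record, the `payee_is_minor` flag, and the `[REDACTED]` placeholder are hypothetical, and the interview does not say how sensitive fields are actually detected.

```python
from dataclasses import dataclass


@dataclass
class Payment:
    payee: str
    amount: float
    # Hypothetical flag; the source does not describe how minors are identified.
    payee_is_minor: bool


def flag_sensitive_fields(payment):
    """Phase one: return the names of fields that should not be published."""
    flagged = set()
    if payment.payee_is_minor:
        flagged.add("payee")
    return flagged


def redact(payment, flagged):
    """Phase two: build a publishable record with flagged fields blanked out."""
    record = {"payee": payment.payee, "amount": payment.amount}
    for name in flagged:
        record[name] = "[REDACTED]"
    return record


def prepare_for_publication(payment):
    """Run both phases in order and return the record that is safe to publish."""
    return redact(payment, flag_sensitive_fields(payment))
```

Keeping detection and redaction as separate steps means the rules for what counts as sensitive can change without touching the code that removes it, and the flagged fields can be reviewed before anything is published.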
The key takeaways for me are:
- You can’t rely on all cybersecurity vendors to offer solutions that fit your IT environment. By creating a simple, customized questionnaire, you can save yourself from a lot of pain further down the road.
- Open data doesn’t mean you’re opening your systems up for hacks. Start with public data then scale your operations; however, expect resistance as you do so. Taking time to anticipate colleague and citizen concerns will help you navigate them.
- Before releasing data, ensure that you have set up security redundancies. Measure twice, cut once.
If you enjoyed this conversation and want to know more, be sure to check back with us in November when we continue our open government discussion with Garrett. We’ll even tap into his geospatial information systems background to learn how San Mateo applies geospatial technology. You can also follow him on Twitter at @smcOpenData.