As data-sharing becomes more crucial, agencies say industry can help with privacy issues



Agencies like the Census Bureau want better commercial off-the-shelf (COTS) technologies for protecting data privacy and computation, so they can securely link datasets and make predictions about the coronavirus pandemic.

The bureau launched two new surveys and an interactive data hub in April to begin filling holes in the government’s understanding of COVID-19’s social and economic impacts. But surveys take time and offer only a snapshot of the population, when the bureau could be linking its own data with data text-mined from emergency room visits.

If industry could provide a better tool for securing the environment in which data is stored and analyzed, ensuring trust, then more datasets could be linked, painting a comprehensive geographic and economic picture of the virus, said Cavan Capps, the big-data lead at the Census Bureau, during a Data Coalition webinar Wednesday.

Linking hospital administrative data to cell phone data, as Apple and Google wanted, would lead to very efficient contact tracing, but there’s not enough public trust in current technology, Capps said.

“When we’re actually making decisions, when we’re running these models, when we’re tracking people, do you want any individual to basically sign a piece of paper and say, ‘I promise I won’t tell anyone about you?’” Capps asked. “Or would you rather have more rigorous mathematical protections?”

Currently there is no “silver bullet” solution, said Lynne Parker, White House deputy chief technology officer. She pointed to several reasons: Data de-identification can be accidentally undone when the scrubbed data is combined with other sources of information. Data aggregation limits analytics. Simulating data raises concerns about accuracy and reverse engineering, while homomorphic encryption, which allows data to be mined without sacrificing privacy, hurts performance and speed.
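To illustrate how de-identification can be undone, here is a minimal sketch of a linkage attack in Python. Every record, field name, and helper function below is hypothetical and invented for illustration; none of it comes from the webinar or from any agency dataset. The sketch assumes the scrubbed data still carries quasi-identifiers (ZIP code, birth year, sex) that also appear in a second, named source.

```python
# Hypothetical "de-identified" hospital records: names removed,
# but quasi-identifiers (zip, birth_year, sex) left in place.
scrubbed_visits = [
    {"zip": "20746", "birth_year": 1961, "sex": "F", "diagnosis": "pneumonia"},
    {"zip": "20746", "birth_year": 1984, "sex": "M", "diagnosis": "fracture"},
    {"zip": "20233", "birth_year": 1961, "sex": "F", "diagnosis": "asthma"},
]

# Hypothetical second source that pairs names with the same attributes.
public_roll = [
    {"name": "Resident A", "zip": "20746", "birth_year": 1961, "sex": "F"},
    {"name": "Resident B", "zip": "20746", "birth_year": 1984, "sex": "M"},
    {"name": "Resident C", "zip": "20233", "birth_year": 1961, "sex": "F"},
    {"name": "Resident D", "zip": "20233", "birth_year": 1961, "sex": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def linkage_key(record):
    """Build the join key from the quasi-identifier fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(visits, roll):
    """Return {name: diagnosis} for visits that match exactly one named person."""
    names_by_key = {}
    for person in roll:
        names_by_key.setdefault(linkage_key(person), []).append(person["name"])

    matches = {}
    for visit in visits:
        candidates = names_by_key.get(linkage_key(visit), [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches[candidates[0]] = visit["diagnosis"]
    return matches


reidentified = reidentify(scrubbed_visits, public_roll)
print(reidentified)
```

In this toy data, two of the three scrubbed records match exactly one named person and are re-identified. The third is shared by two people and stays ambiguous, which is the intuition behind grouping records so that no combination of attributes is unique.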

Other techniques and technologies also have their weaknesses, she said. Data enclaves, centralized services favored by academia where users can work with sensitive research data, don’t scale well. Differential privacy, in which systems publicly share information on group patterns while withholding information on individuals in a dataset, waters down insights. And the security of multi-party computation, a subfield of cryptography that allows different parties to jointly compute on their data while keeping it private, hasn’t been fully vetted.
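The trade-off Parker describes for differential privacy can be seen in a small sketch of the Laplace mechanism, one common way to release a differentially private count. This is an illustrative assumption, not a description of any agency's system: the function names, the `epsilon` values, and the sample records are all hypothetical, and a production mechanism would need more careful noise generation than this.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a count with Laplace noise calibrated to privacy budget epsilon.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records: did each person in a small area test positive?
records = [{"positive": True}] * 3 + [{"positive": False}] * 7
rng = random.Random(0)  # seeded only so the sketch is repeatable


def is_positive(record):
    return record["positive"]


# Smaller epsilon means stronger privacy and a noisier answer.
for epsilon in (0.1, 1.0, 10.0):
    noisy = private_count(records, is_positive, epsilon, rng)
    print(f"epsilon={epsilon}: released count {noisy:.2f} (true count is 3)")
```

With a small `epsilon`, the released count can land far from the true value of 3, which is the "watered down" insight in practice. With a large `epsilon`, the answer is accurate but offers little protection to any individual.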

“Much more needs to be done to create scalable solutions that are not just a point solution for a particular data sharing goal, but an approach that can scale to more use cases,” Parker said. “So I close with a call to all of you across industry, academia and government: What we need is a better pathway forward for addressing data sharing hurdles more quickly and in the shorter term.”
