Monday, April 27, 2015

EuroCrypt 2015 invited talk by Tal Rabin: a privacy research roadmap


For some time now, privacy issues have been on the agenda of many actors, yet solutions remain fragmented, varied, and lacking the practical impact expected of them. The talk exposed this state of affairs and proposed several points that could better structure privacy research, its relation to technology, and its funding.

Benefits and doubts
A basic truth was recalled: computation on personal data is great for all sorts of areas, like healthcare, transportation, finance, the internet of things, national security, etc. Obviously, the counterpart is a proliferation of personal data in a dense fog of devices, clouds, networks and databases of all kinds. Indeed, each application field may come with its own databases and devices, which seems reasonable at first, but we never know how these will be combined, by whom, and where. Privacy, whatever it means, can easily get lost in this mixture. Three big questions can then guide our research into future privacy frameworks:
  • Do they say what they do, and do they do what they say?
  • Are they usable?
  • How do they relate to policy requirements (e.g. those of governments)?

Ingredients for science (both in industry and academia)
The first question is one of rigour. A privacy framework has to start by formulating the problems it tries to solve, in a way that is precise, clear and generic. Once we have a solution, we need arguments for why it solves the problem. What kind of arguments are acceptable as proof? Do we need to convince humans or, better, machines? A cloud or an app store might then check whether a virtual machine or app ships with a convincing proof of privacy, or whether the assumptions behind that proof no longer hold.
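To make that last idea a bit more tangible, here is a toy sketch of such a gate. It is entirely my own illustration: the names, the use of an HMAC from a trusted verifier as the "proof", and the sets of assumptions are all placeholders for whatever machine-checkable artefact a real framework would define.

    # Toy sketch (my illustration, not from the talk): an app-store gate that
    # accepts an app only if a trusted verifier has vouched for its privacy claim
    # and the assumptions behind that claim are still considered valid.
    import hmac, hashlib

    VERIFIER_KEY = b"shared key of a hypothetical proof-checking service"

    def attest(app_bytes: bytes, claim: str) -> bytes:
        """Verifier's side: check the actual proof offline (not shown), then sign."""
        return hmac.new(VERIFIER_KEY, app_bytes + claim.encode(), hashlib.sha256).digest()

    def store_accepts(app_bytes: bytes, claim: str, assumptions: set,
                      attestation: bytes, valid_assumptions: set) -> bool:
        expected = hmac.new(VERIFIER_KEY, app_bytes + claim.encode(), hashlib.sha256).digest()
        proof_ok = hmac.compare_digest(expected, attestation)
        assumptions_ok = assumptions <= valid_assumptions   # none invalidated since
        return proof_ok and assumptions_ok

    app = b"...app binary..."
    claim = "location data never leaves the device"
    cert = attest(app, claim)
    print(store_accepts(app, claim, {"OS sandbox is sound"}, cert,
                        {"OS sandbox is sound", "TLS >= 1.2"}))   # True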

Most current solutions are rather ad hoc. Not only do they fail to define the privacy problem they address, but their scope is also rather narrow, making them difficult to apply in other areas. The talk argued for generic solutions; these need to be designed at a higher level of abstraction, considering their context, their instantiation in applications (which can in turn expose new problems and refine the design), various attacker models, etc.

Don't forget the salt
What does it mean for a system's privacy features to be usable? Does it mean the user doesn't have to bother about privacy at all? That they have precise control over their data? That they pass custody of their data to a third party? What is certain is that usability should come into play not only when implementing a protocol, but earlier, at design time or even at specification time. Usability is really about the way in which we collect data, so it is very much related to privacy protocols, which arrange that data for computation; an interplay between the two would certainly be fruitful.

Out in the fields
Multiple disciplines should be brought together in order to link the technology with society in general, and with its privacy policies in particular. For example, research can make clear what is and is not possible in practice, so that policy requirements can take that into account. Other examples: formulating a trade-off between utility and privacy (which probably needs some economics and law) and designing a mechanism that implements it, as sketched below; linguistics to identify private data in a text; social science to see how much people care about privacy; etc.
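One well-studied way to make such a trade-off explicit is differential privacy, where a parameter epsilon directly trades accuracy against privacy loss. The talk did not prescribe any particular mechanism; the Laplace-noise sketch below is only my illustration of what a tunable utility/privacy dial can look like.

    # Minimal sketch (my illustration, not the talk's): a tunable utility/privacy
    # trade-off via the Laplace mechanism of differential privacy.
    import random

    def noisy_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
        """Release a count with Laplace noise of scale sensitivity/epsilon.

        Smaller epsilon means more noise: stronger privacy, lower utility.
        Choosing epsilon is exactly the kind of decision that needs input
        from economics, law and policy, not just from cryptography.
        """
        b = sensitivity / epsilon
        # A Laplace(0, b) sample is the difference of two exponentials with mean b.
        noise = random.expovariate(1 / b) - random.expovariate(1 / b)
        return true_count + noise

    # e.g. how many patients in a database have a given condition
    print(noisy_count(412, epsilon=0.1))   # very private, very noisy
    print(noisy_count(412, epsilon=5.0))   # less private, close to the true value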

Examples: indirection, composition, transparency
The last part of the talk contained some examples: first, of privacy issues in health care, transportation and the internet of things, which I guess are well known by now. Then we had an example of the indirect problems privacy is subject to: suppose one has a family member who searched for information about a hereditary disease; by putting a government database, a Google database and a bit of biology together, one could infer something about one's own genetic risk, so one's privacy is violated even though one never searched for anything oneself.
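The mechanism behind such examples is essentially a linkage attack: two sources that look harmless in isolation are joined on a shared identifier, and domain knowledge (here, the heritability of the disease) does the rest. A toy sketch with entirely made-up data and field names:

    # Toy linkage sketch (made-up data): neither table alone names who is at risk,
    # but joining them on a shared household identifier, plus the fact that the
    # disease is hereditary, says something about every member of the household.
    government_db = [   # hypothetical household registry
        {"household": "H17", "members": ["alice", "bob"]},
        {"household": "H42", "members": ["carol"]},
    ]
    search_log = [      # hypothetical search-provider log
        {"household": "H17", "query": "early symptoms of a hereditary disease"},
    ]

    for search in search_log:
        for row in government_db:
            if row["household"] == search["household"]:
                print(f"{row['members']} may be concerned by: {search['query']}")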

A cryptographic concept was mentioned as a positive example: privacy preservation under composition with other protocols. Yet even compositionality breaks down if there is no framework to sustain it: how can P1 | P2 preserve privacy when P2 simply leaks some private data? Even worse, the public output of P1 and the leaked private data of P2 may together reveal more information than P1 or P2 alone.
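A standard toy illustration of that last point (my example, not the talk's): each protocol publishes an aggregate that, on its own, hides any individual value, yet the two outputs combined pin one value down exactly.

    # Toy composition failure (my illustration): each output alone hides Alice's
    # salary among several people; the two outputs together reveal it exactly.
    salaries = {"alice": 58_000, "bob": 61_000, "carol": 55_000}

    p1_output = sum(salaries.values())                                # total over everyone
    p2_output = sum(v for k, v in salaries.items() if k != "alice")   # total over everyone but Alice

    print(p1_output - p2_output)   # 58000: Alice's exact salary, leaked by composing P1 and P2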

Transparency about the way in which data is handled was another interesting example: it might improve privacy, but the industry won't go very far with it, so perhaps we need new techniques that output evidence of appropriate handling of data without disclosing the actual algorithms.
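One very rough direction (my sketch, not something proposed in the talk) is to commit publicly to the data-handling code without revealing it, so that an auditor who is later shown the code can check it matches the commitment. A real solution would want something stronger, such as zero-knowledge proofs or audited logs; this only shows the commit-and-verify shape.

    # Rough commit-and-verify sketch (my own illustration). The public sees only
    # a hash commitment to the handling code; an auditor shown the code and the
    # nonce can verify that it is the code the commitment was made to.
    import hashlib, secrets

    def commit(code: bytes) -> tuple:
        nonce = secrets.token_bytes(16)                  # keeps the code unguessable
        digest = hashlib.sha256(nonce + code).digest()   # this is what gets published
        return digest, nonce

    def auditor_checks(digest: bytes, nonce: bytes, revealed_code: bytes) -> bool:
        return hashlib.sha256(nonce + revealed_code).digest() == digest

    pipeline = b"def handle(record): return anonymize(record)"
    public_commitment, nonce = commit(pipeline)
    print(auditor_checks(public_commitment, nonce, pipeline))   # True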

The government or the industry?
That was one of the questions debated at the end, and I guess we agreed both would play a role.
