## Tuesday, May 31, 2016

### Workshop on the Theory and Practice of Secure Multi-Party Computation 2016: Stable Matching

Suppose we have two groups of people, $A$ and $B$, and each person in group $A$ ranks every person in group $B$ in order of their preference for being partnered with, and vice versa. Is there a way of ‘optimally’ pairing each person in $A$ with a person in $B$? This was the focus of abhi shelat’s talk at the Workshop on the Theory and Practice of Secure Multi-Party Computation 2016 at the University of Aarhus, Denmark.

The problem is known as the Stable Marriage Problem and has a wide field of application. Perhaps surprisingly, it can be shown that optimal solutions always exist. In fact, David Gale and Lloyd Shapley came up with an algorithm which constructs such a matching (the achievement contributing in part to Shapley’s joint win of the Nobel Prize in Economics in 2012).
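The notion of ‘optimal’ usually meant for this problem is stability: no person in $A$ and person in $B$ would both rather be paired with each other than with their assigned partners. The following is a minimal Python sketch of that check, in the clear and with no privacy at all. The function name `blocking_pairs` and the list-of-lists encoding of preferences are assumptions made for illustration, not something taken from the talk.

```python
def blocking_pairs(pref_a, pref_b, match_a):
    """Return every (a, b) that would both prefer each other to their partners.

    pref_a[a] lists members of B in a's order of preference (best first),
    pref_b[b] lists members of A likewise, and match_a[a] is a's partner.
    An empty result means the matching is stable.
    """
    n = len(pref_a)
    # rank_a[a][b] is the position of b in a's list (lower is better).
    rank_a = [{b: r for r, b in enumerate(prefs)} for prefs in pref_a]
    rank_b = [{a: r for r, a in enumerate(prefs)} for prefs in pref_b]
    match_b = {b: a for a, b in enumerate(match_a)}

    pairs = []
    for a in range(n):
        for b in range(n):
            if b == match_a[a]:
                continue
            a_prefers_b = rank_a[a][b] < rank_a[a][match_a[a]]
            b_prefers_a = rank_b[b][a] < rank_b[b][match_b[b]]
            if a_prefers_b and b_prefers_a:
                pairs.append((a, b))
    return pairs
```

For example, if everyone ranks person 0 first, pairing 0 with 0 and 1 with 1 is stable, whereas swapping the partners leaves the two people numbered 0 wanting to defect to each other.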

There are certain cases where it would be useful for the preferences made by each party to be kept secret. The application to the world of secure MPC then becomes clear. We were provided with two motivating examples.
• Matching prospective medical residents with hospitals.
• Matching women to sorority houses.

In these two cases, the data should be kept private. The latter example is based on a real-world instance of the problem. To avoid awkward social situations in which sorority houses received women whom they had not preferred, it transpired that one university had exported all of the preference lists to an impartial third party in Texas, who could sort through the data for them and make the assignment obliviously.

To perform the Gale-Shapley algorithm, we need arrays of total size $O(n^2)$ holding all of the preferences (where $n$ is the number of participants), and also an array holding the current matching. Additionally, the loops in the algorithm need $O(n^2)$ time. As such, using garbled circuits turns out to be quite inefficient. The use of oblivious RAM (ORAM - you can think of this as something that interfaces between the CPU and physical RAM in a way that hides the precise data access pattern) enables a straightforward implementation, but again, not an efficient one. In an attempt to improve on this, abhi showed how to speed up the algorithm by a factor of between 40 and 100.

First, we remove the necessity for ORAM arrays, which are used for: (1) checking the preferences, and (2) finding the next unmatched element. Analysing the algorithm and searching for optimisations allows (2) to be done through the use of a queue, which saves a lot in the implementation. The main optimisation, however, comes from a cheap way of performing (1), which in the original algorithm requires $O(n^2)$ space.
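To make the two array uses concrete, here is a minimal sketch of the textbook Gale-Shapley algorithm in plain Python, with a queue holding the unmatched proposers. It runs entirely in the clear and is not the secure protocol from the talk. The names `gale_shapley`, `rank_b` and `next_choice` are assumptions chosen for illustration.

```python
from collections import deque


def gale_shapley(pref_a, pref_b):
    """Plaintext Gale-Shapley with members of A proposing.

    pref_a[a] lists members of B in a's order of preference (best first),
    pref_b[b] lists members of A likewise. Returns match_a, where
    match_a[a] is the member of B assigned to a.
    """
    n = len(pref_a)

    # (1) Checking preferences: rank_b[b][a] is a's position in b's list.
    # This is the O(n^2) table, and it is only ever read after this point.
    rank_b = [[0] * n for _ in range(n)]
    for b, prefs in enumerate(pref_b):
        for r, a in enumerate(prefs):
            rank_b[b][a] = r

    next_choice = [0] * n      # each a walks its own list strictly in order
    partner_of_b = [None] * n  # the current matching
    free = deque(range(n))     # (2) queue of unmatched proposers

    while free:
        a = free.popleft()
        b = pref_a[a][next_choice[a]]
        next_choice[a] += 1
        current = partner_of_b[b]
        if current is None:
            partner_of_b[b] = a
        elif rank_b[b][a] < rank_b[b][current]:
            partner_of_b[b] = a   # b trades up and the old partner is freed
            free.append(current)
        else:
            free.append(a)        # b rejects a, who will try the next choice

    match_a = [None] * n
    for b, a in enumerate(partner_of_b):
        match_a[a] = b
    return match_a
```

Note that the preference tables are never written to inside the loop, and each proposer reads its own list front to back. The secure version has to hide which entries are touched, which this sketch makes no attempt to do.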

The contribution of the presented work consists of three particular insights leading to significant reductions in complexity of the algorithms:
1. Operations on the matrix of preferences are only ever ‘read-only’ instructions. The ORAM data structure contains extra space in case of rewriting - this is not needed! Noticing this means we can asymptotically reduce the complexity of searching through these preferences.
2. The preferences are only ever read in a specific order. Noticing this allows us to avoid the recursive lookups into the smaller ORAMs containing the preferences, which are expensive. This observation was made by Craig Gentry for more general data structures.
3. Asymptotically better initialisation of the data can be achieved. This is through the use of what they define as oblivious multi-lists, which, roughly, are (obliviously) sorted lists of preferences of each pair in $A\times B$. This multi-list allows a much faster matching to be made, and the cost is: $\Theta(n^2 \log^2 n)$ for $n$ sorts on $n$ elements, and then $\Theta(n^2 \log n)$ for the random permutation on $n^2$ elements.
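The two costs in the third point can be unpacked as follows. This is a sketch under assumptions not stated in the talk summary: each oblivious sort of $n$ elements is taken to cost $\Theta(n \log^2 n)$ (as with a Batcher-style sorting network), and an oblivious permutation of $N$ elements is taken to cost $\Theta(N \log N)$.

$$
\underbrace{n \cdot \Theta(n \log^2 n)}_{n \text{ sorts of } n \text{ elements}} = \Theta(n^2 \log^2 n),
\qquad
\underbrace{\Theta(N \log N)\,\big|_{N = n^2}}_{\text{permutation of } n^2 \text{ elements}} = \Theta(n^2 \cdot 2 \log n) = \Theta(n^2 \log n).
$$

Under these assumptions the sorting term dominates, so the initialisation costs $\Theta(n^2 \log^2 n)$ overall.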

These optimisations mean that the matching takes less than 2 minutes for 600 participants in the sorority house example given above, which is at least 40 times faster than a straightforward ORAM application.

In practice, the sizes of the sets $A$ and $B$ will not often be the same. This generalised problem was studied by Roth and Peranson and somewhat complicates things, but can still be dealt with. It was suggested that this could be an interesting avenue for further research.