## Friday, January 30, 2015

### Discrete Gaussian Sampling and the Quest for the Shortest Vector

The tl;dr:
When it comes to short vectors in lattices
The state-of-the-art apparatus is
Discrete Gaussian sampling
For use (for example) in
Reduced time and space cryptanalysis...
This week’s study group, presented by Joop, was on Solving the Shortest Vector Problem in 2n Time via Discrete Gaussian Sampling, a 2014 pre-print release by Aggarwal, Dadush, Regev and Stephens-Davidowitz. The lattice, to my regret, is not a creature I frequently encounter in the course of my own research, so I report here from a considerably less-than-expert perspective. But hopefully, if I stick to the high-level details then all shall be well.

The motivating theme of the paper is the Shortest Vector Problem (SVP) — an important computational problem on lattices: given a basis for a lattice, find a non-zero vector in the lattice of minimum Euclidean norm. Many cryptographic primitives (notably including breakthrough developments in fully homomorphic encryption) derive their security from the worst-case hardness of SVP and related problems.

Previous algorithms to find exact or approximate solutions to SVP fall into three main classes: enumeration, whereby basis reduction is combined with exhaustive search inside Euclidean balls (time ranging from 2O(n log n) to 2O(n2), polynomial space); sieving, whereby randomly sampled vectors are iteratively combined to generate increasingly short vectors (2cn time (c>2), exponential space);  and recent proposals using the Voronoi cell of a lattice -- the real region in which all points lie closer to the zero vector than to any other (22n time, exponential space).

The intuition behind Discrete Gaussian Sampling (DGS) is to draw from a (narrow) distribution of lattice points centred around zero. As the width of the distribution (characterised by the parameter s) decreases, the samples become more and more concentrated on short vectors; sufficiently many samples for an arbitrary s will eventually land on a solution to SVP. A simple idea but, of course, far from simple in practice. In particular, for any given lattice there is a smoothing parameter s* -- informally, the 'tipping point' at which the distribution begins to 'look like' a continuous Gaussian rather than a stack around the central point. Efficient algorithms are known for s much larger than s*, but these are inadequate to solve exact lattice problems. A key contribution of Aggarwal et al. is an algorithm that samples for any parameter s > 0; it returns 2n/2 independent discrete Gaussian distributed vectors using 2n+o(n) time and space. They add to this a second algorithm for "above smoothing" sampling with a substantially improved time and space complexity (2n/2 -- the current record for fastest provable running time of a hard lattice problem), and show that it can be used to approximate decision SVP to within a factor of 1.93.

Here's where I muster all my expertise-less enthusiasm and try to explain the (high level) details of the algorithms without saying anything too stupid…

Both algorithms operate by iteratively combining vectors drawn from carefully chosen high parameter distributions (from which it is well-known how to sample efficiently) in such a way that the dispersion parameter of the distribution of the combined vectors is progressively lowered. As an aid to intuition, consider, for a moment, the counterpart continuous case: the distribution of a (continuous) Gaussian-distributed vector divided by two is similarly Gaussian distributed with half the width. In the discrete case, though, dividing by two generally does not produce another vector in the lattice. A 'fix' would be to sample from L with parameter s', keep those that were also in 2L, and divide them by 2 -- but the "loss factor" of doing so can be high (the probability that a sample from L is also in 2L can be as small as 2-n), so one needs to be cleverer.

The 2n-time below-smoothing sampler looks for pairs of vectors sampled from lattice L with parameter s whose sum is in 2L (equivalently: vectors in the same coset mod 2L). The 'greedy' combiner which pairs as many as possible in each coset-related bucket would have a loss factor of just 2 in reducing from the sampled vectors to the shorter combined vectors. However, below the smoothing parameter the required distributional properties are lost in the combining step. The workaround to produce combined vectors with the correct distribution is very involved; I caught something about coin-flips and Poisson distributions, but the sterling efforts of our presenter to render comprehensible the technical details did not take sufficient root in my brain for me to attempt the same here -- I refer you to the paper at this point! From an input sample of order 2n, combining multiple times according to the method they devise reduces s as required with a total loss factor of 2n/2, thus outputting on the order of 2n/2 sampled vectors.

The 2n/2-time above-smoothing sampler is not just a tweak on the first algorithm but a surprisingly different approach supplied with its own set of clever tricks and insights. The intuitive idea is to construct a ‘tower’ of increasingly sparse lattices [L0,…,Lm] where Lm = L, the lattice one wishes to sample from. Each Li is chosen to ‘lie between’ Li+1 and 2Li+1 — that is, it is a (strict) sublattice of Li+1 which contains 2Li+1, i.e. Li+1Li ⊆ 2Li+1. Because L0 is dense, one can sample ‘easily’ from it with a small parameter (2-m/2s, in fact), and combine (by summing) over cosets of L1 to get samples over the sparser lattice 2L1 with a slightly increased parameter 2-(m-1)/2s. Repeating this process eventually produces samples from Lm = L with distribution statistically close to the discrete Gaussian with parameter s, provided s is above the smoothing parameter. They show that this method is able to approximate decision SVP to within a factor of 1.93.

That’s all we had time for in our session — and already, I admit, somewhat more than I have the capacity to fully absorb without some serious preliminary study! (Although, the bits I was able to follow, I found very interesting). The rest of the paper, as well as containing all the ‘proper maths’ (including the formal reduction from SVP to DGS), covers applications, adaptations to the closest vector and other related problems, and relevant points for discussion, all in some depth. Despite the clear theoretical value of their contribution, the authors are circumspect about the practical usages of their algorithms relative to enumeration and sieving methods. These latter perform well heuristically in relevant scenarios, whilst the run-time bounds of DGS are tight in practice, with no trade-off available even if one wants to sample fewer than 2n/2 vectors.