This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know' to do Cryptography: a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. In this blog post we discuss the Pollard rho, Pollard "kangaroo" and parallel Pollard rho attacks on ECDLP.
Our aim is to solve the discrete logarithm problem, h = gx
for any cyclic finite abelian group G. Thus, assuming that we have a cyclic group G = ⟨g⟩, which has prime order p, we want to find the
value of x modulo p such that h = gx when we were also given an h ∈
G. The problem
with the Baby-Step/Giant-Step method is that although its run time complexity is O(√p), it also requires O(√p)
space. Hence, we are interested
in replacing the large space requirement for a smaller space requirement, but maintain
a time complexity of O(√p). This task can be achieved with
the following algorithms. [1]
1. Pollard’s
Rho Algorithm.
Let f : S → S be a random
mapping between a set S and itself, n is the size of S. For a random value
x0 ∈ S we compute xi+1
= f(xi) for i ≥ 0. Each step xi+1 = f(xi) is a
deterministic function of the current position xi. The values x0,
x1, x2, . . . are considered as a deterministic random
walk.
Since S is
finite we will eventually obtain xi = xj thus xi+1 =
f(xi) = f(xj) = xj+1. Hence, the
sequence x0, x1, x2, . . . , will eventually
become cyclic (“pho” shape: ρ). Our goal is to find a collision
in a random mapping like the one above, which means to find 2 values xi
and xj with i≠j such
that xi =xj.
To find a
collision we use Floyd’s cycle finding algorithm: Given (x1,x2)
we compute (x2,x4), then (x3,x6)
and so on, i.e. given the pair (xi, x2i) we compute (xi+1,x2i+2)
= (f(xi),f(f(x2i))) and we stop when we find xm
= x2m. It is m=O(√ n).
For the discrete logarithm problem we partition group S into three sets S1,S2,S3. We assume that 1 $ \in $ S2, and define the following random walk on the group G, following random walk on the group G: xi+1 =f(xi)=h·xi when xi ∈S1, xi+1 =f(xi)=x2i when xi ∈S2, xi+1 =f(xi)=g·xi when xi ∈S3. We actually keep track of (xi, αi, bi) where αi+1 = αi when xi ∈ S1, αi+1 = 2αi (modn) when xi ∈ S2, αi+1 = αi+1(modn) when xi ∈ S3, and bi+1 = bi+1 (modn) when xi ∈ S1, bi+1 = 2bi (modn) xi ∈ S2, bi+1 = bi when xi ∈ S3.
Starting with the triple (x0,α0,b0) = (1,0,0), then for all i we have logg(xi) = αi + bi logg(h) = αi + bix. Applying Floyd’s algorithm we are able to obtain a collision, thus find a value of m such that xm = x2m. This means that am + bmx = a2m + b2mx or (bm − b2m)x = a2m − am and if bm $ \neq $ b2m, we obtain x = $ \frac{a_{2m} - a_m}{b_m - b_{2m}} (mod n) $
Assuming that the sequence x0,x1,x2,... is produced by a random mapping from G to itself, then the above algorithm will find the discrete logarithm in the expected time O(√ n).
2)
Pollard’s Kangaroo Method.
Pollard’s Kangaroo
method is like the Rho method but it is particularly tuned to the situation
where we know that the discrete logarithm lies in a certain interval x ∈ [a,...,b].
Let w = b − a be the length of the interval in which the discrete logarithm x is known to
lie. We define a set S = {s0,...,sk−1} of integers in non-decreasing order and its mean m should be around N =√w. We usually choose si = 2i for 0 ≤ i < k (thus the mean of the set is m = $ \frac{2^k}{k}$) and also k ≈ $ \frac{1}{2}$ log2(w). The group is divided up to k sets Si, for i = 0, . . . , k − 1. We then define the deterministic
random walk: xi+1=xi·gsj if xi∈Sj.
We compute the deterministic random walk, starting from g0 = gb, by setting gi = gi−1 · gsj for i=1,...,N. We also set c0 =b and ci+1 =ci+sj (mod q). We store gN and notice that we have computed the discrete logarithm of gN with respect to g, which is cN =logg(gN).
Now we have to compute the second deterministic random walk starting from the unknown point in the interval x. We set h0 = h = gx and compute h i+1 = hi · gs′j . We also set d0 = 0 and di+1 = di +s′j (mod q). Notice that we have logg(hi) = x + di.
Hence, if the path of the hi meets the path of the gi then hi will carry on the path of the gi. We will then be able to find a value M where hM equals our stored point gN .
Thus, we will have cN = logg(gN) = logg(hM) = x+dM, and the solution to our discrete logarithm problem is given by x = cN − dM (mod q).
If we do not get a collision then we can increase N and continue both walks in a similar manner until a collision does occur. The expected running time of this method is √w and the storage can be seen to be constant.
Now we have to compute the second deterministic random walk starting from the unknown point in the interval x. We set h0 = h = gx and compute h i+1 = hi · gs′j . We also set d0 = 0 and di+1 = di +s′j (mod q). Notice that we have logg(hi) = x + di.
Hence, if the path of the hi meets the path of the gi then hi will carry on the path of the gi. We will then be able to find a value M where hM equals our stored point gN .
Thus, we will have cN = logg(gN) = logg(hM) = x+dM, and the solution to our discrete logarithm problem is given by x = cN − dM (mod q).
If we do not get a collision then we can increase N and continue both walks in a similar manner until a collision does occur. The expected running time of this method is √w and the storage can be seen to be constant.
3) Parallel
Pollard’s Rho Method.
When we use random walk based techniques for solving discrete logarithm problems we often use a parallel Pollard's version. Assuming that we are given the discrete logarithm problem h = gx in a group G of prime order q, we first decide on an easily computable function H : G → {1 , . . . , k} (k is usually around 20) and then we define a set of multipliers mi. These are produced by generating random integers ai, bi ∈ [0, . . . , q − 1] and then setting
mi=gaihbi.
To start the deterministic random walk we randomly pick s0, t0 ∈ [0, . . . , q − 1] and compute g0 =gs0ht0. The deterministic random walk is then defined on the triples (gi,si,ti) where gi+1 = gi · mH(gi), si+1 = si + aH(gi) (mod q), ti+1 = ti + bH(gi) (mod q).
Hence, for every gi we record the values of si and ti such that gi =gsihti.
If we assume that we have m processors, then each processor can start a different deterministic random walk from a different starting position using the same algorithm in order to determine the next element in the walk. When two processors (or the same processor) meet an element of the group that has been seen before, then we obtain the equation gsi hti = gs′j ht′j from which for the discrete logarithm x can be solved.
We expect that after O($\sqrt{πq/2}$/m) iterations of these parallel walks, a collision will be found and the discrete logarithm problem will be solved. However, this means that each processor needs to return every element in its computed deterministic random walk to a central server which then stores all the computed elements. This is highly inefficient due to large storage requirements, namely O($\sqrt{πq/2}$).
Moreover the storage can be reduced to any required value as follows: We define a function d on the group, d : G → {0, 1} such that d(g) = 1 around 1/2t of the time. The function d is often defined by returning d(g) = 1 if a certain subset of t of the bits representing g are set to zero for example. The elements in G for which d(g) = 1 will be called distinguished.
To start the deterministic random walk we randomly pick s0, t0 ∈ [0, . . . , q − 1] and compute g0 =gs0ht0. The deterministic random walk is then defined on the triples (gi,si,ti) where gi+1 = gi · mH(gi), si+1 = si + aH(gi) (mod q), ti+1 = ti + bH(gi) (mod q).
Hence, for every gi we record the values of si and ti such that gi =gsihti.
If we assume that we have m processors, then each processor can start a different deterministic random walk from a different starting position using the same algorithm in order to determine the next element in the walk. When two processors (or the same processor) meet an element of the group that has been seen before, then we obtain the equation gsi hti = gs′j ht′j from which for the discrete logarithm x can be solved.
We expect that after O($\sqrt{πq/2}$/m) iterations of these parallel walks, a collision will be found and the discrete logarithm problem will be solved. However, this means that each processor needs to return every element in its computed deterministic random walk to a central server which then stores all the computed elements. This is highly inefficient due to large storage requirements, namely O($\sqrt{πq/2}$).
Moreover the storage can be reduced to any required value as follows: We define a function d on the group, d : G → {0, 1} such that d(g) = 1 around 1/2t of the time. The function d is often defined by returning d(g) = 1 if a certain subset of t of the bits representing g are set to zero for example. The elements in G for which d(g) = 1 will be called distinguished.
It is only the distinguished group elements which are now transmitted back to the central server, which means that we expect the deterministic random walks to continue another 2t steps
before a collision is detected between two deterministic random walks. Hence, the computing time
now becomes O($\sqrt{πq/2}$/m+2t) and storage becomes O($\sqrt{πq/2}$/2t). Thus, storage can be reduced to any manageable amount, at the expense of a little extra
computation.
[1] http://www.cs.bris.ac.uk/~nigel/Crypto_Book/book.ps (pages 208 - 214)
[1] http://www.cs.bris.ac.uk/~nigel/Crypto_Book/book.ps (pages 208 - 214)
I'm really interested to read this but there seems to be a few problems rendering the LaTeX - could you check the script please?
ReplyDelete