## Wednesday, March 25, 2015

### 52 Things: Number 25: Methods for modular reduction using "special" primes that define $GF(p)$ and $GF(2^n)$

This is the latest in a series of blog posts addressing the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. Continuing the section on implementation details, we discuss one way of implementing modular reduction efficiently: choosing a modulus of a "special" form.

As we have seen in previous posts, modular reduction is one of the most frequently invoked operations when implementing cryptographic schemes. Unfortunately, despite its heavy use, reduction cannot be performed as cheaply as other arithmetic operations such as addition and multiplication. Montgomery representation provides one solution to this problem; here we discuss another, Pseudo-Mersenne prime reduction.

Definition: A prime number $p$ is called a Pseudo-Mersenne prime if $p$ can be written in the form

$$p = b^n - c$$

where

$$0 < |c| < 2^{\lfloor n/2 \rfloor}$$

In practice, $b$ is almost always $2$, and $c$ is chosen to fit within a machine word, usually $32$ or $64$ bits.

It follows directly from the definition that

$$p \equiv b^n - c \equiv 0 \pmod{p}$$

$$b^n \equiv c \pmod{p}$$

Now take $b = 2$ and let $z$ be an integer of $k$ bits. Write $z'$ for its least significant $n$ bits and $z''$ for its most significant $k-n$ bits, i.e. $z = z''2^n + z'$. We can then rewrite $z \bmod p$ as:

$$z \equiv z''2^n + z' \equiv z''c + z' \pmod{p}$$

Applying this identity repeatedly until the result has at most $n$ bits, followed if necessary by a final subtraction of $p$, yields the representative of $z$ in $\mathbb{Z}_p$.
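As a small illustration, with numbers chosen here for convenience rather than taken from any real scheme, let $p = 2^8 - 5 = 251$, so $n = 8$ and $c = 5$, and reduce $z = 100000$. Two folding steps are enough:

$$\begin{aligned}
100000 &= 390 \cdot 2^8 + 160 \equiv 390 \cdot 5 + 160 = 2110 \pmod{251}\\
2110 &= 8 \cdot 2^8 + 62 \equiv 8 \cdot 5 + 62 = 102 \pmod{251}
\end{aligned}$$

Since $102 < 251$, no final subtraction is needed, and indeed $100000 = 398 \cdot 251 + 102$.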

Several points are worth mentioning here:
1. Both $z'$ and $z''$ can be computed efficiently by simply shifting and masking $z$.
2. Since $c$ fits within a machine word, the multiplication $z''c$ can be done very efficiently.
3. Each iteration takes a $k$-bit input to a value of at most $\max(k-n+w, n) + 1$ bits, where $w$ is the word size, so the length drops by roughly $n - w$ bits per step.

In general, then, reduction can be done very efficiently when the modulus is a Pseudo-Mersenne prime, as it only requires shifts, additions and multiplications by a small constant.
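The folding procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the function name `pseudo_mersenne_reduce` is hypothetical, it assumes $b = 2$ and a positive $c$, and it relies on Python's arbitrary-precision integers instead of the word-level arithmetic a real library would use.

```python
def pseudo_mersenne_reduce(z: int, n: int, c: int) -> int:
    """Reduce a non-negative integer z modulo p = 2**n - c.

    Assumes b = 2 and 0 < c < 2**(n // 2), as in the definition above.
    """
    p = (1 << n) - c
    mask = (1 << n) - 1
    # Fold the high part into the low part using 2**n = c (mod p).
    # Each pass strictly decreases z because c < 2**n.
    while z >> n:
        z = (z >> n) * c + (z & mask)
    # z now fits in n bits but may still lie in the range [p, 2**n).
    if z >= p:
        z -= p
    return z
```

For example, with the toy modulus $p = 2^8 - 5 = 251$, `pseudo_mersenne_reduce(1000, 8, 5)` returns `247`, which matches `1000 % 251`. Note that this sketch is not constant time: the number of loop iterations depends on the size of the input.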

Nevertheless, the drawback of this method is also clear: such an implementation usually requires all parties to use a fixed setup, which can lead to both interoperability and security problems.

$GF(2^n)$ is another field in common use in cryptography. Here the modulus is most often a trinomial or a pentanomial. We show how a trinomial simplifies the reduction; the same technique applies to pentanomials in a straightforward way.

The idea is very similar to the prime field case. Suppose the modulus is a trinomial $f(x)$ such that

$$f(x) = x^n + x^t + 1$$

where $0 < t < n/2$.

We immediately have

$$x^n \equiv x^t + 1 \pmod{f(x)}$$

Given a polynomial $z(x)$ of degree at least $n$, we rewrite $z(x)$ as

$$z(x) = z''(x) x^n + z'(x)$$

where $z'(x)$ is the polynomial given by the least significant $n$ bits of $z(x)$ (the coefficients of $x^0, \ldots, x^{n-1}$) and $z''(x)$ is given by the remaining bits.

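Then, just as we did for $GF(p)$, we compute the reduction by

$$\begin{aligned}
z(x) = z''(x) x^n + z'(x) &\equiv z''(x) (x^t + 1) + z'(x)\\
&\equiv z''(x) x^t + z''(x) + z'(x) \pmod{f(x)}
\end{aligned}$$

This can be done very efficiently because $t$ is "small": each step lowers the degree by at least $n - t > n/2$, so a product of two field elements (degree at most $2n-2$) is fully reduced after two steps.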
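A minimal Python sketch of this trinomial folding follows. It is an illustration under stated assumptions, not code from [2]: the name `trinomial_reduce` is hypothetical, and a polynomial over $GF(2)$ is stored as a Python integer whose bit $i$ is the coefficient of $x^i$, so that addition becomes XOR and multiplication by $x^t$ becomes a left shift by $t$.

```python
def trinomial_reduce(z: int, n: int, t: int) -> int:
    """Reduce the GF(2) polynomial z modulo f(x) = x**n + x**t + 1.

    Bit i of z is the coefficient of x**i. Requires 0 < t < n.
    """
    mask = (1 << n) - 1
    # While z has degree >= n, split it and fold the high part down
    # using x**n = x**t + 1 (mod f).
    while z >> n:
        hi = z >> n      # z''(x)
        lo = z & mask    # z'(x)
        z = (hi << t) ^ hi ^ lo
    return z
```

For example, with $f(x) = x^7 + x + 1$, `trinomial_reduce(1 << 7, 7, 1)` returns `0b11`, i.e. $x^7 \equiv x + 1$.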

[2] also describes another optimisation of the standard reduction. Consider the standard procedure for reducing $z(x)$ of degree $m$ modulo $f(x)$ of degree $n$:

$$z(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x^1 + a_0 x^0$$

$$f(x) = x^n + x^t + 1$$

When we reduce the term $a_ix^{i}$ with $i \geq n$, there are two cases:
• If $a_{i} = 0$, the remainder is unchanged, or
• If $a_{i} = 1$, then $1$ is added to the coefficients aligned with the terms $x^t$ and $1$ of $f(x)$, namely $a_{i - n + t}$ and $a_{i-n}$.

Since adding $0$ does not change the remainder, these two cases can be merged into one: always add $a_i$ (modulo $2$) to both positions. The standard reduction procedure can therefore be written as:

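INPUT: $z(x)$ of degree $m \geq n$, with coefficients $a_m, \ldots, a_0$
OUTPUT: $z(x) \bmod f(x)$, held in the coefficients $a_{n-1}, \ldots, a_0$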
1. for $i = m$ to $n$ by $-1$
2. {
3.     $a_{i - n + t} += a_i$
4.     $a_{i - n} += a_i$
5. }
6. return $z(x)$

The advantage of this algorithm is not obvious from a software perspective; however, it can be implemented very efficiently in hardware, as it simply updates $z(x)$ in place and requires no extra storage.

Another advantage is that the procedure only requires $0 < t < n$ and runs in constant time, since the number of loop iterations depends only on $m$ and $n$ and not on the coefficient values.
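For readers who prefer code, the loop above can be transcribed almost literally into Python. This is only a sketch: the function name `reduce_in_place` and the list-of-bits representation are assumptions made here for illustration, and clearing `a[i]` after use is an extra tidying step that the pseudocode does not include.

```python
def reduce_in_place(a: list, n: int, t: int) -> list:
    """Bit-serial reduction modulo f(x) = x**n + x**t + 1 over GF(2).

    a[i] is the coefficient (0 or 1) of x**i. Requires 0 < t < n.
    The list is updated in place; the low n coefficients are returned.
    """
    for i in range(len(a) - 1, n - 1, -1):
        # Adding a[i] unconditionally covers both cases (a[i] = 0 or 1);
        # addition in GF(2) is XOR.
        a[i - n + t] ^= a[i]
        a[i - n] ^= a[i]
        a[i] = 0  # not in the pseudocode: clear the coefficient just folded
    return a[:n]
```

For example, reducing $x^{13}$ modulo $x^7 + x + 1$ with `reduce_in_place([0] * 13 + [1], 7, 1)` gives `[1, 1, 0, 0, 0, 0, 1]`, i.e. $x^6 + x + 1$. Plain Python makes no timing guarantees, so the constant-time property refers to the structure of the loop rather than to this particular sketch.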

[1] Menezes, Alfred J., Paul C. van Oorschot, and Scott A. Vanstone. Handbook of Applied Cryptography. CRC Press, 1996.

[2] Blake, Ian F., Gadiel Seroussi, and Nigel Smart. Elliptic Curves in Cryptography. Vol. 265. Cambridge University Press, 1999.