*This is the latest in a series of blog posts to address the list of*

*'52 Things Every PhD Student Should Know'*

*to do Cryptography: a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. This next blog continues the topic of 'Cryptographic Implementation Details'.*

In this post I will aim to compliment
what we saw last week regarding the more theoretical aspects of Montgomery
arithmetic with a practical implementation of it. The implementation is written
in C and written for a computer with a word size of 64 bits. The moduls

You will remember from the last
blog post, that four steps were given to the algorithm (please see post for a
full description of the algorithm if this is hazy). For the purposes of our implementation,
we will eamine each of these stages separately.*m*can therefore be as large as**2**and^{64 }– 1**a**and**b**can be as large as**m – 1**. We will take**r = 2**. As in the previous post, most of what is given here is derived from [1] so please refer to this for more information.^{64}**1. The GCD Operation**

**r**and

^{-1}**m’**such that

**rr**. These integers are required for the subsequent steps of the Montgomery reduction. The algorithm takes

^{-1}= 1 + mm’**r**and

**m**and pointers to values

**r**and

^{-1}**m’**which the algorithm derives. This is done using the

**extended gcd algorithm**which could be implemented in a number of ways (the purpose of which this blog is not about) and I refer the reader to [1] or [2] for detailed descriptions of it.

**2. Transform the Multipliers**

**abar = ar mod m**and

**bbar = br mod m**. As

**r = 2**, no multiplication is required but only a shifting by 64 bits (due to our selection of

^{64}**r = 2**), giving a 128 bit output with 64 0s tailing the value of

^{64}**a**or

**b**. The remainder of the division by

**m**is then given as abar or bbar. A function which takes the high 64 bits (

**x**) and the low 64 bits (

**y**) and the value for m (

**z**) and returns the 64 bit result could be written to do this. Such a function could be defined as follows:

*uint64 modul64(uint64 x, uint64 y, uint64 z);*

Where uint64 is a type defined as follows:

*typedef unsigned long long uint64;*

**3. Montgomery Multiplication**

**abar**,

**bbar**,

**m**and

**mprine**and returns a 64 bit value for the output of the Montgomery multiplication step.

The first sub-stage to of the function is to calculate

**t = abar*bbar**. This is done by multiplying

**abar**and

**bbar**together to give a 128 bit product. An additional function could be written to do this which takes

**abar**and

**bbar**and returns two 64bit values, one with the high 64 bits (

**thi**) and one with the low 64 bits (

**tlow**).

The next stage is the computation of

**u = (t + (tm’ mod r)m)/r.**Here

**t**is a 128 bit integer and

**m’**is a 64 bit integer. As we mod by r, only the lower 64 bits of tm’ are required, meaning that we can disregard the top 64 bits of

**t**.

*tm = tlo*mprime; // Disregard thi*

*mulul64(tm, m, &tmmhi, &tmmlo); // Function which performs 128 bit multiplication*

The subsequent multiplication by m gives an output of 128 bits and the
addition of

**t**an output of 129 bits, which can be carried out as 128bit + 128bit = 128bit output and compute the carry separately. In C:*ulo = tlo + tmmlo;*

*uhi = thi + tmmhi;*

*if (ulo < tlo) uhi = uhi +1; // test for overflow from ulo and add if necessary to uhi*

*ov = (uhi < thi) | ((uhi == thi) & (ulo < tlo)); // check for carry*

The last step is the calculation
of

**if(u >=m) then return u – m; else return u**. This is shown below:*ulo = uhi;*

*uhi = 0;*

*if(ov > 0 || ulo >= m) // test if there was overflow or ulo is higher that*

*ulo = ulo – m;*

*return ulo;*

**4. The Inverse Transformation**

**ur**mod m which is the product of

^{-1}**a and b modulo m**. This could be done by calling the following functions:

*mulul64(p, rinv, &phi, &plo); // performs multiplication and returns two 64 bit values phi and plo*

*p = modul64(phi, plo, m); // returns value of 128bit input mod m*

This gives the output

**p**which is the 64 bit output equal to ab mod m and the Montgomery reduction is complete.[1] http://www.hackersdelight.org/MontgomeryMultiplication.pdf

[2] http://www.ucl.ac.uk/~ucahcjm/combopt/ext_gcd_python_programs.pdf

## No comments:

## Post a Comment