Friday, March 23, 2012

Study Group: Integrated Payment Environments

Today's study group was on the recent European Commission Green Paper: "Towards an integrated European market for card, Internet and mobile payments" [link], presented by Stefan and Phil.

The paper highlights the increasing movement towards e-Commerce and the growing extent to which EU citizens and businesses operate outside their country of origin. Already, the Single Euro Payments Area (SEPA) initiative is working to remove the distinction between cross-border and domestic retail payments in Euro across the EU. At the same time, much effort is being devoted to innovative payment methods for use online and via smart phones; examples include e-wallets and virtual public transport tickets. There is a pressing need -- and opportunity -- to reach a useful, practical and integrated notion of "making a payment" which will consistently serve the interests of consumers and businesses alike in the Europe-wide economy.

The particular drivers behind such an integrated payment environment are:
  • Increased competition, stimulating more cost-effective payment solutions and mitigating market dominance by "the two existing international card schemes" (i.e. Visa and MasterCard).
  • Increased choice and transparency for consumers: currently, costs are indirectly passed down to payment users, and they do not have the requisite information to choose the most efficient instruments.
  • Stimulus for innovation through market forces (economies of scale, etc).
  • Improved security and consumer trust.

The stated objective of the Green Paper is to "...launch a broad-scale consultation process with stakeholders to validate or contribute to the Commission’s analysis and to help identify the right way to improve market integration". With that in mind, it sets out some of the particular challenges identified in current payment solutions.
  • The market is extremely fragmented, because local, bespoke solutions were developed before integration was identified as a priority.
  • The costs of different payment methods are hard to understand, both for merchants and customers; they are often bundled together in a way which is not transparent. Consumers are not currently being steered towards preferred (i.e. more cost-effective) payment services.
  • Merchants (e.g. Amazon) need to use different banks (with different procedures) to process payments from different geographical areas.
  • New providers are prevented from entering the market by the difficulty of communicating with existing providers and the costs of certification -- which must be done separately for each country because of localised standardisation. (Centralised certification would make it much cheaper and quicker to roll out new payment solutions).
  • Banks have commercial incentives not to communicate. This leads to multiple solutions to the same problems, which confuse customers and compromise security (consequently deterring use).

From a slightly different perspective, the proliferation of illegal websites and stores is another problem which could be tackled via an integrated payment environment, since refusing them access to the payment system would cut off their money supply. This could be a useful mechanism for bringing down such operations, although the Commission also recognises the potential for such powers to be abused.

We then looked at a particular card-not-present authentication scheme (widely used for online transactions) which has already received criticism in a 2010 paper by Murdoch and Anderson, entitled "Verified by Visa and MasterCard SecureCode: or, How Not to Design Authentication" [pdf]. Some of the identified weaknesses of the 3-D Secure protocol are:
  1. It confuses the user by initiating a sensitive dialogue in an i-frame where the https assurances (which we are all well-trained to look out for) are not communicated. Some banks even request a PIN for authentication -- which a security-conscious user should be very wary of typing into a website.
  2. It creates a point-of-sale inconvenience, particularly on activation (which happens the first time you use your card online -- rather than by prior notification from your bank, as would be preferable). By interrupting the transaction, it incentivises poor password choice.
  3. It shifts liability onto the customer -- and by presenting terms and conditions at point-of-sale it rather steers the customer towards accepting this liability without due consideration.
  4. It offers little authentication from the bank to the customer, besides a 'memorable phrase' entered by the user during set-up (which, similarly to point 2, may have been poorly chosen in an impatience to return to the transaction). This phrase, moreover, is found to be vulnerable to a man-in-the-middle attack.
The authors' assessment is that 3-D Secure was chosen for economic reasons, precisely because of the liability shift. They propose that the technical solution to a secure way forward should involve transaction authentication (rather than single sign-on), either via SMS messaging or by issuing personal hardware devices similar to CAP readers.

In the discussion which followed, we focused on the question of how to foster the development of sound security mechanisms, and what role academia could or should play. We identified, as a particular problem, the misaligned incentives between banks and customers, with the former tending to prefer solutions which shift liability on to the latter -- a primary driver for the adoption of the 3-D Secure protocol for online payments.

If banks can reduce their liability they have less incentive to prioritise security. Competition between payment system providers is a further obstruction: current best practice seems very much to favour open development of cryptographic schemes, so that candidates for standardisation are thoroughly and publicly scrutinised (for example, the open NIST competitions towards standards for block ciphers and hash functions). Where cryptography is developed secretly or proprietarily it does not benefit from the combined wisdom of the research community as a whole, and is often found to be weak as a consequence (MIFARE being an infamous example).

As such, the feedback from bodies like the Card Stakeholders Group should be carefully scrutinised by those with security expertise. We discussed the possible usefulness of a Europe-wide standardisation body (or maybe just a limited-time initiative), inspired by NIST but with a focus on protocols and not just primitives.

We also considered the user-side problem -- inherent in any security measure -- of the incentive to bypass good practice in order to make life simpler, for example by choosing weak passwords or 'switching off' secure settings which complicate or slow down a system. Designers need to take more responsibility for ensuring that security aligns with user needs and understanding as far as possible, rather than shifting blame for user-introduced weaknesses by default. Hence, technical expertise needs to be accompanied by sociological and psychological insight so that human factors are taken seriously in the design of new payment processes; otherwise, the types of flaws observed in 3-D Secure will persist in future proposals.

Thursday, March 22, 2012

TCC, Wednesday PM


On the afternoon of the last day there were two sessions, on encryption with extended functionality and on pseudorandomness. The talks dealing with randomness introduced some interesting concepts. For instance, the one given by B. Fuller, 'A Unified Approach to Deterministic Encryption: New Constructions and a Connection to Computational Entropy' (the longest title at the conference), explained how to use the message to be encrypted as a source of randomness in deterministic encryption. Security is defined by requiring the message source M to have high min-entropy H(M) = u, so that the probability that M outputs any particular message m is at most 1/2^u. This means that even the most likely message is hard to guess.

Nevertheless, the talks I enjoyed the most were the first two.
Subspace Learning with Errors (SLWE) was presented by K. Pietrzak (IST Austria). LWE is a cryptographic primitive related to lattice problems, introduced by Regev in 2005. Since then it has been widely studied and used as a building block in several schemes; most of the newest FHE schemes base their security on it. It works as follows: there are two worlds. In the ideal world we are given samples (a, b), where a is an integer vector with n coordinates in Z_q and b is a value sampled uniformly from Z_q. In the real world b is replaced by <a,s>+e, where <,> denotes the inner product, s is a secret vector and e is a small value in Z_q sampled from some error distribution. The goal of the attacker is to distinguish between the two worlds. Regev gave a quantum reduction from well-known hard lattice problems (there also exists a classical reduction, but it leads to less practical schemes). LWE is a generalization of LPN (learning parity with noise), and their hardnesses are polynomially equivalent. SLWE differs from LWE in that an attacker trying to learn the secret s can adaptively query for LWE samples: the new oracle receives as input a together with Z_q-affine transformations A and B, and outputs <A(a), B(s)> + e. The matrices defining A and B must have rank greater than n to avoid trivial attacks. Not surprisingly, if SLWE is hard then so is LWE (the LWE oracle corresponds to always querying the identity maps). What is more interesting, and what Pietrzak proved in his paper, is the converse. He also gave two applications. The first is an improvement of this protocol, which only achieves security against passive adversaries: using SLPN instead of LPN he showed that one can get security under active attacks. Most importantly, he noted that any secure scheme based on LWE remains secure under related-key attacks (where the adversary can query not only for decryptions under the secret key s but also under related keys s' = f(s), for some function f within a given family). More precisely, such schemes are RKA-secure with respect to affine functions of s.
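
To make the two worlds concrete, here is a minimal sketch of LWE sample generation (my own toy Python, with a tiny modulus and dimension chosen purely for illustration, and a crude small-interval error standing in for the usual discretised Gaussian):

import random

q, n = 97, 8                                   # toy modulus and dimension, far too small for security
s = [random.randrange(q) for _ in range(n)]    # the secret vector

def ideal_sample():
    """Ideal world: (a, b) with b uniform in Z_q."""
    a = [random.randrange(q) for _ in range(n)]
    return a, random.randrange(q)

def real_sample():
    """Real world: (a, <a,s> + e mod q) with a small error e."""
    a = [random.randrange(q) for _ in range(n)]
    e = random.randint(-2, 2)                  # stand-in for a narrow error distribution
    return a, (sum(ai * si for ai, si in zip(a, s)) + e) % q

# The LWE assumption is that no efficient algorithm can distinguish a stream of
# real_sample() outputs from a stream of ideal_sample() outputs.

In the SLWE variant, the oracle additionally applies adversarially chosen affine maps A and B to a and s before forming the sample.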

The second talk was given by David A. Wilson (MIT), presenting this paper. He explained how bounded-collusion IBE (BC-IBE) schemes can be constructed using homomorphisms. The main ideas are the following. 1) Key homomorphism: assume secret keys are tuples of elements in some ring R. The keys are homomorphically structured if any R-linear combination of key tuples gives rise to a valid secret key, whose public key is efficiently derivable from the public keys of the operands. 2) Identity map compatibility: a function on n user identity tags that outputs n linearly independent vectors in R^n. 3) Linear related-key semantic security: a security model in which semantic security holds even if the challenge secret key is a linear combination of fixed secret-key values. The attacker can query for any other linear combination of the secret keys, as long as it is independent of the challenge. My understanding is that this (somewhat artificial) new security notion is introduced to deal with homomorphic images of the fixed secret keys. Any cryptographic scheme satisfying these three properties can be used to construct BC-IBE schemes. He gave an explicit construction based on this framework, whose underlying hardness assumption is the Quadratic Residuosity problem.
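
As a toy illustration of the key-homomorphism property (my own example in a discrete-log setting, not the construction from the talk), a linear combination of secret exponents corresponds to the same combination of the public keys:

import random

p, g = 2**61 - 1, 3        # illustrative prime modulus and base, not real parameters

def keygen():
    sk = random.randrange(p - 1)
    return sk, pow(g, sk, p)

def combine_secret(sks, coeffs):
    # R-linear combination of secret keys (here R = Z_{p-1}).
    return sum(c * sk for c, sk in zip(coeffs, sks)) % (p - 1)

def combine_public(pks, coeffs):
    # The matching combination of public keys, computed "in the exponent".
    out = 1
    for c, pk in zip(coeffs, pks):
        out = (out * pow(pk, c, p)) % p
    return out

pairs = [keygen() for _ in range(3)]
coeffs = [2, 5, 7]
sk_new = combine_secret([sk for sk, _ in pairs], coeffs)
pk_new = combine_public([pk for _, pk in pairs], coeffs)
assert pow(g, sk_new, p) == pk_new   # the combined pair is again a valid key pair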

Wednesday, March 21, 2012

TCC 2012: Day 2 AM

On Tuesday morning, there were three sessions about leakage resilience, hash functions, and differential privacy.

The first talk presented a way to compile a circuit such that the result is leakage resilient in the bounded independent model. In this model, the memory is split into parts, and in each period as many bits as a given fraction of the key length can be leaked, but only from some of the parts. The compiler creates a two-party protocol that computes the circuit, where each player and a communication resource are assigned a part of the memory. In every step of the protocol, leakage can occur from one player and from the communication resource. It is also assumed that there is a leakage-free source that gives correlated bits to the players. Every wire in the circuit is represented by one random bit string per player, such that the inner product of the two strings is the bit on the wire. The leakage-free source is used to generate these two strings. If the first bit of the first player's string is always 1, negation can be computed by flipping the first bit of the second player's string. For an AND gate, the players can locally compute the multiplication of all n^2 combinations of bits of their strings; the inner product of the results is then the AND of the input bits. However, the output strings are not random and are too long. Running a randomization step, publicly computing the inner product of the excess string, and then negating if the result is 1 brings the strings back to the desired form.
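
A minimal sketch of this wire encoding (my own illustration over GF(2)): a bit is shared as two random strings whose inner product equals the bit, and with the first player's string always starting with a 1, negation is a purely local operation.

import random

n = 8                                         # share length, illustrative only

def inner(x, y):
    return sum(a * b for a, b in zip(x, y)) % 2

def encode(bit):
    """Share `bit` as (x, y) with <x,y> = bit and x[0] = 1."""
    while True:                               # rejection sampling keeps the sketch short
        x = [1] + [random.randint(0, 1) for _ in range(n - 1)]
        y = [random.randint(0, 1) for _ in range(n)]
        if inner(x, y) == bit:
            return x, y

def negate(x, y):
    # Flipping y[0] flips the term x[0]*y[0] = y[0], hence the inner product.
    return x, [1 - y[0]] + y[1:]

x, y = encode(1)
assert inner(*negate(x, y)) == 0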

The second and the third talk of the leakage resilience session covered leakage resilience amplification by repetition and the universal composability with leakage resilience, respectively.

The first talk in the hash function session discussed Feistel networks with random oracles. For someone not familiar with hash functions, it was interesting to see that the security does not simply increase continuously with the number of rounds; instead, there is a minimal number of rounds needed to achieve a certain security definition. The topic of the second talk was a hash function based on a blockcipher, with a connection to discrete geometry.

The final session of the morning was on differential privacy. This concept tries to ensure that the answers to database queries do not give away sensitive information about an individual entry. A possible way to achieve this is to add noise to the answers. In the first talk of the session, lower bounds on this noise were presented. Another way to achieve differential privacy was covered in the last talk. A sanitizer keeps a local approximation to the database, which is empty in the beginning. For every query, the sanitizer queries the real database in a noisy way and also its local approximation. If the two answers are close enough, the answer of the approximation is returned; otherwise, the approximation is updated and the noisy answer of the real database is returned. After some number of approximation updates, the approximation is close enough, and from then on all queries will be answered from the approximation. If the noise used earlier was big enough, no query will reveal sensitive data. This technique of approximating a database is called iterative database construction.
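
A rough sketch of the iterative idea (a simplified illustration of my own, not the construction from the talk; the update rule here is a trivial stand-in): answer from the local approximation when it agrees with a noisy answer from the real database, and update the approximation otherwise.

import math, random

def laplace_noise(scale):
    """Sample Laplace noise via the inverse CDF; larger scale means more privacy, less accuracy."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def idc_answer(query, database, approx, update, threshold, scale):
    """One round of an iterative-database-construction style mechanism (simplified)."""
    noisy = query(database) + laplace_noise(scale)
    if abs(noisy - query(approx)) <= threshold:
        return query(approx), approx                   # the approximation is good enough
    return noisy, update(approx, query, noisy)         # answer noisily and refine the approximation

# Toy usage: numeric data, an average query, and an "approximation" that just records noisy answers.
data = [random.gauss(50, 10) for _ in range(1000)]
average = lambda xs: sum(xs) / max(len(xs), 1)
answer, approx = idc_answer(average, data, [0.0], lambda a, q, v: a + [v], threshold=1.0, scale=2.0)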

Tuesday, March 20, 2012

TCC 2012: Day 1 PM

The afternoon of the first day contained several interesting talks; however, I particularly enjoyed the two talks on non-interactive zero knowledge (NIZK), so I will aim to summarize those in this post.

Jens Groth gave the first invited talk of the conference, an hour-long lecture on NIZK. In such a protocol, a prover who knows a particular statement wishes to provide a verifier with a proof of this statement. The non-interactive requirement means that this must be done with just a single message. As with standard zero knowledge, the completeness, soundness and zero knowledge requirements must be satisfied. Jens focussed on NIZK in the Common Reference String (CRS) model, where both the prover and verifier have access to a shared string to allow for non-interactivity.

To prove an arbitrary statement in NP, it suffices to give a proof for an NP-Complete problem such as the Circuit Satisfiability problem (SAT). The basic idea behind doing this is to use a homomorphic commitment scheme. The prover then commits to the satisfying inputs of the circuit, and also to the intermediate values on every wire of the circuit. For each gate, the prover must then provide proofs of correctness of the commitments.
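
As a small illustration of the kind of homomorphic commitment that makes this work (a Pedersen-style commitment; the parameters below are toy values of my own and nowhere near secure), the product of two commitments is a commitment to the sum of the committed values:

import random

# Toy subgroup parameters: p = 23, q = 11 with q | p-1; g and h generate the order-q subgroup.
p, q, g, h = 23, 11, 2, 9

def commit(m, r):
    """Pedersen-style commitment Com(m; r) = g^m * h^r mod p, with m, r in Z_q."""
    return (pow(g, m % q, p) * pow(h, r % q, p)) % p

m1, r1 = 3, random.randrange(q)
m2, r2 = 7, random.randrange(q)

# Additive homomorphism: Com(m1; r1) * Com(m2; r2) = Com(m1 + m2; r1 + r2)
assert (commit(m1, r1) * commit(m2, r2)) % p == commit(m1 + m2, r1 + r2)

In the circuit-SAT proof the prover commits to every wire value with such a scheme and then shows, gate by gate, that the committed output is consistent with the committed inputs.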

Groth focussed on techniques that require standard assumptions; either trapdoor permutations or factoring and pairing-based assumptions for more efficient proofs. These all lead to proofs of linear or quasi-linear length in the circuit size.

Despite being a theoretically nice way of proving any statement in NP, this method would be highly inefficient for proving statements other than SAT due to high constants in the reduction from most NP problems to SAT. It would seem, then, that for any particular language one has to come up with a new NIZK proof to overcome the efficiency barrier. Groth described a compromise: the pairing product equations problem is to find x,y satisfying:
e(a,x)e(x,y) = 1
e(b,y) = 1

when given group elements a,b. This has a direct application to group signatures and many other pairing-based languages, yet is also NP-Complete so can be used to create a proof for any NP language.

A complementary talk was given later in the afternoon by Helger Lipmaa, detailing more efficient NIZK proofs under slightly stronger assumptions. He avoids having to use random oracles, but requires pairing-based assumptions and knowledge extractors. To prove the validity of the commitments, he describes Hadamard Product and Permutation zero-knowledge arguments, which are highly parallelizable and can be
used to prove circuit satisfiability. The CRS length and prover computation time are both reduced from quadratic to quasi-linear.

Further Reading:
Kilian & Petrank '98 (original proof for NP using hidden random bits)
Groth & Sahai '06 (proof for NP using pairing product equations)

TCC : Tuesday PM

Dragging oneself away from the lovely views of Mount Etna in Taormina to go to a bunch of talks on cryptography can be difficult. But the talks this afternoon made it worthwhile. There were a total of eight talks, so I will only pick out a few highlights (mainly from the first part, on PRFs).

The first two talks, on constructing PRFs, I thought were particularly well delivered. The first, by Berman and Haitner, looked at the problem of constructing an adaptively secure PRF (i.e. one secure when the adversary's queries are chosen adaptively) from a weaker, non-adaptively secure PRF F and a pairwise independent hash function H. The speaker showed that this can be done by simply applying H and then applying F; the hash function, however, needs to be restricted to have a co-domain of the size of the "running time" T of an adversary that could break F with probability 1/T.
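
A minimal sketch of the shape of this construction (my own toy instantiation: HMAC-SHA256 stands in for the underlying non-adaptive PRF F, and a simple modular hash stands in for H; the careful sizing of the co-domain in terms of T is only gestured at):

import hashlib, hmac, random

T = 65537                                  # stand-in for the bound tied to the adversary's running time
a, b = random.randrange(T), random.randrange(T)
key = bytes(random.randrange(256) for _ in range(32))

def H(x):
    """Toy pairwise-independent hash into a domain of size roughly T."""
    return (a * x + b) % T

def F(k, x):
    """Stand-in for the non-adaptively secure PRF F."""
    return hmac.new(k, x.to_bytes(8, "big"), hashlib.sha256).digest()

def adaptive_prf(x):
    # The composition from the talk: first H, then F.
    return F(key, H(x))

print(adaptive_prf(123456789).hex())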

The second talk, by Jain, Pietrzak and Tentes, took a different approach to constructing a PRF, namely from a PRG G whose co-domain is twice the size of its input. The trick is quite neat. First they look at the GGM construction, which computes PRF(k,x) as follows:
  •  Write G(y) as two functions G_0(y)||G_1(y), where each G_i has an n-bit co-domain.
  • Write x in binary as x_n, ..., x_1.
  • Output G_{x_n}(G_{x_{n-1}}(...G_{x_1}(k))).
Call this function GGM(k,x). It requires n calls to the function G to evaluate it on a string of n bits. A trick of Levin is to first apply a hash function with co-domain the m-bit strings and compute GGM(k,H(x)), which requires only m calls to the PRG G, although the key size of the PRF increases, since one also needs a key for the hash function H (I have hidden this in the notation above).

The problem is that this construction runs into trouble because of collisions in H. To get around this, the authors propose the following construction

H'(GGM(k,H(x)),x)

where H' is a different, keyed hash function with key GGM(k,H(x)) and message x. I thought this was rather neat myself.
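
A rough sketch of the whole construction (my own toy instantiation: SHA-256 plays the role of the length-doubling PRG G, and the hash functions below are simple stand-ins rather than the ones required by the security proof):

import hashlib

n = 16                                     # seed/output length in bytes for the toy PRG

def G(seed):
    """Toy length-doubling PRG: a 16-byte seed expands to G_0(seed) || G_1(seed)."""
    out = hashlib.sha256(b"G" + seed).digest()
    return out[:n], out[n:]

def GGM(k, bits):
    """GGM PRF: walk down the tree, choosing G_0 or G_1 according to each bit."""
    state = k
    for bit in bits:
        state = G(state)[bit]
    return state

def H(x, m=32):
    """Stand-in hash with m output bits, as in Levin's domain-shrinking trick."""
    digest = int.from_bytes(hashlib.sha256(b"H" + x).digest(), "big")
    return [(digest >> i) & 1 for i in range(m)]

def Hprime(key, x):
    """Stand-in for the keyed hash H' used in the paper's fix."""
    return hashlib.sha256(b"H'" + key + x).digest()

def prf(k, x):
    # The construction from the talk: H'(GGM(k, H(x)), x).
    return Hprime(GGM(k, H(x)), x)

print(prf(b"\x00" * n, b"hello").hex())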

The short mini-session on PRFs was completed by a talk from Krawczyk (with co-authors Dachman-Soled, Gennaro and Malkin), who looked at various theoretical aspects but concluded at the end that the PRF used in the TLS implementation of Diffie-Hellman key exchange is secure. This was much to Hugo's "distress", as he did not think it should morally be secure at all, so it was a surprise when it was indeed shown to be.

For the other talks I had seen many of the papers before, so I was not so surprised or interested (such is the effect of e-Print on conference attention spans). However, the "Verifiable Computation from ABE" paper is definitely worth a look for people interested in this area. The key contribution is that they obtain a VC scheme without needing FHE or PCPs (both of which are rather inefficient), by utilizing an ABE scheme. However, the efficiency of their solution is still not really practical, since the scheme handles functions with single-bit outputs; to extend the technique to longer outputs one needs to repeat it for each output bit. Given that the underlying building blocks are ABE schemes, which are themselves not that efficient in practice, there seems to be a lot of room for further work in this area.

Monday, March 19, 2012

TCC-Day 1, session 1: MPC

The first talk of the session, Computing on Authenticated Data, defined notions for deriving signatures on messages for which a certain predicate is satisfied. There has been much progress in this area recently, with work on redaction and quoting of authenticated data, homomorphic signatures and transitive signatures; in some sense this can be seen as a signature analogue of fully homomorphic encryption. One particular new property identified in this work is context hiding: a derived signature on a message m' should not leak any information about the original source message; for example, a signed quote should not leak any information about the message from which it was quoted. This property can be seen as a data analogue of hiding the signer amongst a group of users in group signatures.

The second talk focused on constructions of identifiable secret sharing schemes. This is a secret sharing scheme which, during the reconstruction phase, either outputs the correct secret or publicly identifies the set of all cheaters. In this work they modify the reconstruction algorithm to inform only the set of honest players of the cheaters (which is arguably a more natural requirement). Under this new notion they show that unconditional MPC can be realized in the presence of a dishonest majority.

The final talk of the session, given by Ivan Damgaard, focused on two-party secure computation with low communication. In this setting two parties with private inputs wish to compute some function of those inputs with low communication costs. Famously, Yao showed that any poly-time two-party functionality can be computed securely, even against a malicious adversary. For semi-honest adversaries there exists a straightforward solution based on FHE and NIZK proofs. In this work the authors successfully build a protocol where the communication is only polylogarithmic in the circuit size, assuming the existence of collision-resistant extractable hash functions, which can be instantiated via the knowledge-of-exponent assumption.

Friday, March 16, 2012

Oblivious RAM

Today's study group was on Oblivious RAM - a concept introduced by Goldreich and Ostrovsky which enables a client storing only a small amount of data locally to store a large amount of data in, e.g., the cloud and access it while hiding the identities of the items being accessed.

Roughly speaking, the adversary model for Oblivious RAM is as follows: an adversary can't distinguish between equal-length sequences of queries made by the client to the server. Early solutions required the server to store O(N*log(N)) data items, and each access to a data item actually required O(log^3(N)) data requests. Later work of Pinkas & Reinman brought this down to O(N) storage and O(log^2(N)) requests; however, their solution was insecure, although a later improvement made it secure. Interestingly, there is a theoretical lower bound of log(N) on the number of requests. Essentially, the overhead comes down to which data-oblivious sorting algorithm one chooses.

We focus on the square-root solution, which is conceptually the simplest to understand. In essence we have a memory of addresses [0,N], which we expand by an extra sqrt(N) locations holding sqrt(N) dummy items. We also have a "shelter" of size sqrt(N). The first step is to randomly permute all locations [0, N + sqrt(N)] (i.e. everything outside the shelter) using a permutation pi derived from a random oracle. We then carry out sqrt(N) memory accesses as follows, keeping previously accessed memory in the shelter locations. Say we wish to access memory at location i. First we scan through the whole shelter, i.e. locations [N+sqrt(N)+1, N+2*sqrt(N)], and check whether the contents of the original memory location are held there. If it is not there, it is at location pi(i). If it is there, we instead access the next dummy word (i.e. if we are on the j'th access, for 1 <= j <= sqrt(N), we use the dummy at pi(N+j)). For further details see http://dl.acm.org/citation.cfm?id=233553
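
A compressed sketch of one epoch of this access procedure (my own Python illustration; re-permuting and emptying the shelter after sqrt(N) accesses, and encrypting the stored items, are omitted):

import math, random

N = 16
sqrtN = math.isqrt(N)

# Physical layout: N real slots plus sqrt(N) dummies, scrambled by pi; the shelter is scanned in full.
pi = list(range(N + sqrtN))
random.shuffle(pi)                           # stand-in for the random-oracle-derived permutation
server = {pi[i]: ("real", i) for i in range(N)}
server.update({pi[N + j]: ("dummy", j) for j in range(sqrtN)})
shelter = {}                                 # plays the role of the shelter locations
accesses = 0

def access(i):
    """Obliviously read logical address i; at most sqrt(N) accesses per epoch."""
    global accesses
    accesses += 1
    found = i in shelter                     # 1. always scan the whole shelter
    if found:
        _ = server[pi[N + accesses - 1]]     # 2a. touch the next dummy, pi(N + j), to hide the hit
        value = shelter[i]
    else:
        value = server[pi[i]]                # 2b. touch the real location pi(i)
        shelter[i] = value                   # and move the item into the shelter
    return value

print(access(3), access(3))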

The significantly more efficient "hierarchical" approach is a lot more involved; the interested reader should consult the Goldreich & Ostrovsky paper. Also of interest is the multiparty approach of Damgaard et al. (eprint.iacr.org/2010/108.pdf), which removes the need for a random oracle.

Jake

Friday, March 9, 2012

Study Group: Computing models. Subleq etc.

The study group on Mar 9th 2012 was given by Simon and Mike. The talk focused on computing models, subleq and the like.

Subleq-OISC:

OISC stands for “one instruction set computer” – analogous to the well-known CISC and RISC abbreviations. The OISC instruction “set” indeed consists of only one instruction, but is nonetheless Turing-complete. This instruction takes three parameters and is concisely described as “subtract and branch on result less than or equal to zero,” abbreviated subleq a b c.

In “C” style pseudocode, this instruction does the following:

*b-= *a;

if (*b <= 0) goto c;
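
To make the semantics concrete, here is a minimal subleq interpreter sketch (my own illustration; code and data share one memory, each instruction occupies three consecutive words, and I adopt the common convention that a negative branch target halts the machine):

def run_subleq(mem, pc=0, max_steps=10_000):
    """Run a subleq program held in `mem` (a list of integers)."""
    for _ in range(max_steps):
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        if mem[b] <= 0:
            if c < 0:
                return mem                  # halt
            pc = c
        else:
            pc += 3
    return mem

# Toy program: clear the word at address 6, then halt.
# Words 0-2: subleq 6 6 3   (CLR 6, then continue with the next instruction)
# Words 3-5: subleq 6 6 -1  (halt)
# Word 6:    data, initially 42
memory = [6, 6, 3, 6, 6, -1, 42]
run_subleq(memory)
assert memory[6] == 0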

The following are subleq implementations of some common operations (Z denotes a memory location holding zero, and the $-relative operands are branch targets):

CLR a (clear *a)
subleq a a $+1

JMP c (unconditional jump to c)
subleq Z Z c

MOV a b (copy *a to *b)
subleq b b $+1    *b = 0
subleq a Z $+1    Z = -*a
subleq Z b $+1    *b = 0 - (-*a) = *a
subleq Z Z $+1    Z = 0

ADD a b c (*c = *a + *b)
subleq a Z $+1    Z = -*a
subleq b Z $+1    Z = -*a - *b
subleq c c $+1    *c = 0
subleq Z c $+1    *c = *a + *b
subleq Z Z $+1    Z = 0

SUB a b c (*c = *b - *a, via temporaries s and t)
subleq t t $+1
subleq s s $+1
subleq a t $+1    t = -*a
subleq b s $+1    s = -*b
subleq s t $+1    t = *b - *a
subleq c c $+1
subleq s s $+1
subleq t s $+1    s = *a - *b
subleq s c $+1    *c = *b - *a

DIV a b c (division by repeated subtraction; v, w and N are auxiliary locations)
MOV b v
MOV a w
CLR c
subleq N c $+1
subleq w v $+4
subleq Z Z $-8

The Ultimate RISC:

The only instruction in the ultimate RISC `instruction set' is a two address instruction that moves a single word from one memory location to another. The instruction execution unit of such a machine can hardly be considered to be a processor, since it performs no computations; thus, the term central processing unit is inappropriate here. If such a machine is to be useful, it must, of course, be able to compute; this problem can be solved by adding a memory mapped arithmetic unit.

This architecture is not intended to be a practical architecture. However, it can be made reasonably convenient with the addition of a small amount of extra hardware. For example, the attachment of a memory management unit between the instruction execution unit and the memory allows for indexing, and one additional flip-flop is enough to allow convenient conditional branches. Furthermore, this architecture may be pipelined, and doing so introduces all of the problems of instruction scheduling traditionally associated with RISC systems.

A minimal implementation of the ultimate RISC architecture requires an instruction execution unit, a memory, input-output devices, and a bus to connect these. The basic algorithm executed by the instruction execution unit is most easily expressed if a memory address fits exactly in a word. This has been assumed by the register transfer description of the instruction execution unit shown below:

registers pc, addr, temp;


repeat
1: addr := M[pc]; pc := pc + 1;
2: temp := M[addr];
3: addr := M[pc]; pc := pc + 1;
4: M[addr] := temp;
forever;
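
A small simulation sketch of this fetch-execute loop (my own Python illustration using a dictionary as memory; the memory-mapped arithmetic unit described below is not modelled):

def step(M, pc):
    """Execute one move instruction: copy M[src] to M[dst], where src and dst follow pc."""
    src = M[pc]
    temp = M[src]        # temp := M[addr]
    dst = M[pc + 1]
    M[dst] = temp        # M[addr] := temp
    return pc + 2

# A single instruction at addresses 0-1 that copies the word at 10 to the word at 11.
M = {0: 10, 1: 11, 10: 42, 11: 0}
pc = step(M, 0)
assert M[11] == 42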

The condition code tests shown in the following table are based on those available on machines descended from the PDP-11, where N signifies a negative result, Z signifies a result of zero, O signifies a 2's complement overflow, and C signifies a carry out of the sign bit. The representation of Boolean values produced by the arithmetic unit is determined by the need to perform conditional branch instructions. True is represented by the value 2, and False is represented by zero.

Table: Arithmetic Unit Operations

address   write            read
FFF0      acc = data       data = acc
FFF1      acc = acc-data   data = N
FFF2      acc = data-acc   data = Z
FFF3      acc = data+acc   data = O
FFF4      acc = data^acc   data = C
FFF5      acc = data&acc   data = N^O
FFF6      acc = data|acc   data = (N^O)|Z
FFF7      acc = data>>1    data = C^not(Z)


Saturday, March 3, 2012

CryptoForma Edinburgh


The last talk at the CryptoForma meeting held in Edinburgh was given by Nadhem AlFardan from Royal Holloway.
He presented the work titled "Plaintext-Recovery Attacks Against Datagram TLS" that received the best paper award at NDSS this year.

Datagram TLS (DTLS) is a TLS extension suitable for datagram transport. It was introduced in 2004 and implementations of it can be found in both OpenSSL and GnuTLS.

The attack shown is a padding oracle attack. Informally, padding oracle attacks work by adjusting a ciphertext in some specific way before submitting it to an honest decryptor. If a padding error is returned, then the adversary is able to gain some information about the underlying plaintext.
For instance, the padding oracle attack against TLS in OpenSSL introduced by Vaudenay makes use of timing differences arising from the fact that messages with bad padding were rejected faster than other messages.
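
To illustrate the general principle (a generic CBC padding-oracle sketch against PKCS#7-style padding, not the DTLS-specific timing attack from the paper; padding_oracle is a hypothetical function that reports whether a submitted ciphertext decrypts to valid padding):

def recover_last_byte(prev_block, target_block, padding_oracle):
    """Recover the last plaintext byte of `target_block` when encrypted in CBC mode."""
    for guess in range(256):
        forged = bytearray(prev_block)
        # If `guess` equals the plaintext byte, the decryptor sees a final byte of 0x01 (valid padding).
        forged[-1] = prev_block[-1] ^ guess ^ 0x01
        if padding_oracle(bytes(forged), target_block):
            # A longer accidental valid padding can give a false positive; a second query
            # with forged[-2] modified would rule that out (omitted for brevity).
            return guess
    raise ValueError("no candidate found")

The same idea extends byte by byte to the whole block; in the timing variant, the oracle's answer is inferred from how quickly a response arrives rather than from an explicit error message.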

Since DTLS does not return error messages when processing invalid padding, implementations of DTLS might be expected to be resistant to this kind of attack.

Interestingly, the vulnerability presented by Nadhem shows that such attacks are still possible: instead of error messages, one can exploit the timing of heartbeat response messages, or of any response with a predictable delay sent by an upper-level protocol. The work also introduces techniques to amplify the timing differences, which make use of "trains" of valid or invalid packets.

Friday, March 2, 2012

CryptoForma Edinburgh

Yesterday and today saw two days of CryptoForma in Edinburgh, discussing topics including voting (where I gave a talk on our Helios work) and the latest approaches to formal verification.
A theme that spanned many talks was the difficulty of verifying exactly what a protocol does - often a tiny difference, down at the bit level, between a security model and what the protocol actually does can make all the difference.

A very promising solution to this problem is to verify security properties on the actual source code, as presented in an excellent tutorial by Cédric Fournet today using the F#/F7 languages.
The properties that can be expressed here are not "just" a formal model that would need a computational soundness theorem to link it to the actual code but actual cryptographic properties like IND-CPA or CCA2.

Even MS Research can't prove AES to be CCA2 (and, as some would no doubt point out, an asymptotic result means only so much for n=256), but one can take a modular approach: assume that the cipher in the System.Cryptography library is CCA2 and then prove an RPC protocol secure under this assumption, using something that looks suspiciously like a game hop. Except that this is all done automatically in Visual Studio at compile time.

The key ingredient here is refinement types, comprising a base type and predicates on it that can be verified by a theorem prover. For example, "block: bytes {length = 8}".
Furthermore, one can set predicates in the execution of a primitive, for example calling MAC(key, msg) sets the implicit predicate auth(msg) and VERIFY(key, msg, mac) has the postcondition "VERIFY(key, msg, mac) = 1 => auth(msg)", which models existential unforgeability.
Where this gets really interesting is when such predicates are set at the protocol level: auth(msg) is set not when MAC is called but when the message msg should logically/semantically be authenticated by a key-holder. The same postcondition on VERIFY can now be checked relative to the whole protocol, say a client-server RPC protocol.
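
As a rough runtime analogue of these static predicates (a toy Python model of my own, not F7; the assertion mimics the VERIFY postcondition, which in F7 is discharged statically by the theorem prover):

import hashlib, hmac

authenticated = set()                      # models the auth(msg) predicate

def MAC(key, msg):
    authenticated.add(msg)                 # calling MAC establishes auth(msg)
    return hmac.new(key, msg, hashlib.sha256).digest()

def VERIFY(key, msg, tag):
    ok = hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).digest())
    assert not ok or msg in authenticated  # postcondition: VERIFY(...) = 1 => auth(msg)
    return ok

key = b"k" * 32
tag = MAC(key, b"RPC request 1")
assert VERIFY(key, b"RPC request 1", tag)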

Of course even this technique can't guarantee secure protocols in all settings (timing side-channels are not covered, nor malware that interferes with the checked program at runtime) but it would still be a great step in the right direction if compile-time formal modelling and verification became as natural a step in protocol implementation as unit testing.

Study Group on Attacks on Software

The study group on March 2, 2012, 10:00am was given by Valentina and Georg (and obviously scribed by me). The main theme was attacks on software, with four papers on various topics being presented.

The first paper, presented by Valentina (Cappos et al., A look in the mirror: attacks on package managers, CCS 2008), examined the security of software package repositories. A software package contains the actual files of the software, metadata on dependencies and the package's functionality, and optionally a signature. Additionally there might be "root metadata", which contains the metadata of multiple packages and an optional signature. Multiple software packages are hosted on a repository (which is usually authoritative) and a number of (non-authoritative) mirrors. Clients run package managers, which collect the required packages for installation.

Four different security modes have been evaluated in the paper:
  1. No security
  2. Signatures embedded with package
  3. Signature on detached metadata
  4. Signature on root metadata

The threat model for the security evaluation assumes that the attacker can communicate (e.g. via MITM or a hijacked mirror) with the package manager running on the client, under the following conditions:

  • Attacker can serve arbitrary files
  • Attacker does not know in advance which package will be requested
  • Attacker does not have the signing key used for signing packages, metadata, etc.
  • Attacker has access to outdated packages, outdated package metadata, and outdated root metadata
  • Attacker is aware of vulnerabilities in outdated packages and is able to exploit those
  • Attacker does not know of vulnerabilities in the latest versions of packages
  • If the package manager supports signatures, they are used
  • If supported, expiration times in root metadata are used

Various attacks have been evaluated for the four security modes:

  • Sending an arbitrary package
  • Replay attack (sending an older package version)
  • Freeze attack (making newer package versions invisible)
  • Extraneous dependencies (rewrite of metadata)
  • Endless data (an endless data stream served by the attacker)

The results of the evaluation are that all security modes are vulnerable to at least some of the attacks. For sufficient security, several security mechanisms should be combined.

The second paper (presented by Georg) was Checkoway et al., Are Text-Only Data Formats Safe?
Or, Use This LaTeX Class File to Pwn Your Computer, LEET 2010.

The authors show how various LaTeX/BibTeX constructs could be used to infect machines with malware. The first example was the LaTeX \write command, which allows (depending on the underlying platform) write access to the file system. On a MiKTeX installation on Windows this could allow access to any file, while on Linux only files in or below the current directory with appropriate permissions would be affected. A similar problem is posed by the LaTeX \read command, which allows reading from the file system. These threats could be particularly harmful on servers which offer online LaTeX compilation services.

Countermeasures which selectively scan for \write and \read commands and sanitize them cannot be seen as sufficient protection, due to the versatility of the LaTeX language. For example, the backslash character could be redefined (e.g. \ -> X), which in turn redefines the commands (Xwrite and Xread), thus defeating simple scanning.

Another possibility for introducing malware is the use of the BiBTeX @preamble{...} command, which inserts its argument into the LaTeX preamble. A virus could even be scattered across multiple @preamble commands to obscure it from scanners.

Also, LaTeX class and style files present another option for pushing malware onto a host, especially if their use is mandated for a specific conference.

Denial-of-service attacks (especially on online LaTeX processing services) could be achieved with "endless loop" constructs like "\loop \iftrue \repeat" and "\def \nothing {\nothing}".

The third paper, Brumley et al., Remote Timing Attacks Are Still Practical, IACR ePrint 2011/232, was again presented by Valentina. It examines timing vulnerabilities in OpenSSL 0.9.8, which have been reported in CERT VU #36044. The concrete example is Montgomery point multiplication, which is usually immune to timing attacks but in OpenSSL becomes vulnerable due to some optimization techniques applied in the software implementation. The attack works in two phases: 1) collecting a number of signatures and 2) a lattice attack. In the lattice attack, the attacker takes advantage of the length of the scalar k. For example, if the scalar k has 6 leading zeros, the success probability of the attack is 1 with only 50 signatures.

Georg presented the last paper: Nguyen et al., Can We Trust Cryptographic Software? Cryptographic Flaws in GNU Privacy Guard v1.2.3, EUROCRYPT 2004. GNU Privacy Guard (GPG) is an open-source implementation of the PGP program, capable of encrypting and signing emails. It is deployed with all major Linux distributions and at the time of the writing of the paper, version 1.2.3 was the most current version of GPG.

It turns out that tweaks for efficiency (concerning the generation of random parameters) in the software create vulnerabilities. This concerns ElGamal signatures (selectable in "expert mode" over the default of DSA). The attack is very efficient, requiring only one signature and one second of CPU time. As mitigation, the use of DSA is suggested by the authors.