Friday, March 18, 2011
One of the highlights of the last week in Zurich was undoubtedly Radu Sion's (http://www.cs.sunysb.edu/~sion/) talk on the economics of cloud computing in relation to security. Measuring the cost of outsourced computation by pricing a single CPU cycle in "picocents" (one picocent is 10^-12 cents, i.e. 10^-14 dollars) allows one to weigh up the real economic cost of outsourcing. In particular, one can ask: how many additional cloud cycles can we spend on cryptography before outsourcing becomes too expensive? Most talks that followed had members of the audience asking this very question. In particular, novel encryption schemes such as functional encryption may be very cool, but they aren't going to come cheap!
In 2009 Craig Gentry's construction of the first fully homomorphic encryption (FHE) scheme (i.e. a scheme which supports unlimited computation on encrypted data) was hailed as a major theoretical breakthrough due to its obvious implications for secure computation in untrusted clouds, but could such a scheme ever be economical for anyone to deploy? Much of Sion's research has focused on answering questions like these in a far more concrete fashion than many cryptographers and cloud advocates would normally dare, and the conclusions make fascinating reading. For example, http://www.cs.sunysb.edu/~sion/research/sion2010wpes-pcost.pdf analyses the cost (again, in picocents) of a variety of cryptographic schemes, in settings ranging from home users up to large data centres such as those behind the Amazon cloud. Figures are calculated using the ECRYPT benchmarking results (http://bench.cr.yp.to/) for AES and RSA encryption and for DSA and ECDSA signatures. The authors also give figures for transferring data into the cloud and for storage costs.

Armed with these results they analyse various outsourcing scenarios: simple storage such as that offered by Amazon S3, searching on encrypted data, and secure SQL queries in the cloud. Suppose a user wants to store (but not compute on) data held in the cloud. Based on their figures, outsourced storage can be upwards of two orders of magnitude more expensive than local storage, even in the absence of any security assurances. Searching on encrypted data turns out to be economical in the cloud only if the returned result is less than 36 bytes per query (and this doesn't take into account the cost of TCP overheads), and similarly damning conclusions are drawn for SQL queries based on current schemes.
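To make the break-even question concrete, here is a toy back-of-envelope sketch in the spirit of Sion's analysis. The per-cycle prices are hypothetical placeholders (not figures from the paper), and transfer and storage costs are ignored.

```python
# Toy break-even calculation in the spirit of the picocent analysis.
# The prices below are HYPOTHETICAL placeholders, not the paper's figures.

LOCAL_PICOCENTS_PER_CYCLE = 100.0  # hypothetical in-house cost per cycle
CLOUD_PICOCENTS_PER_CYCLE = 20.0   # hypothetical cloud cost per cycle

def outsourced_cost(cycles, crypto_overhead):
    """Picocent cost of a job run in the cloud, where cryptography
    inflates the cycle count by a multiplicative overhead factor."""
    return cycles * crypto_overhead * CLOUD_PICOCENTS_PER_CYCLE

def break_even_overhead():
    """Largest crypto overhead factor at which the cloud still wins:
    cycles * k * p_cloud <= cycles * p_local  =>  k <= p_local / p_cloud."""
    return LOCAL_PICOCENTS_PER_CYCLE / CLOUD_PICOCENTS_PER_CYCLE

job = 10 ** 12                          # a trillion-cycle job
print(outsourced_cost(job, 1.0))        # outsourcing without cryptography
print(break_even_overhead())            # with these prices: a 5x crypto budget
```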
Perhaps ironically, these conclusions don't imply that a highly computationally expensive scheme such as FHE is economically unviable. We can envisage a scenario where a home user simply doesn't have the necessary computational power on her desktop but requires some computation, say a simple private information retrieval (PIR) query, on a remote database. For example, consider doctors requiring information on patients from a health database held remotely. It may be cheaper for the corresponding health service to provide PIR to its doctors, paying the extra overhead that comes with outsourcing, than to maintain its own IT infrastructure, while still enjoying the security guarantees provided by cryptographic schemes.
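As a flavour of what "a simple private information retrieval" can mean, here is a minimal sketch of the classic two-server XOR-based PIR trick (purely illustrative, not a scheme from Sion's paper): each server sees only a uniformly random subset of indices, yet the XOR of the two answers is exactly the requested record, assuming the servers do not collude.

```python
# Minimal two-server XOR-based PIR sketch (illustrative only). Both
# servers hold the same database of equal-length records; they must not
# collude, since together the two query sets reveal the target index.
import secrets

db = [b"rec0", b"rec1", b"rec2", b"rec3"]   # replicated on both servers

def server_answer(indices):
    """A server XORs together the records at the requested indices."""
    out = bytes(len(db[0]))
    for i in indices:
        out = bytes(a ^ b for a, b in zip(out, db[i]))
    return out

def retrieve(target):
    s1 = {i for i in range(len(db)) if secrets.randbelow(2)}  # random subset
    s2 = s1 ^ {target}              # same subset, with the target flipped
    a1, a2 = server_answer(s1), server_answer(s2)
    # Every record except db[target] appears in both answers and cancels.
    return bytes(a ^ b for a, b in zip(a1, a2))

assert retrieve(2) == db[2]
```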
We conclude that whilst outsourcing looks mostly uneconomical when its cost is measured purely in terms of CPU cycles, there are scenario-specific cases where it pays off. Businesses and home users alike will have to take a serious look at whether this is the case for them, and not simply be taken in by the frenzied marketing surrounding "the cloud".
Thursday, March 17, 2011
Cryptography and Security in Clouds (Zurich March 15-16 2011)
"The Cloud" is supposed to let you run a virtual machine and abstract away all details like what physical processors or disks lie beneath. But we can't hope our adversaries will be nice and stick to this ...
In the first talk we saw some examples of what can go wrong from a security perspective when you ignore what lies beneath. If two virtual machines are executed on the same physical machine, we call them co-resident, and if our adversary can get his VM co-resident with ours, we may be in trouble. If he can take over the physical machine, for example by exploiting a bug in the hypervisor (the software layer that runs the VMs), he gains complete control of our VM. Even if he's confined to his own VM, he can measure the latency of operations and possibly gain a side channel from the fact that his code uses the same physical caches as ours. Using Amazon's cloud as an example, the speakers showed that it's surprisingly easy to get co-resident with a target of your choice and to detect whether this was successful.
Another nice feature of VMs is that they can be suspended to a disk image and restarted. A less nice feature is that the same image can be restarted several times and given a different query each time, while reusing the same randomness/PRG state. Practical examples were presented: our favourite browsers seed the PRG for TLS and co. when they start up. In other words, we can actually apply the "forking lemma" in practice! For a server VM, we saw practical examples of stealing the TLS master key this way.
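As a hypothetical sketch of the failure mode (not the attack demonstrated at the workshop): if the PRG is seeded once at boot and the machine is then snapshotted, every resumption replays the same "random" stream, so two different queries can end up paired with identical nonces.

```python
# Sketch: a PRG seeded once at "boot" and then snapshotted yields the
# same "randomness" on every resume. (Hypothetical illustration.)
import random

class VMImage:
    def __init__(self):
        self.rng = random.Random(0xC0FFEE)        # seeded once, at boot

    def resume(self):
        clone = VMImage()
        clone.rng.setstate(self.rng.getstate())   # snapshot restores PRG state
        return clone

image = VMImage()
nonce1 = image.resume().rng.getrandbits(128)  # "nonce" in the first resumed run
nonce2 = image.resume().rng.getrandbits(128)  # "nonce" in the second resumed run
assert nonce1 == nonce2   # same state, same randomness: with DSA/ECDSA-style
                          # signatures, repeating a nonce leaks the signing key
```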
Another security issue arises in cloud-based backup systems like Dropbox. They use a trick called deduplication: if two users back up the same file, it is stored only once, and the second user doesn't even have to upload it again (the service can compare file hashes). It's pretty much the equivalent of UNIX hard links. It's estimated that deduplication saves 95-99% of space for online backup systems, but we're still talking about petabytes.
Deduplication gives anyone an oracle to test whether a file has been backed up before, and to brute-force which version of a file has been backed up if only a small portion of it is unknown. Another problem is that knowing the hash of someone's file allows you to retrieve it: just pretend you have a file with that hash, and deduplication means you get access to it without any further questions. As an aside, Dropbox claims to encrypt all files with "military-grade encryption" (AES is mentioned too), but for deduplication to work at all, it seems they must encrypt everyone's files with the same key.
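A minimal sketch of hash-based client-side deduplication and the oracle it creates (a simplification for illustration, not Dropbox's actual protocol): the server indexes files by hash, so anyone who can guess or learn a hash can both test for and fetch the file.

```python
# Simplified client-side deduplication: files are indexed by their hash,
# so the hash alone acts as both an existence oracle and a capability.
import hashlib

class DedupServer:
    def __init__(self):
        self.store = {}

    def has(self, digest):       # the oracle: has anyone uploaded this file?
        return digest in self.store

    def upload(self, data):
        self.store[hashlib.sha256(data).hexdigest()] = data

    def fetch(self, digest):     # knowing the hash suffices to retrieve
        return self.store[digest]

server = DedupServer()
server.upload(b"salary: 52000")              # the victim's backed-up file

# Brute-force a file that is known except for one small field:
for guess in range(0, 200000, 1000):
    digest = hashlib.sha256(b"salary: %d" % guess).hexdigest()
    if server.has(digest):
        print("recovered:", server.fetch(digest))
        break
```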
With problems like these out there, what can we do? Two main approaches were presented, both dealing with the fact that the client cannot fully trust the cloud provider. The first is based on secure hardware (mostly TPMs), and several talks gave scenarios for using hardware-based trust. The second approach is software-based cryptography, mostly some variation of secure multi-party computation (MPC), although we heard some good arguments why general MPC won't make it into the real world. However, special cases (voting and linear programming were mentioned) offer good opportunities for efficient MPC-like constructions. The principle of using two (or more) clouds also appeared in several talks.
Wednesday, March 9, 2011
More on PKC 2011
(Written by Georg)
Since Matthew Green from Johns Hopkins University was unable to come to PKC, I was asked to give the talk for him and I agreed. His paper is on Secure Blind Decryption, a cryptographic primitive which extends a public-key encryption scheme by the following functionality: a User holding a ciphertext and a Decryptor holding the decryption key run a protocol, upon which the User learns the encrypted message. Security demands, on the one hand, that the User learn nothing more than the message and, on the other, that the Decryptor cannot tell which ciphertext the User asked to have decrypted.
This primitive has many applications from Oblivious Transfer (with additional properties) to Private Information Retrieval. A practical motivation is that when outsourcing data to an (untrusted) "cloud", not only must the data be protected, but it should also remain hidden which data is accessed. Examples include medical records or patents, where merely the information of which record is accessed can reveal much about a patient's status or a company's intentions.
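As a toy illustration of the interface (textbook ElGamal with multiplicative blinding over a tiny group; emphatically not the paper's construction, and not CCA-secure): the User re-randomises the first ciphertext component, so the Decryptor only ever raises a uniformly random group element to its secret key.

```python
# Blind decryption for textbook ElGamal (toy parameters, illustration only).
import secrets

p, q, g = 467, 233, 4          # subgroup of prime order q in Z_p*, p = 2q+1

x = secrets.randbelow(q - 1) + 1     # Decryptor's secret key
y = pow(g, x, p)                     # corresponding public key

def encrypt(m):                      # m must lie in the order-q subgroup
    r = secrets.randbelow(q - 1) + 1
    return pow(g, r, p), (m * pow(y, r, p)) % p

def blind(c1):                       # User: re-randomise with a fresh s
    s = secrets.randbelow(q - 1) + 1
    return (c1 * pow(g, s, p)) % p, s

def decrypt_blinded(c1_blinded):     # Decryptor: sees only a random element
    return pow(c1_blinded, x, p)

def unblind(c2, d, s):               # User: strip the blinding, recover m
    return (c2 * pow(y, s, p) * pow(d, -1, p)) % p

m = pow(g, 42, p)                    # encode the message in the subgroup
c1, c2 = encrypt(m)
c1_blinded, s = blind(c1)
d = decrypt_blinded(c1_blinded)
assert unblind(c2, d, s) == m        # correctness of the blinding trick
```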
The author constructs the first CCA-secure scheme with a blind decryption protocol by applying the CHK (Canetti-Halevi-Katz) transform to a variant of a tag-based Cramer-Shoup-type cryptosystem based on DLIN, together with a new F-unforgeable one-time signature.
Another highlight of the day was Vinod Vaikuntanathan's invited talk on leakage-resilient cryptography. Whereas traditional cryptographic models assume that certain values (such as decryption keys) are completely secret, this is not true in practice: implementations of cryptographic schemes might succumb to side-channel attacks, which reveal parts of the secrets. Leakage-resilient cryptography tries to model these attacks formally and to provide (provable) security against them.
Monday, March 7, 2011
Ciphertext Policy Attribute Based Encryption
Today at PKC 2011, Hakan Seyalioglu gave a talk on Brent Waters's paper "Ciphertext-Policy Attribute-Based Encryption: An Expressive, Efficient, and Provably Secure Realization".
First, let us consider what attribute-based encryption is and why it may be useful. In standard public-key cryptography, a file is encrypted under a user's public key. The corresponding secret key (and that key alone) can then be used to decrypt the ciphertext. Now, assume users each have various attributes associated with them. For example, Alice may be in a group called "internal affairs", she is female, and she is based in the USA office of her organisation. Thus we assign her the attributes "internal affairs", "female" and "USA". If Bob wants to encrypt a document so it can be decrypted by everyone who is a member of the "internal affairs" group, he could create an encryption of the document for every user in this group using their public keys. However, what if Bob does not know who is in the group? What if users are added to the group at a later time? In this situation we cannot use standard public-key cryptography, so we turn to attribute-based encryption (ABE).
In ABE, a key authority is a trusted party who generates keys for users within a system. The key authority has a master secret key (MSK) and a public key (PK). For each user in the system, the key authority uses the MSK to generate a secret key SK based on that user's attributes, and each user is given their corresponding SK. Now, when a user wants to encrypt a document, they construct a policy for it. The policy specifies which attributes are required to decrypt the document, for example ("internal affairs" OR ("female" AND "Canada")). Given the constructed policy and the PK (of the system's key authority), documents can then be encrypted and distributed to everyone, but decrypted only by users whose attributes satisfy the policy attached to the ciphertext.
Note that given the policy ("internal affairs" AND "female" AND "Canada"), neither Bob (who has the attributes "male" and "Canada") nor Alice should be able to decrypt documents carrying this policy. Together, however, they meet the criteria, so crucially we must not allow collusion between users to enable decryption: only a user who meets the policy alone should be able to decrypt the document, as the sketch below illustrates.
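The following sketch captures the policy semantics only; there is no cryptography here, and a real CP-ABE scheme has to enforce exactly this predicate while preventing users from pooling their keys.

```python
# Policy satisfaction for CP-ABE, as plain boolean-formula evaluation.
# A real scheme enforces this cryptographically against colluding users.

def satisfies(policy, attrs):
    op = policy[0]
    if op == "attr":
        return policy[1] in attrs
    if op == "and":
        return all(satisfies(sub, attrs) for sub in policy[1:])
    if op == "or":
        return any(satisfies(sub, attrs) for sub in policy[1:])
    raise ValueError("unknown operator: %s" % op)

policy = ("and", ("attr", "internal affairs"),
                 ("attr", "female"),
                 ("attr", "Canada"))

alice = {"internal affairs", "female", "USA"}
bob   = {"male", "Canada"}

assert not satisfies(policy, alice)     # Alice alone cannot decrypt
assert not satisfies(policy, bob)       # neither can Bob
assert satisfies(policy, alice | bob)   # pooled attributes would match, which
                                        # collusion resistance must prevent
```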
Waters introduces new ciphertext-policy (as opposed to key-policy) ABE constructions. When one constructs a new scheme, its security must be considered; formally, one produces a proof by reduction to some problem assumed to be hard. One consideration when constructing a new scheme is therefore which hard problem to reduce to, since ultimately security depends on whether that problem really is hard. Waters therefore gives several different constructions, each reducing to a different problem. One scheme has ciphertext size O(n), private key size O(A) and encryption time O(n), where n is the size of the access formula (i.e. the size of the policy) and A is the number of attributes in a user's key. The other schemes have worse complexities but reduce to different (harder) problems. The paper nicely demonstrates the trade-off between the assumptions required and the efficiency of the resulting scheme.