Cryptography Lab: RSA Encryption and Decryption Lab Objectives: After this lab, the students should be able to Explain the simple concepts of encryption and decryption to protect information in transmission. Explain the concepts of cryptography, symmetric key, public keys and private keys used in practice. Write programs that encrypt and decrypt data files using RSA cryptography algorithm. Definitions: Public key encryption is a method where two keys are generated, one to encrypt the message and another to decrypt the message. The encryption key is available to everyone. That is anyone can generate an encrypted message for a specific receiver. However, the decrypt key is kept secret. Only the holders of the decryption key can un-encrypt the cipher text. The RSA encryption algorithm was first publicly described by Ron Rivest, Adi Shamir and Leonard Aldeman in 1978. RSA is a popular public key encryption algorithm. It uses two secret prime numbers and properties of modulus arithmetic to generate both the public and private keys. A nice outline of the RSA algorithm and implementation can be found at: http://www.di- mgt.com.au/rsa_alg.html More generally, the public key consists of two values: (e, n) where the plain text message, m, is encrypted (cipher text c) via the following formula: c=me mod n The private key consists of two values (d,n), where the encrypted text c is decrypted by the following formula m= cd mod n These algorithms is based on the theorems of modulus arithmetic. Outline of work: You are to create 3 programs: 1. Key generation program Input: ?? Output: integers : public key, integers : private key 2. Encryption program Input: integers : public key string : plain text message Output string: encrypted message 3. Decryption program Input: integers : private key string : encrypted message Output: string: plain text message In your write-up, just be clear on what I need to do to compile and run these. Key generation program Generation of the keys is imitated with the selection of two large prime numbers, p and q. A requirement can sometimes be that the product of these numbers is 1024 bits long. Obviously, the longer the numbers the more difficult it will be to break this encryption. A suggestion is to pick two different primes between small ranges, like 137 – 311. Randomly pick a number, test for prime, increase the number if it is not prime. The values of the algorithm are built from these two prime numbers. n is the product of these two numbers. n = pq e is built as follows: o z is the product of one less than each of these two numbers. z = (p-1)(q-1) o Choose e such that 1 < e < z And gcd(e,z) = 1 Note from David Ireland (above link): In practice, common choices for e are 3, 17 and 65537 (216+1). The2xse are Fermat primes, sometimes referred to as F0, F2 and F4 respectively (Fx=22 x +1). They are chosen because they make the modular exponentiation operation faster. Also, having chosen e, it is simpler to test whether gcd(e, p-1)=1 and gcd(e, q-1)=1 while generating and testing the primes in step 1. Values of p or q that fail this test can be rejected there and then. (Even better: if e is prime and greater than 2 then you can do the less-expensive test (p mod e)!=1 instead of gcd(p-1,e)==1.) Choose d such that 1 < d < z And ed mod z = 1 Note from David Ireland (above link): To compute the value for d, use the Extended Euclidean Algorithm to calculate d = e-1 mod phi, also written d = (1/e) mod phi. This is known as modular inversion. Note that this is not integer division. The modular inverse d is defined as the integer value such that ed = 1 mod phi. It only exists if e and phi have no common factors. PUBLIC KEY : (e, n) PRIVATE KEY : (d, n) Encryption program Procedure to encrypt and decipher: To encrypt a string message M, it is first converted into message blocks M1, M2, …, Mn (each block can consist of 1 to some number k of characters). Then each message block Mi is mapped to an integer mi (by an array for example). From mi we calculate the cipher text, ci, as described in Definitions section. The public key consists of two values: (e, n) where the plain text message, m, is encrypted (cipher text c) via the following formula: c=me mod n Decryption program Procedure to encrypt and decipher: To encrypt a string message M, it is first converted into message blocks M1, M2, …, Mn (each block can consist of 1 to some number k of characters). Then each message block Mi is mapped to an integer Pi (by an array for example). From Pi we calculate Cipheri as described in Definitions section: The public key consists of two values: (e, n) where the plain text message, m, is encrypted (cipher text c) via the following formula: c=me mod n NOTES: I have read where a block of two characters was used. Each character is assigned to a numerical value in an array : “A” 0; “B” 1, … “Z” 25, ‘a’ 26, …”z” 51, “ ” 52, “.”53 Each two character block now is represented by two integers x1 and x2. These two characters are then mapped to a intermediary value P where P = x1*100 + x2 The cipher text is calculated over P. Ex : “ID” is mapped I8, D3. P is now 803 Cipher text = 803e mod n I have also read where a block of three characters was used. The following is copied from Dr. Arjan Durresi, Louisiana State University ENCODING: Split the plaintext up into blocks of three letters (called trigraphs). Obtain a numeric representation for each letter based on its position in the alphabet (A→0, B→1, etc.). Compute a numeric code for each trigraph using the formula (First Letter Code) * 262 + (Second Letter Code) * 26 + (Third Letter code). For the mathematically inclined, this is interpreting each trigraph as a number in base twenty-six. Encipher each plaintext trigraph code by computing (Plaintext trigraph code)Public Key, dividing the result by the Modulus and taking the remainder. Convert each enciphered trigraph code into a quadragraph – a block of four letters – as follows: o Divide the code by 263. The quotient is the code for the first letter of the quadragraph. The spreadsheet uses the remainder to get codes for the other three letters. o Divide the remainder from the first step by 262. The quotient is the code for the second letter. The spreadsheet uses the remainder to get the codes for the other two letters. o Divide the remainder from the second step by 26. The quotient is the code for the third letter and the remainder is the code for the fourth letter. For the mathematically inclined, this quadragraph calculation determines the representation of the enciphered message as a four-digit number in base twenty-six (using the letters of the alphabet as our digits). DECODING Split the ciphertext up into quadragraphs (instead of trigraphs). Obtain the numeric representation for each letter and compute a numeric code for each trigraph using the formula (First Letter Code) * 263 + (Second Letter Code) * 262 + (Third Letter Code) * 26 + (Fourth Letter Code). Encipher each ciphertext quadragraph code by computing (Ciphertext quadragraph code)Private Key, dividing the result by the Modulus and taking the remainder Convert each deciphered quadragraph code into a trigraph. o Divide the code by 262. The quotient is the code for the first letter. o Divide the remainder from the first step by 26. The quotient will be the code for the second letter and the remainder the code for the third. HINT: It’s easiest to do this in Java. Use the class BigInteger() to handle the math. I don’t know the equivalent in C++ or C. There are some equivalent algorithms out there though.