How do we guarantee confidentiality and integrity of messages over the network? (ft. symmetric key, public key, hash, digital signature, certification authority)
Networks are always vulnerable to attacks. In particular, packet sniffing means that whatever is transferred across the internet can be read, bit by bit. So rather than pretending that packets never leak, network engineers have come up with several very robust, mathematical solutions: even if you know exactly which bits are being transferred, you cannot decrypt them unless you are the intended receiver. There are probably many questions that spring to mind, so rather than keeping this in the abstract domain of encryption/decryption, let's put it in more concrete terms.
Confidentiality
Confidentiality means making sure that only the sender and receiver know the meaning of the bits that are sent across the network. There are two ways to achieve this:
1. Symmetric key system (same shared key s)
Example that illustrates a symmetric key system: suppose A and B agree on a one-to-one mapping from letters to letters (a -> b, b -> f, g -> w, ...). Then A 'encrypts' the message using this mapping (the mapping is the 'key' here), and B 'decrypts' the message using the same mapping (key). Since A and B use the same key, it's a symmetric key system.
Note that here: Mapping == key, Encryption method == encryption algorithm == cipher
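Symmetric key systems evolved like: Caesar cipher (using a fixed alphabetical offset for the mapping) -> monoalphabetic cipher (substitution via a randomised mapping) -> polyalphabetic encryption (several monoalphabetic mappings, where the position of a letter in the message decides which mapping is used)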
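To make 'mapping == key' concrete, here is a minimal Python sketch of the first two ciphers. The function names (caesar_key, monoalphabetic_key, encrypt, decrypt) and the seeded shuffle are my own choices for illustration, not part of any standard.

```python
import random
import string

ALPHABET = string.ascii_lowercase

def caesar_key(offset: int) -> dict:
    """Key for a Caesar cipher: every letter shifts by the same offset."""
    shifted = ALPHABET[offset:] + ALPHABET[:offset]
    return dict(zip(ALPHABET, shifted))

def monoalphabetic_key(seed: int) -> dict:
    """Key for a monoalphabetic cipher: a shuffled one-to-one letter mapping."""
    letters = list(ALPHABET)
    random.Random(seed).shuffle(letters)  # seeded only so the demo is repeatable
    return dict(zip(ALPHABET, letters))

def encrypt(message: str, key: dict) -> str:
    # Characters outside the alphabet (spaces, digits) pass through unchanged.
    return "".join(key.get(ch, ch) for ch in message)

def decrypt(ciphertext: str, key: dict) -> str:
    inverse = {out: inp for inp, out in key.items()}  # same key, mapping reversed
    return "".join(inverse.get(ch, ch) for ch in ciphertext)

print(encrypt("hello", caesar_key(3)))  # khoor

key = monoalphabetic_key(seed=42)
print(decrypt(encrypt("meet me at noon", key), key))  # meet me at noon
```

Both ciphers share the same encrypt/decrypt code; only the way the key (mapping) is built differs.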
Symmetric encryption techniques (ciphers):
- stream cipher (encrypts the message one bit or byte at a time, typically by combining it with a key-derived keystream; we don't go into details about this)
- block cipher (used for PGP (email), SSL (TCP), IPsec (network layer))
Block cipher:
- Key: a one-to-one mapping of k-bit blocks to other k-bit blocks (k = 64 usually; there are (2^k)! possible mappings)
- It works by splitting the message into k-bit chunks, then applying the mapping to each chunk.
- Why is this symmetric? If the key (mapping) is known, then both encryption and decryption are possible.
- Examples: DES, SDES, AES (best) => rather than storing a full table with 2^k entries, these use functions that generate the mapping from a much shorter key
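The table-based view of a block cipher can be sketched with a toy block size of k = 3 bits. The particular table below is just one arbitrary one-to-one mapping picked for illustration, and the helper names are mine; there is no padding, so the message length must be a multiple of k.

```python
from math import factorial

K = 3  # block size in bits (toy value; the notes above use k = 64)

# The key: a one-to-one mapping from every 3-bit block to another 3-bit block.
KEY = {
    "000": "110", "001": "111", "010": "101", "011": "100",
    "100": "011", "101": "010", "110": "000", "111": "001",
}

def apply_mapping(bits: str, mapping: dict) -> str:
    """Split bits into K-bit chunks and map each chunk independently."""
    if len(bits) % K != 0:
        raise ValueError("message length must be a multiple of K (no padding here)")
    return "".join(mapping[bits[i:i + K]] for i in range(0, len(bits), K))

def encrypt(bits: str) -> str:
    return apply_mapping(bits, KEY)

def decrypt(bits: str) -> str:
    inverse = {out: inp for inp, out in KEY.items()}  # same key, read backwards
    return apply_mapping(bits, inverse)

ciphertext = encrypt("010110001111")
print(ciphertext)           # 101000111001
print(decrypt(ciphertext))  # 010110001111
print(factorial(2 ** K))    # 40320 possible mappings for k = 3
```

Even for k = 3 there are 8! = 40320 possible keys; for k = 64 the table itself would have 2^64 rows, which is why it cannot be stored directly.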
How is the symmetric key distributed? One answer is the Diffie-Hellman algorithm from the 1970s, which lets two parties agree on a shared key over a public channel.
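These notes don't go into how Diffie-Hellman works, so take the following as a rough sketch of the standard textbook idea with tiny numbers. All values (p = 23, g = 5, the private numbers 4 and 3) are toy assumptions; real deployments use very large primes.

```python
# Public values that everyone, including an eavesdropper, may know.
p, g = 23, 5  # toy prime modulus and base

# Private values, never sent over the network (hypothetical choices).
a = 4  # Alice's secret
b = 3  # Bob's secret

# Each side sends only g^secret mod p.
A = pow(g, a, p)  # Alice -> Bob: 4
B = pow(g, b, p)  # Bob -> Alice: 10

# Each side combines the other's public value with its own secret.
key_alice = pow(B, a, p)
key_bob = pow(A, b, p)

print(key_alice, key_bob)  # 18 18
```

Both sides end up with the same number without ever transmitting it, and that number can then serve as the symmetric key s.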
2. Public key system (K_a != K_b, one is secret and one is public)
Public key encryption (public key cipher) is most commonly done through the RSA algorithm.
RSA does two things for us:
1. choosing public/private key
To do that, we first need to choose some numbers:
- p, q are large primes; n = p*q and z = (p-1)*(q-1).
- e (for encryption) is a number smaller than n that has no common factors with z; d (for decryption) is chosen so that e*d mod z = 1.
public key: (n,e)
private key: (n,d)
2. choosing encryption/decryption algorithm
say the message to be sent is m, treated as a number (m < n)
Encryption (ciphertext, c): c = m^e mod n
Decryption: m = c^d mod n
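Here is a minimal sketch of both RSA steps with deliberately tiny primes so the numbers stay readable. The values p = 61, q = 53, e = 17 are toy assumptions; real keys use primes hundreds of digits long, plus padding that is left out here. The three-argument pow with a negative exponent needs Python 3.8 or later.

```python
# Step 1: choose the keys.
p, q = 61, 53            # toy primes
n = p * q                # 3233
z = (p - 1) * (q - 1)    # 3120
e = 17                   # no common factors with z
d = pow(e, -1, z)        # modular inverse of e, so that e*d mod z == 1

public_key = (n, e)
private_key = (n, d)

# Step 2: encrypt and decrypt.
def encrypt(m: int) -> int:
    if not 0 <= m < n:
        raise ValueError("message must be a number smaller than n")
    return pow(m, e, n)  # c = m^e mod n

def decrypt(c: int) -> int:
    return pow(c, d, n)  # m = c^d mod n

c = encrypt(65)
print(d)           # 2753
print(c)           # 2790
print(decrypt(c))  # 65
```

Anyone holding (n, e) can run encrypt, but only the holder of d can run decrypt.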
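RSA is slow: use both public and symmetric key system ('session key method')
- A symmetric session key is generated (picked at random by the sender, or agreed on through Diffie-Hellman)
- The sender encrypts the session key with the receiver's RSA public key and sends it; the receiver recovers it with their private key. The actual data is then encrypted with the fast symmetric cipher using that session key, so RSA only ever touches the short key.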
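One way to picture the session key method, as a sketch only: RSA protects a short random session key, and a symmetric cipher protects the bulk data. Everything here is an assumption for illustration: the toy RSA key is built from two small known primes, and symmetric_xor is a stand-in hash-based XOR cipher rather than a real one like AES.

```python
import hashlib
import secrets

# Bob's toy RSA key pair (two small known primes; far too small for real use).
p, q = 2**61 - 1, 2**31 - 1
n, z = p * q, (p - 1) * (q - 1)
e = 65537
d = pow(e, -1, z)  # requires Python 3.8+

def symmetric_xor(session_key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a keystream derived from the key.
    The same call both encrypts and decrypts."""
    key_bytes = session_key.to_bytes(16, "big")
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key_bytes + offset.to_bytes(8, "big")).digest()
        out.extend(x ^ y for x, y in zip(data[offset:offset + 32], pad))
    return bytes(out)

def alice_send(message: bytes):
    session_key = secrets.randbelow(n)                # fresh key for this session
    wrapped_key = pow(session_key, e, n)              # slow RSA, but only on the short key
    ciphertext = symmetric_xor(session_key, message)  # fast cipher on the bulk data
    return wrapped_key, ciphertext

def bob_receive(wrapped_key: int, ciphertext: bytes) -> bytes:
    session_key = pow(wrapped_key, d, n)              # only Bob's private key recovers it
    return symmetric_xor(session_key, ciphertext)

message = b"a long message " * 10
wrapped, ct = alice_send(message)
print(bob_receive(wrapped, ct) == message)  # True
```

The cost of RSA is paid once per session, no matter how long the message is.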
Message integrity
Message integrity is a concern separate from encryption. It took me a bit to understand why, but it's pretty simple. The sender and receiver might not mind their conversation being overheard, but the receiver still wants to be sure that what the sender sent has not been tampered with (changed), and that the message actually comes from the sender (origin guarantee). This can in fact be achieved without encryption! The purpose of encryption is to make sure no one else understands the conversation; in a public key system it says nothing about origin, since anyone can encrypt with the public key (a symmetric key system does hint at origin, as long as only the two parties hold the key).
Before further discussion, we need to understand hash functions:
A hash function is a function that takes an input of any size and outputs a fixed-size byte string called a hash, or digest.
Notation: H(m), where H() is a hash function, m is a message, H(m) is a hash/digest.
Important property: it is computationally infeasible to find two different messages x and y such that H(x) == H(y). (Collisions must exist, because there are more possible messages than digests; they just have to be practically impossible to find.)
Hash function examples: MD5 (128-bit hash), SHA-1 (160-bit hash -> more secure). Any hash function H() is publicly available, so anyone can use it.
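A quick way to see the 'fixed size' part is Python's standard hashlib module. The helper name digest_bits is mine; the sketch assumes MD5 and SHA-1 are enabled in your Python build, which is the usual case.

```python
import hashlib

def digest_bits(algorithm: str, message: bytes) -> int:
    """Length of H(message) in bits for the named hash function."""
    return len(hashlib.new(algorithm, message).digest()) * 8

short_msg = b"a"
long_msg = b"a" * 1_000_000

# The digest size depends only on the hash function, not on the input size.
print(digest_bits("md5", short_msg), digest_bits("md5", long_msg))    # 128 128
print(digest_bits("sha1", short_msg), digest_bits("sha1", long_msg))  # 160 160

# A one-character change in the message gives a different digest.
print(hashlib.sha1(b"hello").digest() != hashlib.sha1(b"hellp").digest())  # True
```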
Message integrity requires two things:
1. Not tampered in delivery
Alice can send [m, H(m)] to Bob
- Alice hashes m and gets H(m). She sends both m and H(m) to Bob.
- Bob hashes the received m himself and gets H'(m). If H(m) == H'(m), m wasn't changed in delivery.
=> this does not guarantee that the message originated from Alice, because anyone could have used H() to send a pair [m, H(m)]. For the same reason, it only catches changes where the digest was not replaced as well: it detects corruption in delivery, but not an attacker who swaps both m and H(m).
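The [m, H(m)] exchange can be sketched in a few lines. The names alice_send and bob_check and the example messages are made up for illustration; SHA-1 is used only because it is one of the hash functions mentioned above.

```python
import hashlib

def H(m: bytes) -> bytes:
    return hashlib.sha1(m).digest()

def alice_send(m: bytes):
    return m, H(m)  # the pair [m, H(m)]

def bob_check(m: bytes, digest: bytes) -> bool:
    return H(m) == digest  # Bob recomputes the hash himself

m, digest = alice_send(b"pay Bob 10")
print(bob_check(m, digest))              # True: arrived unchanged
print(bob_check(b"pay Bob 99", digest))  # False: m changed, digest did not

# H() is public, so anyone can build a pair that passes the check.
forged_m, forged_digest = alice_send(b"pay Eve 10")
print(bob_check(forged_m, forged_digest))  # True, yet Alice never sent it
```

The last line is the origin problem: a matching hash says nothing about who produced the pair.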
2. Message originated from the sender
Alice can send [m, H(m+s)] to Bob
- Alice and Bob both have a symmetric, shared key 's'. (The H(m+s) part is usually referred to as a MAC (message authentication code))*
- Alice hashes m+s and sends both m and H(m+s).
- Bob hashes m+s himself and gets H'(m+s). If H(m+s) == H'(m+s), then we can infer that m probably originated from Alice. (probably: only if Alice and Bob are the only ones who know the key)
=> however, this symmetric shared key, s, is not unique to Alice. s can be shared by many people (Bob himself knows it), so we cannot guarantee that a message came from Alice.
The solution to this is a 'Digital Signature': we use a public key system instead of a symmetric key system. (The MAC above plays a similar role, but only among the holders of s.)
* note we didn't need any encryption to produce the MAC. Also, a hash isn't really encryption, so we can see that message integrity is independent of encryption.
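A minimal sketch of the [m, H(m+s)] scheme exactly as described above, where m+s means the key appended to the message. The function names and the example key are assumptions. As far as I know, real protocols use the HMAC construction (Python's hmac module) rather than a bare H(m+s); here hmac is only borrowed for its constant-time comparison.

```python
import hashlib
import hmac

def mac(m: bytes, s: bytes) -> bytes:
    """MAC as in the notes: hash of the message with the shared key appended."""
    return hashlib.sha1(m + s).digest()

def alice_send(m: bytes, s: bytes):
    return m, mac(m, s)  # the pair [m, H(m+s)]; s itself is never sent

def bob_check(m: bytes, tag: bytes, s: bytes) -> bool:
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(mac(m, s), tag)

shared_key = b"key only Alice and Bob know"  # hypothetical shared secret
m, tag = alice_send(b"pay Bob 10", shared_key)

print(bob_check(m, tag, shared_key))              # True
print(bob_check(b"pay Bob 99", tag, shared_key))  # False: message changed

# Without s, an attacker cannot produce a tag that Bob accepts.
forged_m, forged_tag = alice_send(b"pay Eve 10", b"a guessed key")
print(bob_check(forged_m, forged_tag, shared_key))  # False
```

Nothing here is encrypted: m travels in the clear, and only the tag depends on the secret.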
Digital Signature
Digital Signature: privKey(H(m))
Properties: privKey is unique to one person (unless the person shared it).
Usage: Alice can send [m, privKeyOfAlice(H(m))] to Bob
- Alice hashes the message and encrypts the hash using her private key (why encrypt the hash and not the message? because RSA is computationally heavy, and RSA on the fixed number of bits produced by the hash is much faster)
- Bob uses Alice's public key to retrieve H(m), hashes the received m himself to get H'(m), and checks that H'(m) == H(m)
=> this can guarantee origin of the message... but how do we know the public key that Bob uses is actually from Alice? Certification Authority.
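The signing and checking steps can be sketched with a toy RSA key. Assumptions to keep in mind: the key is built from two small known primes, SHA-256 stands in for H(), and the digest is simply reduced modulo n so that it fits the toy key. Real signature schemes use proper padding instead of that shortcut.

```python
import hashlib

# Alice's toy RSA key pair (far too small for real use).
p, q = 2**61 - 1, 2**31 - 1
n, z = p * q, (p - 1) * (q - 1)
e = 65537          # public exponent: (n, e) is Alice's public key
d = pow(e, -1, z)  # private exponent: (n, d) is Alice's private key (Python 3.8+)

def hash_as_int(m: bytes) -> int:
    # Shortcut for the sketch: squeeze the digest into the range of the toy key.
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n

def sign(m: bytes) -> int:
    """privKey(H(m)): only the holder of d can compute this."""
    return pow(hash_as_int(m), d, n)

def verify(m: bytes, signature: int) -> bool:
    """Recover H(m) with the public key and compare with Bob's own H'(m)."""
    return pow(signature, e, n) == hash_as_int(m)

m = b"pay Bob 10"
signature = sign(m)

print(verify(m, signature))              # True
print(verify(b"pay Bob 99", signature))  # False: message was changed
```

Unlike the shared key s of a MAC, d is held by Alice alone, so a valid signature points to her specifically.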
Certification Authority
A Certification Authority (CA) creates a certificate that binds Alice's identity to her public key: the certificate contains Alice's identity and public key, signed (encrypted) with the CA's private key. If Alice sends Bob her certificate, Bob will:
- use the public key of the CA to check that the certificate really came from the CA
=> note that public keys of CAs are available in browsers by default.
- then retrieve the public key of Alice embedded in the certificate.
Now, Bob can trust that this public key actually belongs to Alice, as much as he trusts that the CA went through a rigorous process to check that the public key belongs to Alice (this involves document checking, phone number checking ...).
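As a rough sketch of the certificate check, a certificate can be modelled as a dictionary holding a name, a public key and the CA's signature over both. This is only an illustration of the idea: the dictionary layout, the helper names and the toy RSA keys are all assumptions, and real certificates follow the X.509 format.

```python
import hashlib

def make_keypair(p: int, q: int, e: int = 65537):
    """Toy RSA key pair from two primes: returns (public, private)."""
    n, z = p * q, (p - 1) * (q - 1)
    return (n, e), (n, pow(e, -1, z))  # Python 3.8+

def hash_as_int(data: bytes, n: int) -> int:
    # Shortcut for the sketch: reduce the digest so it fits the toy key.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data: bytes, private_key) -> int:
    n, d = private_key
    return pow(hash_as_int(data, n), d, n)

def verify(data: bytes, signature: int, public_key) -> bool:
    n, e = public_key
    return pow(signature, e, n) == hash_as_int(data, n)

def cert_body(name: str, public_key) -> bytes:
    return f"{name}|{public_key[0]}|{public_key[1]}".encode()

ca_public, ca_private = make_keypair(2**61 - 1, 2**31 - 1)
alice_public, _alice_private = make_keypair(2**19 - 1, 2**17 - 1)

def ca_issue(name: str, public_key) -> dict:
    """The CA signs the (name, public key) pair with its private key."""
    signature = sign(cert_body(name, public_key), ca_private)
    return {"name": name, "public_key": public_key, "signature": signature}

def bob_extract_key(cert: dict):
    """Bob trusts the embedded key only if the CA's signature checks out."""
    body = cert_body(cert["name"], cert["public_key"])
    return cert["public_key"] if verify(body, cert["signature"], ca_public) else None

cert = ca_issue("Alice", alice_public)
print(bob_extract_key(cert) == alice_public)  # True

# Eve swaps in her own key but cannot redo the CA's signature.
eve_public = (8191 * 131071, 65537)
tampered = dict(cert, public_key=eve_public)
print(bob_extract_key(tampered))  # None
```

Bob only needs ca_public in advance, which matches the point above about CA public keys shipping with browsers.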