CompTIA Security+ Chapter 6: Encryption and PKI
Table of contents
| Sections |
| --- |
| 6.1 Basic Concepts of Cryptography |
| 6.2 Basic Characteristics of Cryptographic Algorithms |
| 6.3 Wireless Security Settings |
| 6.4 Public Key Infrastructure (PKI) |
6.1 Basic Concepts of Cryptography
Going back to the times of old, when the internet was just starting out and protocols were being defined, we were only focused on establishing communication with the rest of the world; we had no real care for whether or not the bits we transmitted over the wire were encrypted. Those who really cared could just obfuscate their text however they liked - doing things like converting to binary, using simple ROT13 ciphers, etc.
By the way, a cipher is a recipe for how we jumble up text. The ROT13 cipher takes the alphabet and shifts it by 13 places, so A becomes N. Run the plaintext `hello` through it and it comes out as `uryyb`.
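A quick sketch of how trivially reversible this is, using Python's standard library (the `rot13` codec ships with CPython):

```python
import codecs

# ROT13 is its own inverse: shifting by 13 twice brings the text back.
print(codecs.encode("hello", "rot13"))  # uryyb
print(codecs.encode("uryyb", "rot13"))  # hello
```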
As more and more people started to realise that these ciphers were not only too weak, but that there were too many malicious actors on the scene (from foreign governments to script kiddies) who could have grabbed passwords with two hands tied behind their backs, standards came in. In 1977 we got our first proper encryption algorithm that was to become a standard - and so it was aptly named DES, for Data Encryption Standard.
The major shift in thinking was to stop relying on secret ciphers and to start using keys. Keys are the most fundamental part of most cryptographic algorithms.
With this innovation, cryptography became “PAIN”ful, haha.
- Privacy. Cryptography helps us keep personal information private: sensitive documents we wish to keep to ourselves can be encrypted, and the order, peace and trust between two or more people is greatly enhanced.
- Authentication. Remember that authentication is simply the alignment of two things: the credentials of a person that we have stored, and whatever someone enters while claiming to be one of our recognised people. If the two match, and we trust that the protocols in place are strong enough that the requester can only be that person, then we let them in. Cryptography is commonly used to hash a given person's password and check it against the one we have stored. Two reasons for this: firstly, it stops people listening on the wire from learning the password; secondly, if our DB gets broken into, the criminals can't read the passwords directly. However, they still hold the hashed data, so if they figure out the format and a way to use the hashed text in place of the password - as happens in replay and pass-the-hash attacks - this can be deadly. That trick only works when the backend server will accept the stored value as-is.
- Integrity. Cryptography allows us to hash a file's contents and spit out a fixed-size result. We can then include this result with the file we send off. The recipient hashes the file again and compares the two strings: if they match, the file was never altered and is deemed trustworthy; otherwise it has been tampered with in transit (see the sketch just after this list).
- Non-repudiation. When we partake in an exchange, like a bank transfer, each party provides the other with assurance of their status in the transaction. This is usually achieved with signatures, as they are authentic features of a particular party. A signature can be recorded with a higher authority, and so the signature a party provides can be checked.
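Here is a minimal sketch of the integrity check described above, using Python's `hashlib` (strictly a one-way hash rather than encryption; the filename and published digest are hypothetical):

```python
import hashlib

def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# The sender publishes the digest alongside the file; the recipient
# recomputes it and compares. Any tampering changes the digest.
# assert file_digest("report.pdf") == published_digest  # hypothetical values
```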
Where do we need cryptography?
The first case where cryptography was really used was mentioned above: DES came into play so that we could protect data in transit. Nowadays we can't just encrypt messages; we have to make sure a message is from the right person. A package can have our address on it, be properly packed and come from a well-known brand, but if we never ordered anything, that is one sign this package might be bunk. Signatures are a vital add-on so we can check who is sending these packets. More on signatures in a later section though, as there is a lot to discuss…
I also touched upon the second use, which is providing integrity and authentication. We can hash vital parts of our infrastructure, like the kernel modules or bootloader, and if any hackers try to tinker with them, the resulting strings change and we can sound the alarms. Data in use, whether files or scripts, can be regulated by encryption, so that only those who know the right code have the right to use said data.
Data at rest is the last one. Akin to a lock and key for entering a shed, we may have an HSM (Hardware Security Module) which plugs into our PC and does the work of generating, preserving and checking keys. This can be paired up with our computer to handle authentication. In UNIX operating systems we often set permissions on programs, and only those with the password of a certain user may run them. The passwords themselves are hashed, so regardless of how many users are on the system, each password should be inaccessible to all but its owner.
The different types of ciphers
Ciphers are the recipes we use to encrypt and decrypt data. An arbitrary number of functions can be composed over a piece of plaintext, but there are two main ways of handling the plaintext.
- The first way is to encrypt and decrypt whole chunks, or blocks, of that data - maybe every 64 bits is considered a block. These types of ciphers are called block ciphers.
- The other way is to encrypt each bit separately, reading incrementally from a file or whatever and encrypting as we go, repeating until the job is done. This is handy for cases where we have no idea how much data we have to work with; if we expect the data to fall within a known, modest size, we would use a block cipher instead. These incremental ciphers are called stream ciphers.
This sort of thing mirrors eager computation versus lazy loading in Clojure - article here.
Both of these cipher types employ two techniques when encrypting data. The first is confusion, or obfuscation, and it refers to making the relationship between the key and the ciphertext as complex as possible. We want to make things as unclear as possible when someone looks at our data. A simple substitution like ROT13 is decent, if you've never seen it before, but nowadays we need to swap out bits of data like the letter “a”, not with some shifted counterpart like “d”, but with a whole different symbol like “dolphin”. Who the hell would know how to reverse engineer the right text from dolphin?
Diffusion refers to the process of shifting bits of data around throughout the encryption process. How much we shift, how often we shift and so on is encoded into the format of the encryption key. It would be extremely taxing to reverse engineer a piece of plaintext (even with a chunk of ciphertext in hand), as we wouldn't have the key to tell us how often and how much to shift and jumble the ciphertext around.
The former relies on substitution, where we might substitute an “a” for “0101010” and so on; the latter relies on permutation, where we shuffle, shift and organise data into a format that should be a pain to brute-force. These permutations are kept encoded in the key, which dictates how we encrypt data and how we would go about decrypting it. Algorithms like AES use both techniques, and the key may be used in both processes: it could be that we do some shuffling, then substitute each value for whatever maps to it in the algorithm's private reference table, then more permutation, etc. The two are related, in fact, so when we substitute a given section of data the algorithm also knows the corresponding permutation. We do this so we have a way of decrypting the data.
Cipher modes
Counter Mode (CTR) is where we can make a block cipher behave like a stream cipher. We add a counter, which is meant to be a function that emits a number which will not repeat for a very long time; a cheat code for this is just starting at 1 and incrementing… The counter is combined with an IV and fed through a symmetric-key block cipher, and the resulting keystream is combined with the text itself, producing output that is difficult to crack.
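A minimal sketch of AES in CTR mode, assuming the third-party `cryptography` package (`pip install cryptography`) is available:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)    # 256-bit AES key
nonce = os.urandom(16)  # initial counter block; must never repeat under one key

enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
ciphertext = enc.update(b"attack at dawn") + enc.finalize()

# Decryption in CTR is just the same keystream XORed back off.
dec = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor()
assert dec.update(ciphertext) + dec.finalize() == b"attack at dawn"
```

Notice that no padding is needed: CTR handles input of any length, which is exactly the stream-like behaviour described above.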
Electronic Codebook (ECB) is a block cipher mode that processes units in blocks, padding them out when they aren't a particular size. The problem is that ECB encrypts with a serious lack of diffusion - which, remember, covers the permutative operations. The weakness of ECB is highlighted by the encryption of this bitmap image on Wikipedia:
We can easily distinguish the penguin in the ECB result, and a proper mode like CTR adds that vital bit of obfuscation to make it impossible to intuitively decode. Likewise, if we had an algorithm that only did diffusion, then we would see the coloured parts of the penguin scattered across the page, and we could map each part to the squares and see how they differ as we add a few pixels more.
See here for a great rundown on why you should never use ECB for any real encryption, even though it is nicely parallelisable!
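The penguin problem is easy to reproduce: identical plaintext blocks encrypt to identical ciphertext blocks. A sketch, again assuming the `cryptography` package:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(16)
plaintext = b"SIXTEEN BYTE MSG" * 2  # two identical 16-byte blocks

enc = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
ct = enc.update(plaintext) + enc.finalize()

# Identical plaintext blocks give identical ciphertext blocks --
# exactly the structural leak that makes the penguin visible.
assert ct[:16] == ct[16:32]
```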
Galois/Counter Mode (GCM) uses what's called Galois field multiplication - mathematics that also underpins Elliptic Curve Cryptography - layered over a counter-mode cipher, usually AES. It doesn't carry the bit-at-a-time streaming convention though, as Galois multiplication makes heavy use of vectors, so it's a lot easier to do chunky blocks like 128 bits.
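GCM also authenticates what it encrypts, which is why libraries expose it as an AEAD (authenticated encryption) primitive. A sketch using the `cryptography` package's `AESGCM` helper:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)  # a 96-bit nonce is conventional for GCM
aad = b"packet-header"  # authenticated but not encrypted

ct = AESGCM(key).encrypt(nonce, b"secret payload", aad)
# decrypt() verifies the authentication tag and raises InvalidTag on tampering.
assert AESGCM(key).decrypt(nonce, ct, aad) == b"secret payload"
```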
Cipher Block Chaining (CBC) is quite similar to ECB, but it loses the parallelism aspect, due to the fact that it encrypts one block and then mixes some of that block into the next one to be encrypted. Now, whilst this is quite good for obfuscation and diffusion, it is quite slow.
It processes 64-bit blocks of data (in its original DES incarnation), but inserts some of the ciphertext created by each block into the next block - this is where the chaining comes in. The problem is that when this goes over the wire, any corruption in transmission cascades into the blocks that follow, and those blocks must be retransmitted before you get your proper text back.
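Contrast the ECB sketch above with CBC, where the chaining (plus a random IV) hides repeated plaintext blocks; same `cryptography`-package assumption as before:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key, iv = os.urandom(16), os.urandom(16)
plaintext = b"SIXTEEN BYTE MSG" * 2  # the same two identical blocks as before

enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
ct = enc.update(plaintext) + enc.finalize()

# Each ciphertext block is mixed into the encryption of the next,
# so identical plaintext blocks no longer match in the ciphertext.
assert ct[:16] != ct[16:32]
```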
Moving on to hashes
Now, ciphers apply some number of invertible functions f(x) over the data set x, but with hashes there is no inverse function to get us back to the text itself - once we hash, that's it. Passwords are a great example of when to use hashes, as we don't really care about recovering the password itself; as long as it's strong enough, we just ask the user to input their password, hash it and compare. With hashing it's not so much the content of the data that matters but the context of the data.
There are a few algorithms which hash strings, like MD5 (128-bit digest) and SHA-1 (160-bit digest); likewise SHA-256 produces a 256-bit digest, etc. Note that these numbers describe the output length - a plain hash has no key at all.
Side-channel attacks
Now, this is one form of attack that, whilst incredibly difficult to implement, can lead to data being extracted. It's done by getting right up close to the system, setting up some logging equipment, and deducing from the patterns in the electromagnetic radiation: higher energies for “1” and lower energies for “0”. To defend against this we would need a highly resilient algorithm - the exam uses the word resilient, in this context, to mean “the compromise of a small part does not compromise the entire system”. So an algorithm should keep this aspect of data usage in mind, especially if it is to be implemented on mobile devices; if it can tick the resiliency box then a side-channel leak shouldn't mean we are totally out of the game.
6.2 Basic Characteristics of Cryptographic Algorithms
Symmetric Encryption: the same key for encryption and decryption
- DES
- AES
- 3DES
Asymmetric encryption. Where there are four keys in use between two parties: each person has a public and a private key. A public key to encrypt and a private key to decrypt. More on this later.
- RSA
- DSA
- Elliptic Curve
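As a taster before the detail, here is a minimal sketch of generating an asymmetric (RSA) key pair with the third-party `cryptography` package:

```python
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

# One key pair per party: the public half is shared freely,
# the private half never leaves our machine.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

public_pem = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
print(public_pem.decode())  # safe to publish
```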
Symmetric Encryption
DES is a common symmetric-key algorithm that has been around since 1977, though it was withdrawn as a standard back in 2005, and its successor 3DES sadly went into retirement at the end of 2023. DES uses a 64-bit key, but in each 8-bit slice one of the bits was used for parity, purely for transmission reasons (this was in the olden days…); the parity bit was set so that each byte of the key summed to an odd number, leaving an effective key length of 56 bits.
3DES was the ad-hoc replacement for DES, as it was shown in 1998 that a DES key could be cracked in about a day - hence the attacker could decrypt every single thing encrypted with that key. I'm sure you know the old saying “three keys are better than one”, and that's exactly what they did - though it isn't always done with three keys; sometimes we can get away with two. First we take our text and DES key number 1, and encrypt the data. Then, using key 2, we run the decryption operation - but the permutations and substitutions we make thinking we're getting back to the original text only jumble it further - and with key 1 again we encrypt this mess. Such a version of Triple DES is called DES EDE2, where E stands for encryption, D stands for decryption, and the letters give the order in which they occur (sketched in code below, after the list of variants).
Other versions of Triple DES include:
- DES EEE3. This uses three keys, all just for encrypting.
- DES EEE2. This uses two keys: the first, the second and then the first again, all encrypting.
- DES EDE2. The one we already mentioned.
- DES EDE3. This uses three keys, but we don't leverage the reuse quirk this time; we stick to unique keys.
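A conceptual sketch of the EDE composition; `des_encrypt` and `des_decrypt` here are hypothetical single-DES helpers, not a real library API:

```python
def triple_des_ede2(plaintext: bytes, k1: bytes, k2: bytes) -> bytes:
    # EDE2: Encrypt with k1, "Decrypt" with k2 (which only scrambles further,
    # since k2 != k1), then Encrypt with k1 again.
    return des_encrypt(des_decrypt(des_encrypt(plaintext, k1), k2), k1)

# Quirk worth noticing: if k2 == k1, the middle step undoes the first and the
# whole thing collapses to single DES -- which is exactly what gave EDE-mode
# hardware backwards compatibility with plain DES.
```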
Advanced Encryption Standard (AES)
In 2002, the standard NIST chose to replace DES came into effect. The encryption algorithm itself is called Rijndael, and it is the brains behind the operation. This is the most used block cipher, and it works with blocks of 128 bits. Larger blocks are better than shorter ones, as they let you jumble up and mangle content in a greater number of ways, making brute-forcing the content more difficult.
Forget the weak 56-bit keys; we now have 128-, 192- and even 256-bit keys. So even if you know the algorithm, because the algorithm changes results so drastically for each key, it doesn't really matter. It uses multiple rounds of encryption, varying with key length, wrapping and jumbling the block in any way it can:
- 128 bit keys have 10 rounds
- 192 bit keys have 12 rounds
- 256 bit keys have 14 rounds
Stick figure guide to AES Encryption
More symmetric algorithms you need to know
- RC4 / RC5 / RC6
- RC4 is a stream cipher, but RC5 and RC6 are block ciphers
- Works with key sizes between 40 and 2,048 bits
- All these ciphers are called Rivest Ciphers, named after Ron Rivest. RC4 was well-suited to WEP, as the packets were variable in size and the stream cipher could scale to meet the demands of the data.
- Blowfish / Twofish
- A symmetric block cipher that can use variable-length keys (from 32 bits to 448 bits)
- Twofish is a successor design that uses 128-bit blocks.
- International Data Encryption Algorithm
- 128 bit key
- Similar to DES, but more secure
- Used in Pretty Good Privacy (PGP)
- One-Time Pad
- This is the most secure crypto implementation, as it will use a key only once for a single encryption session and then destroy it. It gets away with a single key by making the key as long as the plaintext message; provided that key is truly random, never reused, and kept secret, the ciphertext gives an attacker nothing to brute-force (see the XOR sketch just after this list).
- disproof of the OTP as being unbreakable
- Skipjack
- NSA-developed block cipher used in the Clipper chip
- Uses an 80-bit key to encrypt 64-bit blocks of data
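The one-time pad mentioned above is simple enough to sketch in a few lines of Python; XOR is the usual combining operation:

```python
import os

message = b"meet me at noon"
key = os.urandom(len(message))  # truly random, exactly message-length, used once

ciphertext = bytes(m ^ k for m, k in zip(message, key))
recovered = bytes(c ^ k for c, k in zip(ciphertext, key))
assert recovered == message

# Reusing `key` for a second message is precisely what breaks the scheme:
# XORing two ciphertexts together would cancel the key out entirely.
```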
Asymmetric Cryptographic Algorithms
Diffie-Hellman was the first public key-exchange algorithm, designed so that two users who had never spoken before could set up an encrypted channel by deriving a shared symmetric key without ever sending it. The problem, though, is that because it exists as a means of establishing connections with strangers, a hacker can become a MITM and set up separate connections with the client on the left and right sides without any issues.
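A minimal sketch of the exchange with the `cryptography` package; the parameter size is kept small purely so the demo runs quickly, and note that nothing here authenticates the peer - which is exactly the MITM gap described above:

```python
from cryptography.hazmat.primitives.asymmetric import dh

# Both parties agree on public parameters first (512 bits for demo speed only;
# real deployments use 2048 bits or more).
params = dh.generate_parameters(generator=2, key_size=512)

alice = params.generate_private_key()
bob = params.generate_private_key()

# Each side combines its own private key with the other's public key...
alice_secret = alice.exchange(bob.public_key())
bob_secret = bob.exchange(alice.public_key())

# ...and both arrive at the same secret without it crossing the wire.
assert alice_secret == bob_secret
```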
RSA (named after its three creators' last names: Rivest, Shamir and Adleman) is the de-facto asymmetric encryption algorithm and it is used pretty much everywhere. Common deployments use 2,048-bit keys, and larger sizes are supported.
Digital Signature Algorithm (DSA) is used to provide digital signatures. It focuses on the integrity and non-repudiability of data rather than its confidentiality. Which makes sense - we don't want to obscure someone's signature, we need to know who they are!
Elliptic Curve Cryptography (ECC) is where we create keys using the arithmetic of curves over finite fields (Galois fields). The trick to it is described here, but it is very fast and the strength per key bit is much greater, as the ability to guess is drastically lower.
Pretty Good Privacy (PGP) makes use of asymmetric algorithms to sign and encrypt files, folders, entire disk partitions, texts, emails - the list goes on… If we were using RSA, for example, to sign and send a file, this is how we'd do it:
Create the RSA key pair, adding an expiration date if we want. What often happens is that we push our public key onto a public key server so that other people can send us things, and if they trust us they add their signature onto our key, so that others know whose it really is. To send a file we would often use a combination of our private key, to sign the hash of the file, and the recipient's public key, to encrypt the contents themselves. This way they can use our public key to check it is really us and that the hash they compute matches the one we sent, and lastly they can decrypt the file with their own private key. So you can see that there is a sort of web of trust, where people incrementally add people to their “trusted list”; there is no hierarchy and you can make connections with whomever you want.
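A sketch of that sign-then-encrypt flow with raw RSA via the `cryptography` package. (Real PGP actually encrypts a one-off symmetric session key rather than the payload itself, so treat this as an illustration of the key roles, not of PGP's wire format.)

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

my_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)     # sender
their_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)  # recipient

message = b"the quarterly report"
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Sign with MY private key (so anyone can verify with my public key)...
signature = my_key.sign(message, pss, hashes.SHA256())

# ...and encrypt with THEIR public key (so only their private key can read it).
ciphertext = their_key.public_key().encrypt(message, oaep)

# Recipient side: decrypt, then verify; verify() raises InvalidSignature on tampering.
plaintext = their_key.decrypt(ciphertext, oaep)
my_key.public_key().verify(signature, plaintext, pss, hashes.SHA256())
```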
Hashing Algorithms
Hashing is where we don't care about decrypting the input. Hashes are used for integrity and for authentication purposes. They take a variable amount of data and compress it into a fixed-length string (remember, there is no point preserving the input - we never need to go back - just make it a herculean effort to crack).
Hashing is a nice way to see if anything has been changed, or when someone tries to enter in the wrong password. A quick and dirty checking mechanism.
Message digest algorithms are the most common form of hashing, and as the name implies they take a string and return the equivalent hash. Ron Rivest, the same guy who made RC4-6 and was involved with RSA, also designed the entire MD series. MD5 remains one of the most commonly encountered hashing algorithms; it processes a variable-size input and produces a 128-bit output.
Secure hashing algorithm (SHA) is very similar to MD5. SHA-0 and SHA-1 have been deprecated and are no longer recommended, but the SHA-2 family is a great replacement. SHA-2 covers the implementations:
- SHA-224
- SHA-256
- SHA-384
- SHA-512
with each producing a message digest possessing a bit length as seen in their respective names.
Hashed Message Authentication Code (HMAC) is different to the other hash functions, as it is a little bit smarter. See, when I generate a hash with, say, MD5, it may be that other inputs I supply to the function result in the same hash. This is called a collision. Collisions are potentially disastrous for something like passwords, and whilst it doesn't happen often, that's rather like saying to someone with a pacemaker, “Well, it doesn't go off all the time!” We need consistency. HMAC works by taking a secret key stored on one's computer and hashing it together with the message itself. The resulting hash can then be sent alongside the file, and it acts as a signature, as only another person who holds the same secret key (maybe exchanged over Diffie-Hellman or something) could have produced it. Obviously this is how it would go in a perfect world, where hackers wouldn't have access to such messages and files that could possibly be used in some kind of replay attack. But then, in a perfect world, would we have hackers?
HMAC can use SHA, MD5 or most other hashing algorithms - its genius is just in combining a secret key with the message to produce the overall Message Authentication Code (MAC). So even if someone knew the hash for our word, they would still have to guess our secret key too.
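Python ships an `hmac` module in the standard library, so the whole idea fits in a few lines (the secret and message here are made-up stand-ins):

```python
import hashlib
import hmac

secret = b"shared-secret-key"        # e.g. agreed earlier over Diffie-Hellman
message = b"transfer 100 credits"

mac = hmac.new(secret, message, hashlib.sha256).hexdigest()

# The receiver recomputes the MAC with the same secret and compares using
# compare_digest(), which runs in constant time to avoid timing side channels.
expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
assert hmac.compare_digest(mac, expected)
```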
Extra information on HMAC:
- Stack exchange
- [Guide](https://www.jscape.com/blog/what-is-hmac-and-how-does-it-secure-file-transfers#:~:text=HMAC%20stands%20for%20Keyed%2DHashing,almost%20similar%20to%20digital%20signatures.)
6.3 Wireless Security Settings
By the end of this I hope that you will be able to explain wireless cryptography settings.
Wireless Cryptographic Protocols
Because WEP is so old - it was retired in 2003 - it isn't going to come up in your exam, though it is worthwhile knowing its history, as we get to understand what needed replacing, what indeed replaced it, and where we are now.
WEP stands for Wired Equivalent Privacy, which is quite ironic. WEP uses the RC4 stream cipher for both authentication and encryption with only a 40-bit key, which is orders of magnitude weaker than modern implementations that use AES for their symmetric encryption needs. When a router offers wireless communication services using the WEP standard, the 40-bit key is shared with every device on the network, alongside a 24-bit initialisation vector that takes the total key size to 64 bits. However, the strength of the scheme stops here: after a while the algorithm runs out of unique initialisation vectors and starts reusing them. This means an attacker can keep capturing traffic, exploiting the repeated IVs until they find their way in.
In 2003, WEP was retired and a new standard called WiFi Protected Access (WPA) came around. Initially it was only a sequence of patches that covered up the glaring faults in WEP - faults which allowed massive companies like TJ Maxx to be hacked, with over 15 million credit card details leaked. WPA2 was released shortly after, in 2004, and has been a stable and constantly improving standard even today. In January 2018 the WiFi Alliance, the institution which issues new versions of this standard, launched WPA3, with yet more security enhancements.
WPA is also based on the RC4 cipher, but the kicker was to enforce stricter policy. This regimental change and incremental technical improvement was what allowed it to be adopted fast, as it offered backwards compatibility. The use of what's called the Temporal Key Integrity Protocol (TKIP) brought a set of functions which changed the game:
- The use of 256-bit keys. This is an easy thing to include so let’s just bump the key size up.
- Per-packet key mixing. Each packet had a unique key generated for it. Here you can see why the incremental cipher is quite nice as we never really know how much data we’re working with and hoping to send, so incrementing makes sense.
- Automatic broadcast of updated keys.
- A message integrity check: a piece of hashed code added to the tail of the packet, representing the contents, with which we can check for validity.
- A larger initialisation vector size: 48 bits.
Eventually the industry stood its ground and said, “Look, although TKIP is a great protocol and it has greatly improved the way we send packets, it isn't enough.” And with that, WPA2 dropped the RC4 cipher (finally!) and replaced the protocol too: instead of TKIP we now have Counter Mode with Cipher Block Chaining Message Authentication Code Protocol (CCMP). This protocol relies on AES for encrypting its packets, and we scrap the idea of needing incremental encryption, as we can just pad each packet if need be to get it to 128 bits.
WPA2 does still support TKIP as a fallback if a device cannot support CCMP :/
WPA2 protects data confidentiality by allowing only authorised network users to receive data, and uses CCMP's block chaining message authentication code to ensure message integrity. WPA2 also introduced easier roaming from network to network, allowing clients to move from one WAP to another without reauthentication. This process, from the perspective of the WAP we move into, is called preauthentication.
Wireless Authentication Protocols
Extensible Authentication Protocol (EAP) is more of a framework that is used to create different types of authentication. What a great idea… An extensible protocol is almost oxymoronic to me, as the whole point of a protocol is to lay some ground rules and establish a core idea, whereas EAP allows the layman to piece together some sort of authentication standard.
Much the same problem arose with the X.500 specification for directory servers, and from public outcry we got a lighter, simpler standard which actually does alright (LDAP). A lightweight EAP was also made, aptly called LEAP. The problem was that it still contained pretty serious vulnerabilities, and so Cisco came along and published the EAP-FAST (Flexible Authentication via Secure Tunnelling) protocol - essentially just getting EAP to work under a TLS tunnel. By using certificates, we introduced a degree of proper authentication into the mix.
EAP-TLS is the simplest standard for implementing EAP through a TLS tunnel for the common device, whereas EAP-FAST is more concrete and is handled on the router side.
PEAP - Protected EAP - was another crack at making EAP work, jointly developed by Cisco, Microsoft and RSA Security.
IEEE 802.1X
This is one of the standards issued by the IEEE (Institute of Electrical and Electronics Engineers Standards Association), and this specific standard governs port-based network access control.
You cannot have access to the network resources unless you have completed the authentication process.
There are usually three devices used for authenticating.
- The client device, which is requesting access. This is called the supplicant.
- Second, the switch or WAP, which is sometimes referred to as the authenticator. The authenticator pushes authentication requests on to our next entity, the server.
- Lastly, the authentication server. This is our TACACS or RADIUS server. It does the work of authentication, lining up the supplicant's details with the ones stored, and it sends back either a “yes, this is good” or an “OMG, never accept” packet; the authenticator, which is essentially the gatekeeper, then grants or denies access.
6.4 Public Key Infrastructure (PKI)
PKI isn't a specific technology; much like corporate infrastructure, it just packages together the essential parts we need to uphold a certain standard or to instantiate an important idea. PKI starts with the Certificate Authority (CA), a business trusted (after strict vetting and auditing) to issue certificates that other companies can stick on their website to say, “Yes, I'm the real one.” The certificate comes with all sorts of information about the company and the issuer themselves, and it also contains a public key with which clients will encrypt data sent to this company. If you use Wireshark you can see the TLS certificate being responsible for setting up the encryption of all the darn data…
So PKI uses asymmetric encryption to communicate. Remember, the company's webserver will use your public key for encrypting its packets, as otherwise it would be talking to you in a language anyone could overhear, and your private key works at decrypting them; likewise, when you wish to communicate with the company you use their public key. That way the private key they hold, which is out of reach of the client, can be used to decrypt the packet.
Authentication is provided, as we can look at the signature of the sender and, if we've got nothing better to do, check it with the CA. Another box it ticks is privacy (confidentiality), as we're using a proper encryption algorithm like RSA. Non-repudiation is provided too - which is just non-ambiguity about one's status in the exchange - as we will have a packet sent back that confirms whether or not they got our packet.
Digital certificates - not just for the web, as we've predominantly come to know them - are data files that are used to link an entity (user or system) with a public key. We want to know who owns this public key, so that we feel safe enough to start creating a session, and so the certificate lists all the necessary information about the holder of that certificate. Here is an example picture:
You can see in the image that this follows the X.509 certificate format.
When we make something like a Certificate Signing Request (CSR), we need to include as many of the details seen here as we can on our own: generating the public and private keys, the subject name, what validity period we want, etc. We hand over these details, which should cover our organisation and web server adequately, and the Certificate Authority (CA) will then use all this to generate the SSL certificate. The CA is a trusted entity that issues digital certificates. Certificates issued by the CA follow strict, audited procedures, so that we may trust them, and when we go to a website we can see that the CA it uses is one that the browser recognises, and so the browser engages in conversation.
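Generating a CSR programmatically looks roughly like this with the `cryptography` package (the organisation details are hypothetical placeholders):

```python
from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

# The key pair is generated on our side; only the CSR leaves the building.
key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([
        x509.NameAttribute(NameOID.COUNTRY_NAME, "GB"),
        x509.NameAttribute(NameOID.ORGANIZATION_NAME, "Example Ltd"),
        x509.NameAttribute(NameOID.COMMON_NAME, "www.example.com"),
    ]))
    .sign(key, hashes.SHA256())  # signed with our private key to prove possession
)
# `csr` goes off to the CA; `key` stays with us and never leaves the server.
```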
Once we're signed up we should get our cert back, and we can include it on our webserver. How you do this will vary, but if you're using the Apache web services then the certs need to come in a `.pem` or `.crt` format. This guide does a good job of explaining how to get certificates running on your web server.
From then on out it's basically happy days… Until you find out one day that your certificate got thrown onto the CRL (Certificate Revocation List). Basically your CA decided your certificate shouldn't be trusted anymore, for whatever reason, and so they break the hierarchy of trust (from your certificate all the way to their root certificate) by blacklisting it before its natural death (its expiration date). This would normally happen as a result of the corporation losing their private key - maybe the company got hacked and needs to reinstate parts of its infrastructure. However, the Certificate Revocation List itself isn't broadcast all that often - every couple of days or so - meaning revoked certificates can still be accepted by devices in this window.

What clients can do (if they want to see whether or not they can trust a business), and what companies can do to see if they are on the naughty list, is make use of the Online Certificate Status Protocol (OCSP), which is sent over HTTP. Include the certificate's details in your request and the OCSP server will tell you in real time whether it has been added to the CRL. Now, you can imagine that the OCSP server might get battered by requests from all these businesses wanting to know if they are dead in the water, so what ended up happening is that the web server itself periodically fetches a time-stamped, CA-signed OCSP response and presents (“staples”) it to clients during the handshake. This is called OCSP stapling, and it eliminates the need for clients to contact the OCSP server directly (as they could otherwise do so before every TLS handshake…).
There are three statuses that OCSP would return to describe a certificate:
- Valid, which is where a cert is accepted and recognised as being a fitting symbol of recognition.
- Suspended, which is where a cert should not be used and is not deemed secure, though it might be reinstated. It depends on the circumstances and your relationship with the CA; it may be that you want to change certificate provider, and hence have asked to terminate all current contracts.
- Revoked. This is where a certificate may be too old now, or the webserver was compromised - something the CA would have to be made aware of before reissuing certs.
Another nicety of PKI is the idea of pinning. We go to `google.com` all the time, and we hope to see the same public key in the certificate. What would be better, for popular sites that we always go to, is to keep hashed versions of the public key in our web browser, and then compare the hash with what's provided by the website, to see if it is fraudulent or not. It negates those sorts of attacks like typo-squatting, where we type `gooogle.com` and end up in some hacker's evil lair. The servers for `google.com` can do what's called HPKP (HTTP Public Key Pinning), which is where they put the hashed public key in their HTTP header for us to compare (a mechanism that has since been deprecated by major browsers). This needs an integrity check too, as some hacker could just take the hash, include it in their own HPKP header and waltz around just the same; but the integrity of the rest of the certificate + HPKP + our own browser storage makes for a sweet combo.
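Computing the pin itself is straightforward: HPKP pins were the Base64 of a SHA-256 hash over the certificate's SubjectPublicKeyInfo. A sketch with the `cryptography` package, assuming `cert_pem` holds a PEM certificate fetched earlier (hypothetical input):

```python
import base64
import hashlib
from cryptography import x509
from cryptography.hazmat.primitives import serialization

cert = x509.load_pem_x509_certificate(cert_pem)  # cert_pem: bytes, supplied elsewhere

# Hash the DER-encoded SubjectPublicKeyInfo -- the public key, not the whole cert.
spki = cert.public_key().public_bytes(
    encoding=serialization.Encoding.DER,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
pin = base64.b64encode(hashlib.sha256(spki).digest()).decode()
# `pin` is the value that appeared in pin-sha256="..." HPKP headers.
```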
Within desktop applications - and any type of app you can think of - we can embed a certificate into the app itself, and during the handshake process between the app and, say, a service, it can inspect the certificate presented by the service and decide whether it is one it deems trustworthy. If it sees that the certificate on the SSL accelerator isn't one of the targeted recipients, then it can choose not to function, display an error, or whatever it chooses. So certificate pinning is one way we can stop snoopers.
PKI Concepts
So the actual corporate key material doesn't have to live alongside the certificate, as we now know it can be detached. Key escrow takes this one step further: a third party manages our keys for us - maybe we have a lot of websites for different countries, different subdomains, etc., so this would be a very handy service. We put trust in this third party to safeguard our keys; as they now have access, they could pass as us. Some might say, “Oh, but in the event they lose the key, the data you encrypted is lost and you can't operate anymore.” Do you know how much they could pay out if that happens? Also, they have a Key Recovery Agent who will work flat out to recover your key - either the components of the key itself and/or the plain-text messages you need decrypting.
As we can see, a chain of trust can get pretty complicated as it spreads out, but certificates also keep their own hierarchy of trust, i.e. a subdomain must trust the root domain of the website, which then must set up a relationship with a CA, who has their own hierarchy of certificates all the way to the root. The certificate chain is an ordered list of certificates that contains the SSL certificate itself, for the web server, and the certificates of the CA that issued it. This way we can verify the sender and CA are all trustworthy: each cert carries the information needed to verify the next one in the chain.
Types of certificates
Starting from the top of the chain down, we have the root certificate. This is the most important certificate in a PKI environment as this is the one that identifies the root CA, and all certificates will be tied to this one. The root certificate is what allows us to issue other certificates.
User certificates are used to associate a user with a certificate. Their function is to declare a user to the public: who holds the matching public key for people to interact with, their user ID, and whatever identifiable information they have put on there. This can then be used as a form of authentication. These can be used all over the place, but a common use for them is as a drop-in email certificate.
Web Server SSL/TLS Certificates
- Domain Validation (DV) certificates, which do what they say on the tin. A website owner will get one of these sorted when trying to set up TLS encryption for their website. The provider of DV certificates will send an email to the specified domain, listed in the `whois` records, and if there is a response then the provider updates their listings with the given domain as a reputable source.
- Organisational Validation (OV) certificates. Organisations are searched against official government sources for listed corporation entries, and if they are listed as a compliant, regulated corporation then they can be issued with an OV certificate for public-facing websites.
- Extended Validation (EV) certificates. This is the highest level of trust, which requires a comprehensive level of introspection to validate the business. These certificates can only be issued by a subset of CAs, and they involve the applicant proving their legal identity.
- Wildcard certificates. A wildcard certificate is a digital certificate that is applied to a domain and all its subdomains. Wildcard notation consists of an asterisk and a period before the domain name. Secure Sockets Layer (SSL) certificates often use wildcards to extend SSL encryption to subdomains.
- Subject Alternative Name (SAN): a special X.509 extension that allows additional items (IP addresses, domain names, and so on). This is used when we want to support multiple domains in the same certificate, the extension letting us list all of the domains we want associated with the cert.
Internal Certificates for corporate, or even home, use
Say an FTPS server (using TLS) is hosted within the company, and it will only ever be used by the employees; then it doesn't make much sense to get a third party to check we exist - our existence is verified by those closest! Woah, deep stuff… But seriously, why splash out if we can just sign certificates ourselves and have the company act as the root CA, with servers like the FTP one being issued one of our self-signed certificates? We could also stick these certificates on all client machines; this way we can check whether the devices using our corporate network are those we have verified. This establishes another layer of authentication, which is handy for warding off attackers.
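Minting such a self-signed certificate takes only a few lines with the `cryptography` package (the internal hostname is a hypothetical example):

```python
import datetime
from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "ftp.corp.internal")])
now = datetime.datetime.now(datetime.timezone.utc)

cert = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)  # issuer == subject is what makes it self-signed
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=365))
    .sign(key, hashes.SHA256())  # signed with its own key, not a CA's
)
```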
Certificate File Types
| Format | Description |
| --- | --- |
| .DER | Distinguished Encoding Rules. A very common binary format, typically used on Java platforms. |
| .PEM | Privacy Enhanced Mail. The most common format; it is ASCII, so it is simpler for us mortals to read. |
| .PFX | A binary format for certificates which holds all intermediate certificates and a private key in one encryptable file. PFX is the Windows version of the PKCS#12 certificate format. |
| .CER | In Windows you might come across a .CER file. This contains only the public key and certificate; it pairs with the .PFX file, which holds the private key. |
| .P7B | These contain only certificate information and chain certificates, not the private key. |
We can interchange these formats with a tool called OpenSSL if need be.
So all certificate formats are derivable from the `.der` format, which is a binary representation of the X.509 certificate information. But as time went on and people wanted to secure mail with these certificates, it became necessary to have another way of encoding the data. This idea spawned the creation of the `.pem` certificate format, which encodes certificate data in Base-64 ASCII - much easier to transfer over the wire. This PEM-style encoding has been used in more than just this one extension though, with certificates like `.der`, `.cer` and `.crt` all being translatable into it. Files with a `.key` extension store private or public keys, and also encode their data in Base-64 - hence making use of PEM too.
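Converting between the two encodings is mechanical, as this sketch with the `cryptography` package shows (the file names are hypothetical):

```python
from cryptography import x509
from cryptography.hazmat.primitives import serialization

# Load a PEM (Base-64 ASCII) certificate from disk...
with open("server.crt", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

# ...and write out the same certificate as raw binary DER.
with open("server.der", "wb") as f:
    f.write(cert.public_bytes(serialization.Encoding.DER))
```

This is the same job the OpenSSL CLI does with `openssl x509 -inform pem -outform der`.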
Now we have the Public Key Cryptography Standards (PKCS), which are a number of rules for the proper procedures, treatment and formatting of keys and certificates, with each standard defining rules for a separate cryptography matter. PKCS#7, for example, is for holding public keys and chains of certificates to be processed. It is often used in Windows environments, and often alongside S/MIME for emails.
PKCS#12, on the other hand, is the certificate file format which holds certificate chains and private keys, and it is usually denoted by the `.pfx` file type. This format is often used to archive or transport a private key to a server. When it hits the server, it is typically split apart into a `.cer` equivalent containing the public keys and certs, whilst a `.key` file holds the private key.
Example Questions
A security administrator is configuring the encryption for a series of web servers, and he has been provided with a password protected file containing private and public X.509 key pairs. What type of certificate format is MOST likely used by this file?
A. PEM B. PFX C. P7B D. CER
`.CER` and `.DER` are often used in Java-based web servers for holding public keys and certificates, so it wouldn't be them. If we look at our notes above, we can see that the only format that handles all of this - certificate chains plus a password-protected private key - is the `.PFX` file type. `.P7B` is a PKCS format, but not the one governing standards for private keys.
You need to examine some additional information about a key, specifically to validate the address information of the certificate owner. What could you examine to accomplish this?
A. OCSP B. OID C. Private key D. Public key
The last two don't make sense, as we don't look for background information about a key with the key itself… and we don't use OCSP to find certificate owner information; that gives us the status of the certificate, i.e. whether it is still valid or not. So that leaves an OID (Object Identifier), which is usually a dotted decimal number like `76.7.54.34` and helps with identifying certain objects. If you look under the extensions in any certificate you will see OIDs crop up: