SSL Key Exchange example
Last month, I reviewed an SSL handshake up to the key exchange portion. In this post, I'll pick up where I left off and cover the actual key exchange itself. The key exchange I'm demonstrating is a cutting-edge, modern Elliptic-Curve Diffie-Hellman (ECDH) key exchange. So, before I get down into the details of it, I'll review exactly what that is and how it works.
Diffie-Hellman key exchange
Elliptic-curve Diffie-Hellman refers to the "standard" Diffie-Hellman key exchange protocol being carried out with elliptic-curve primitives. This suggests, then, that there's a "plain-old" Diffie-Hellman that accomplishes the same thing using non-elliptic curve primitives. And in fact, there is, using discrete logarithm primitives, which is what Whitfield Diffie and Martin Hellman first specified in 1977 to give rise to the name. Since "discrete logarithm" Diffie-Hellman is slightly easier to understand than ECDH, I'll review it first here.
Although it took a stroke of brilliance to think of it, the Diffie-Hellman key exchange is
actually pretty simple (as they say, most brilliant ideas are). The goal of a DH key exchange
is for two cooperating parties to agree on a shared secret over an insecure medium such as, say,
the internet. They do so by first agreeing on a pair of number g
and p
:
p
is a prime number, the larger the better. g
is a primitive root
p
, and these numbers are exchanged in "the clear". Now, one side picks a random
number x
and the other side picks a random number y
. They don't
exchange these numbers, though - instead, the first party computes x' = gx % p
and the other computes y' = gy % p
, and they exchange x'
and y'
. Finally, the first party computes z = y'x % p
and
the second computes z = x'y % p
. Due to the distributivity of the
modulo (%
) operator, both sides are guaranteed to compute the same number
z
. However, due the intractability of the discrete logarithm problem, a nosy
observer is unable to determine x
, y
or z
: there's no
known algorithm to compute x
given x'
, g
and p
besides trying every combination until you find the right one (and a lot of evidence that no
such algorithm will ever be found). As long as x
, y
and p
are large enough numbers — usually on the order of 2048 bits — the key exchange is
secure since it would take longer than the age of the universe for somebody to try every possible
x
.
Strictly speaking, almost any number g
would work for this scheme. However, certain values
of g
are more secure than others. Consider, for instance, if p
was
7 and g
was 2. Now the first party picks a random x
value of 4 and
computes x' = 24 % 7 = 2
. The second picks a random y
value of
5 and computes y' = 25 % 7 = 4
. Now person X computes
z = 44 % 7 = 4
and person Y computes
z = 25 % 7 = 4
. Looks good, right? Well, the problem is that
2n % 7 is always 1, 2, or 4. Remember that one of the goals is to thwart brute force
attacks. Since the search space is reduced here, brute force attacks are significantly easier.
That's why g
must be a primitive root of p
: this guarantees that
gn
% p can range for 0 to p
.
There are two "types" of Diffie-Hellman exchange - ephemeral and fixed. The difference is how the parameters g and p are exchanged: in an ephemeral Diffie-Hellman key exchange, g and p are exchanged at the time of key exchange and can potentially vary from one handshake to the next; in a static exchange the same g and p parameters are used for each invocation. They mostly differ in convenience, and most people prefer the ephemeral type.
Note that neither side knows z
ahead of time, so this scheme can't be used to encode
data, but it can be used to share a secret which is then itself used as a key to a bulk-encryption
algorithm. There is one crippling flaw in the scheme, though, if the attacker is more than a
passive observer - the man in the middle attack. In this case, the attacker situates
himself in between each party and intercepts the incoming values, exchanging those for his own.
In effect, each side is carrying out a Diffie-Hellman key exchange with him, thinking that they're
communicating with each other.
Digital Signature Algorithm
The man in the middle attack can be thwarted by attaching digital signatures to at least one of the key exchange messages to ensure that they're legitimate. The idea behind digital signature algorithms in general is that the signer computes a private key and a public key (both of which are numbers) that are related such that a message can be encoded with the private key in a way that can be verified using only the public key; if the encoding verifies, this is taken as proof that the holder of the (secret) private key "vouches" for the authenticity of the message.
One such algorithm, the US government standard, is named DSA
(which stands for Digital Signature Algorithm — leave it to the U.S. Government to
find creative inspiration), which is a minor variant of Dr. Taher ElGamal's elgamal
signature
algorithm. DSA is somewhat more complex than Diffie-Hellman, but is based on the same concepts:
in particular, the intractability of the discrete logarithm problem.
Like Diffie-Hellman's g
and p
parameters, DSA
has
three parameters, named p
, q
and g
. p
and
q
are large, random prime numbers and p - 1
is a multiple of
q
. g
serves the same purpose in a DSA exchange as in Diffie-Hellman,
in that it's the starting point for exponentiation; its value is h(p-1)/q % p
for some h
.
Unlike Diffie-Hellman, though, DSA signatures aren't computed in "real time"; instead, the signer
computes a pair of values called the public key and the private key, shares the public key, and
keeps the private key private. The private key is a random number x
less than
p
and the public key is another number y = gx % p
.
The signature process involves first selecting a random number k and computing two values:
r = (gk % p) % q
s = (k-1 m + xr) % q
Equation 1: DSA Signature Computation
Where m is the message to be signed (typically a secure hash of the actual message).
Now, in order to verify the signature given the algorithm parameters g, p and q along with the public key y, the message m (or its secure hash), and the signature (r,s), the receiver computes:
w = s-1 % q
u1 = m * w % q
u2 = r * w % q
v = (gu1yu2 % p) % q
Equation 2: DSA Signature Verification
If the signature is valid, v will equal r.
Elliptic Curve Cryptography
So, Diffie-Hellman provides a solution to the secure key exchange problem, and DSA provides a solution to the man-in-the-middle problem: what's elliptic curve cryptography for? Well, like any encryption technique, DH and DSA are vulnerable to brute-force attacks; the only way to guard against those is to increase the search space. Early Diffie-Hellman key exchanges were 512 bits, but as computers got to be faster and faster, the key exchanges increased to 1024 and now to 2048 bits, and today, it's starting to look like even that's not long enough. As a result, these key exchanges take longer and longer to perform, and use up more and more network and memory bandwidth. Elliptic curve cryptography, rather than being a replacement for Diffie-Hellman and DSA, is a way of performing the same sequences, but using elliptic-curve primitives rather than discrete logarithm primitives. Elliptic-curve primitives are secure with much smaller parameters than discrete logarithm ones: modern elliptic curve key exchanges and signatures are usually done in 256 bits and offer better security than 2048-bit discrete logarithm exchanges.
An elliptic curve (at least as it pertains to cryptography and TLS key exchanges), is defined by
the equation y2 = x3 + ax + b
for some fixed constants
a
and b
: a
and b
define the curve.
This format is useful because it leads to a computationally straightforward solution to the problem
of finding the points that intersect it on a line. In particular, given two points
P1 = (x1,y1)
and P2 = (x2,y2)
, both
of which are on the curve, the line defined by P1, P2
will always intersect the curve
at a third point whose coordinates are given by:
m = (y1 - y2) / (x1 - x2)
x3 = m2 - x1 - x2
y3 = m(x1 - x3) - y1
Equation 3: ECC point addition
But what if P1 = P2
? Well, if you remember your calculus, this might look familiar:
this ends up being the tangent line to the curve at P1
, whose slope is the first
derivative of y2 = x3 + ax + b
:
y' = m = (3x12 + a) / 2y1
Equation 4: ECC Point doubling
So, for notational convenience, if you represent the operation of finding the third point on the
curve given two other points by +
such that P1 + P2 = P3
and consider
P1 + P1 = 2P1 = P3
as doubling, you can meaningfully define multiplication of
a point on an elliptic curve by a scalar, yielding another point.
You can now perform a Diffie-Hellman-like key exchange using elliptic curve primitives. Both
sides first agree on a curve a, b
and a point (on the curve) G. Now the first party
selects a random (large) integer x
and computes a point X' = Gx
. The
second selects another random integer y
and computes Y' = Gy
. They
exchange X'
and Y'
and then compute a third point
Z = GY'
and Z = GX'
respectively. Again, by the distributivity of the
scalar elliptic curve multiplication operation, they'll settle on the same point Z
,
but by the intractability of "dividing" a point by a scalar number on an elliptic curve, again,
an eavesdropper can't recover the secret Z
without either x
or
y
. Since this is as susceptible as ordinary Diffie-Hellman to a man-in-the middle
attack, it similarly must be protected by a digital signature. You can see how DSA can be defined
using elliptic curve primitives in an equivalent way. (I'll show an example below, as well).
This is all theoretically sound, but it breaks down when you try to apply it to an actual
computer, because of the floating-point division operations. To make this computationally
feasibly and numerically stable, all operations are done (a little ironically) modulo p
to ensure that all results are bounded integers that can be represented accurately on a computer.
For the most part, this works out to computing remainders on all operations, but division modulo
p
is a little trickier and involves an incantation of Euclid's extended algorithm.
There are also binary field elliptic curves. The concepts are similar.
An actual ECDH/ECDSA handshake
So, now that the theory is out of the way, I can pick up where I left off last time. I had pointed my Chrome browser to wikipedia and initiated a TLS session. Chrome and Wikipedia, being very modern and forward thinking, both support the newer elliptic curve ciphersuites that are slowly supplanting the older discrete logarithm DH/RSA/DSA equivalents. I reviewed the client and server hello, and the certificate exchange portion of the handshake last month; this month I'll finish the review and go over the key exchange part.
One of the client and server hello extensions that was included in the initial handshake was the elliptic curve point format, and the elliptic curves themselves. I mentioned them in passing in my last post; I'll dig deeper into their meaning here. The client hello included this extension:
000b 0002 0100 // ec_point_formats
There are three possibles types of point formats: uncompressed, compressed prime, and compressed char2. Here, my browser is advertising that it can only parse the uncompressed format. The next EC-related client hello extension lists the curves supported by the browser:
000a 000a 0008 1a1a 001d 0017 0018 // elliptic curves
This client extension lists four named curves that are supported. The first is a deliberately invalid one (GREASE; see my last post), the last two are specified by RFC 4492 as SECP256R1 and SECP384R1; the second, 0x1d, is actually a non-standard specification for Daniel Bernstein's curve 25519 (https://tools.ietf.org/html/draft-ietf-tls-rfc4492bis-17).
After the client hello finished, the server responds with its hello, including its EC-related extension:
000b 0004 0300 0102 // ec_point_formats
The server, unlike the client (Chrome), can parse all formats of points. Since lowest-common denominator wins in a TLS handshake, this means that EC points must be exchanged uncompressed.
The server is required to present a certificate that contains an ECDH-compliant public key. It does, and the tcpdump capture of just that section of the certificate is shown in figure 1, below.
16:41:51.224463 IP text-lb.ulsfo.wikimedia.org.https > 10.0.0.1.55956: Flags [.],
seq 1:1449, ack 206, win 59, options [nop,nop,TS val 420531442 ecr 570453704], length 1448
...
0x01e0: 30 5930
0x01f0: 1306 072a 8648 ce3d 0201 0608 2a86 48ce
0x0200: 3d03 0107 0342 0004 c922 6931 8ad6 6cea
0x0210: dac3 7f2c aca5 afc0 02ea 81cb 65b9 fd0c
0x0220: 6d46 5bc9 1eed b2ac 2a1b 4aec 807b e71a
0x0230: 51e0 dff7 c74a 207b 914b 2007 21ce cf68
0x0240: 658c c69d 3bef d5c1
...
Figure 1: Wikipedia's ECDH public key
This is the DER-encoded ASN.1 representation of the public key contained in the certificate. I've gone ahead and unrolled it in figure 2 below so that you can easily see which parts are which.
30 59
30 13
06 07
2a 86 48 ce 3d 02 01 OID 1.2.840.10045.2.1 (ECDH Public key)
06 08
2a 86 48 ce 3d 03 01 07 OID 1.2.840.10045.3.1.7 (Named curve SECP 256 R1)
03 42
00 04 c9 22 69 31 8a d6 6c ea da c3 7f 2c ac a5 af c0 02 ea 81 cb 65 b9 fd 0c 6d 46 5b c9
1e ed b2 ac 2a 1b 4a ec 80 7b e7 1a 51 e0 df f7 c7 4a 20 7b 91 4b 20 07 21 ce cf 68 65
8c c6 9d 3b ef d5 c1
Figure 2: Unrolled public key
You can see here that the certificate is signed using an ECDSA public key defined over named-
curve SECP-256r1 (aka P-256 or prime256v1), which is defined in
FIPS 186-4. Recall
that an elliptic curve is defined by its a
and b
parameters: in the
case of SECP-256r1, these are:
a = 115792089210356248762697446949407573530086143415290314195533631308867097853948
and
b = 41058363725152142129326129780047268409114441015993725554835256314039467401291
respectively. Further, the named curve definition also specifies a generator point
G
and prime field p
. These are:
G.x = 48439561293906451759052585252797914202762949526041747995844080717082404635286
G.y = 36134250956749795798585127919587881956611106672985015071877198253568414405109
and
p = 115792089210356248762697446949407573530086143415290314195533631308867097853951
You can begin to see the utility in using named curves for most ECDH key exchanges (and, practically speaking, everybody does).
If you look carefully at p
and a
above, you might notice that they're
very close together - in fact a = p - 3
. The NIST documentation actually just
lists a
as -3
for this reason. Also, the prime number p
isn't a random number - it's actually 2256 - 2224 + 2192 +
296 - 1. This formulation makes it relatively quick to compute results modulo
p
and isn't known to weaken its security, although some researchers have their doubts.
An ECDSA public key consists of both a curve specification as well as a point, which is the remainder of the public key specification. The bytes are:
0004c92269318ad66ceadac37f2caca5afc002ea81cb65b9fd0c6d465bc 91eedb2ac2a1b4aec807be71a51e0dff7c74a207b914b200721cecf68658cc69d3befd5c1.
This follows the X92 specification for encoding elliptic curve points; the leading two
bytes, 0004 indicate that the point is "uncompressed" - that is, the remainder of the message is
the X
and Y
values of the point in hexadecimal form. Note that there's
no indication here where those two values are actually split — it's up to the invoker to
recognize that the curve is a 256-bit (32 byte) curve and split the 64 provided bytes in half,
resulting in the point Q:
Q.x = c92269318ad66ceadac37f2caca5afc002ea81cb65b9fd0c6d465bc91eedb2ac
Q.y = 2a1b4aec807be71a51e0dff7c74a207b914b200721cecf68658cc69d3befd5c1
Since the server selected an "ephemeral" ECDH key exchange, it must also, in addition to providing an ECDA-capable public key, initiate an ECDH negotiation. This is shown in figure 3, below.
0x01e0: 16 0303 0073 0c00 006f 0300
0x01f0: 1d20 cdea 986f 12c4 de29 5c1b a896 f976
0x0200: fb3f e86f 6524 79b1 74f8 7478 c251 8c18
0x0210: 2e58 0503 0047 3045 0221 00cd 6fdb d4d4
0x0220: 5c46 6645 54d3 2ad6 bf3f e5db 2248 2105
0x0230: ab25 71f5 6139 42c2 edf5 5502 203b cec5
0x0240: 6088 a521 035a 5ab0 5e76 eeb0 a24e dd67
0x0250: c453 5951 e2d1 8c46 b15f ec48 31
Figure 3: server key exchange
As always, the message is encapsulated by the TLS record protocol which begins with the 0x16 "handshake message" marker, 0303 indicated the version of TLS (3.3) and the length of the packet encapsulated: 0073, or 115 bytes. This, in turn, encapsulates another encapsulated handshake message of type 0x0c — server key exchange — which is itself 00006f (111) bytes long.
The interpretation of these 111 bytes is dependent on the key exchange method selected by the server from amongst those advertised by the client. If you recall from my last post, the Wikipedia server prefers the cipher suite 0xCCA9: ECDHE/ECDSA/CHACHA20/POLY-1305/SHA-2, which instructs the client to expect an ECDHE key exchange as described above. So these 111 bytes must be the server portion of the elliptic-curve Diffie-Hellman key exchange. This key exchange is documented in RFC 4492 which documents how elliptic curve cryptography should be supported in all versions of TLS after 1.0.
03 named curve
001d Curve22519
20 cdea986f12c4de295c1ba896f976fb3fe86f652479b174f87478c2518c182e58 public key (point)
05 SHA-384
03 ECDSA
0047 30 45
02 21 00 cd6fdbd4d45c46664554d32ad6bf3fe5db22482105ab2571f5613942c2edf555
02 20 3bcec56088a521035a5ab05e76eeb0a24edd67c4535951e2d18c46b15fec4831
Figure 4: Unrolled ECDHE server key exchange
The first byte, 0x03, indicates that the exchange takes place on a named curve. The next two bytes are the selected curve - Wikipedia went ahead and selected the (as-yet-not-quite-a-standard) Curve22519 (0x001d). This is followed by the points themselves; as is the standard in TLS, the points are a length-tagged list of 32 bytes here. This is followed finally by the signature; this must be, per the chosen ciphersuite, an ECDSA signature over the entire preceding message. It's worth noting that the server's public key was defined on curve NIST P-256 whereas the DH key exchange takes place on curve 25519; this is OK — in fact, it's perfectly legitimate to sign an ECDH key exchange message with an RSA signature or even a discrete-logarithm DSA signature!
Now, I can't reconstruct the actual key exchange from this captured packet (if I could, it wouldn't be very valuable), but I can reconstruct the actual signature verification, as I should be able to - I have the signature (figure 4) and the public key (figure 2). The first step in verifying the signature is to compute the full SHA-384 hash of the server key exchange message. However, if TLS just computed a signature over the key exchange itself, there would be a replay-attack vulnerability, so instead, TLS requires that the signature be comprised of the client random (from the client hello), the server random (from the server hello) and finally then the server key exchange parameters. Recall from my previous post that the client random was
ec12dd1764a439fd7e8c8546b84d1ea06eb3d7a051f03cb817470d4c54c5df72
and the server random was
0863b887e9d4f726ddd2ce267b06c2e853cfdf22f9ed5f9f1f223d779394c4d8.
Concatenating both of those with the actual key exchange bytes
03001d20cdea986f12c4de295c1ba896f976fb3fe86f652479b174f87478c2518c182e58
(refer to figure 4) and hashing yields an SHA-384 value of
db6a147a3b3660527e6ff3c2a2ff8703f22bc489e3ddce30debd2a6be34d8d75dedc157a9d61b20a08453cd89d9c375f (33770963132048901302264783418031951708134917191971463922198731009462300548954228193978041877862843625376343471699807 decimal).
However, curve NIST P-256 only calls for 256 bits, so the right-most 128 bits of the hash are discarded, yielding a "message" to be verified of
99243940958878071300191159622499466292940064902506857459236582090833289645429.
The first step in verifying a ECDSA signature (refer back to equation 2) is to compute w = s-1 % n
,
where n is the "multiplicative order" of the point G. This is actually given in the curve
definition and, for NIST curve SECP256R1, this is
115792089210356248762697446949407573529996955224135760342422259061068512044369.
Recall from the discussion of Diffie-Hellman key exchanges that some g values are worse for
some p values: here n is the analogous point at which multiples of G
begin to "wrap around" and repeat themselves.
Notice that the computation of w
doesn't involve any elliptic curve primitives — it's just an
"ordinary" modular inversion, just as in the discrete logarithm DSA algorithm. This works out to
37225760650604523837084352964697552728363781776156407210184368793453409371072;
that is - if you were inclined to multiply
s = 27051790808332663653363352418049438771671943517064925808794401845951570135089
by
w = 37225760650604523837084352964697552728363781776156407210184368793453409371072
and then compute the remainder mod n =
115792089210356248762697446949407573529996955224135760342422259061068512044369,
you'd find that the result was 1. (I'd recommend using a computer to do this rather than verifying it with pencil and paper).
The next two steps, the computation of u1
and u2
mirror the ordinary DSA algorithm as well, although the operations are performed modulo n
rather than q. These work out, in this case, to:
u1 = m * w % n =
99243940958878071300191159622499466292940064902506857459236582090833289645429 *
37225760650604523837084352964697552728363781776156407210184368793453409371072 %
115792089210356248762697446949407573529996955224135760342422259061068512044369 =
87106447593397350023313142143945664073790786496668217711780480248243986088376
u2 = r * w % n =
92921771204082816412700269685987763875016055346087116001910284437334082319701 *
37225760650604523837084352964697552728363781776156407210184368793453409371072 %
115792089210356248762697446949407573529996955224135760342422259061068512044369 =
43586422621027958961411777669632921203197893654670701799164673320493277884031
So far, this is all ordinary (but long) arithmetic. The next, and final, step, finally gets around
to introducing some actual elliptic curve cryptography. Given u1
and
u2
in ordinary DSA, the verifier computes v = (gu1yu2 % p) % q
, where
g is the DSA generator and y is the public key. In ECDSA, the exponentiation
operations are replaced by elliptic-curve multiplication (defined in the funny way ECC defines
multiplication; see equations 3 and 4 above), g is replaced by the curve's "generator point" and y is
replaced by the public key point.
If you've had the intestinal fortitude to follow along to this point, recall that the generator
point of SECP256R1 is given by:
G.x = 48439561293906451759052585252797914202762949526041747995844080717082404635286
G.y = 36134250956749795798585127919587881956611106672985015071877198253568414405109
and the public key provided in the certificate is:
Q.x = 90975681384464114597415250557880658321503740381075882745442172635142402454188
Q.y = 19045361616554224604463094358662843637415973185789974047480701659820650911169
Computing out the whole behemoth u1 * G + u2 * Q
yields the
point whose coordinates are given by:
X = 92921771204082816412700269685987763875016055346087116001910284437334082319701
Y = 88510856270022347658898086211434764712923146284483283458255291275575836722305
Whose X value, if you scroll up and take a peek at figure 4, matches the provided r
value in the
signature, which means that the signature verifies. (Y is irrelevant).
Recall, though, that the Diffie-Hellman handshake isn't complete; the client has received the
server's Y'
point, but it must in turn compute a random X
and multiply
that by the generator. The tcpdump capture of the portion of the SSL handshake that covers the
client key exchange is illustrated in figure 4, below.
16:41:51.225874 IP 10.0.0.1.55956 > text-lb.ulsfo.wikimedia.org.https: Flags [P.],
seq 206:291, ack 4893, win 4096, options [nop,nop,TS val 570453748 ecr 420531442], length 85
...
0x0040: 1603 0300 2510 0000 2120 76f4 4ddf
0x0050: 6faa cfa7 e230 1a05 8aa5 5590 3fcf 6f2b
0x0060: 5b90 1b93 52a4 1fff 1f83 f14e
...
Figure 4: Client Key Exchange message
The relevant portion of which is shown in figure 5:
20 76f44ddf6faacfa7e2301a058aa555903fcf6f2b5b901b9352a41fff1f83f14e
Figure 5: Unrolled client key exchange
Notice that this isn't signed — in fact, it couldn't be, since there's nothing with which to sign it. This is OK - a man in the middle has to be able to fake both sides in order to complete an attack.
The client key exchange consists of a single point - the point that resulted when the client multiplied its (secret) scalar value with the generating point on the named curve. Now the client multiplies its scalar value with the point received by the server, and the server multiplies its scalar value with the value received by the client, and the two will have, by the nature of the ECDH algorithm, have arrived at the same point, which is used as the key for the selected bulk-encryption cipher suite. All subsequent messages are encrypted using this key, and, if all went correctly, will be impossible for an eavesdropper to decipher.
In my next post, I'll go through the actual certificate exchange.
Add a comment:
Completely off-topic or spam comments will be removed at the discretion of the moderator.
You may preserve formatting (e.g. a code sample) by indenting with four spaces preceding the formatted line(s)
You've written practically THE book for the ongoing network security and cryptography students.
Beside this I think a significant number of customer specific TLS 1.2 implementations would be at least more difficult when not impossible without your book.
What do you think, is this not the proper moment to issue an update?
Best regards
Viktor Morstein
Further I have questions regarding the exact use of of the output of the x25519 based ECDHE key change. I understand that the output forms a 32 byte premaster secret that is fed to the PRF function, which produces the 48 byte master secret. Are the bytes for the 32 byte premaster secret reordered to big-endian representation and are leading zeros removed? (RFC8446 (TLS1.3) section 7.4.1 states that Z is converted to a byte stream in big-endian form and left padded with zeros up to the size of the prime (32 bytes, I guess) but it states that this construction differs from previous versions of TLS.)
Firstly, thanks for the fantastic explanation in these series of yours detailing TLS message.. Its helping a lot to improve my understanding.. Can I request you to share the complete dump of all the messages (put together in a pcap file) so we can correlate much faster ?
-Nag