supermicro / ikvm / sslv3 alert bad certificate
Today I was asked to look at a machine that refused iKVM IPMI console access. Access through the “iKVM/HTML5” viewer worked fine, but connecting using “Console Redirection” (the Java client, see also ipmikvm) would quit after 10 failed attempts.
TL;DR: The BMC clock had been reset to a point in time before the start of the validity period (notBefore) of the supplied client certificate. After changing the BMC time from 2015 to 2021, everything worked fine again.
If you're interested, here are some steps I took to debug the situation:
Debugging
My attempts at logging in started like this:
$ ipmikvm 10.1.2.3
...
Retry =1
Retry =2
Retry =3
Okay, no good. Luckily syslog had some info (some lines elided):
Service [ikvm] accepted (FD=3) from 127.0.0.1:40868
connect_blocking: connected 10.1.2.3:5900
Service [ikvm] connected remote server from 10.0.0.2:38222
Remote socket (FD=9) initialized
SNI: sending servername: 10.1.2.3
...
CERT: Locally installed certificate matched
Certificate accepted: depth=0,
/C=US/ST=California/L=San Jose/O=Super Micro Computer
/OU=Software/CN=IPMI/emailAddress=Email
SSL state (connect): SSLv3 read server certificate A
...
SSL state (connect): SSLv3 write client certificate A
...
SSL state (connect): SSLv3 write finished A
SSL state (connect): SSLv3 flush data
SSL alert (read): fatal: bad certificate
SSL_connect: 14094412: error:14094412:
SSL routines:SSL3_READ_BYTES:sslv3 alert bad certificate
Connection reset: 0 byte(s) sent to SSL, 0 byte(s) sent to socket
Remote socket (FD=9) closed
Local socket (FD=3) closed
Service [ikvm] finished (0 left)
The Java iKVM client uses an stunnel sidecar to take care of the TLS bits, hence the extra connection to 127.0.0.1. Right now, that's not important. What is important is the “SSL routines:SSL3_READ_BYTES:sslv3 alert bad certificate” message. Apparently someone disagrees with something in the TLS handshake.
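For the curious, such a client-mode stunnel configuration looks roughly like this. This is a reconstruction based on the log lines above, not the file the Java client actually writes; the accept port and paths are assumptions:
; hypothetical client-mode stunnel config, reconstructed; ports and paths are assumptions
client  = yes
; local port the Java client connects to (assumed)
accept  = 127.0.0.1:63630
; the BMC's iKVM port
connect = 10.1.2.3:5900
cert    = /tmp/<timestamp>/client.crt
key     = /tmp/<timestamp>/client.key
CAfile  = /tmp/<timestamp>/server.crt
; level 3: check the peer against the locally installed server.crt
verify  = 3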
First question: is the client or the server to blame? The log isn't totally clear on that. Let's find out who disconnects.
Rerunning ipmikvm 10.1.2.3 but with sh -x shows us how to invoke the client:
$ sh -x /usr/bin/ipmikvm 10.1.2.3
...
+ dirname /home/walter/.local/lib/ipmikvm/iKVM__V1.69.38.0x0/iKVM__V1.69.38.0x0.jar
+ exec java -Djava.library.path=/home/walter/.local/lib/ipmikvm/iKVM__V1.69.38.0x0 \
-cp /home/walter/.local/lib/ipmikvm/iKVM__V1.69.38.0x0/iKVM__V1.69.38.0x0.jar \
tw.com.aten.ikvm.KVMMain 10.1.2.3 RXBoZW1lcmFsVXNlcg== TGlrZUknZFRlbGxZb3U= \
null 63630 63631 0 0 1 5900 623
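(The two opaque arguments to KVMMain are simply the username and password, base64-encoded and obviously anonymized here:)
$ echo RXBoZW1lcmFsVXNlcg== | base64 -d; echo
EphemeralUser
$ echo TGlrZUknZFRlbGxZb3U= | base64 -d; echo
LikeI'dTellYou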
We can rerun that one and see what it does, by running it through strace and redirecting the syscalls to tmp.log. (The output is large enough not to want it on your console, trust me.)
$ strace -s 8192 -ttf \
java -Djava.library.path=/home/walter/.local/lib/ipmikvm/iKVM__V1.69.38.0x0 \
-cp /home/walter/.local/lib/ipmikvm/iKVM__V1.69.38.0x0/iKVM__V1.69.38.0x0.jar \
tw.com.aten.ikvm.KVMMain 10.1.2.3 RXBoZW1lcmFsVXNlcg== TGlrZUknZFRlbGxZb3U= \
null 63630 63631 0 0 1 5900 623 \
2> tmp.log
connect failed sd:18
Retry =1
^C
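To find our way around the (large) tmp.log, grepping for the syscalls of interest works well enough. For example, to locate the TCP connection attempts (the patterns are just a starting point):
$ grep -n 'connect(' tmp.log | grep 10.1.2.3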
We expect a tcp connect() to 10.1.2.3:
[pid 130553] 08:50:30.418890 connect(
9, {sa_family=AF_INET, sin_port=htons(5900),
sin_addr=inet_addr("10.1.2.3")}, 16)
= -1 EINPROGRESS (Operation now in progress)
Not quite connected yet, but it's apparently non-blocking. Scrolling downward in pid 130553 we see:
[pid 130553] 08:50:30.419037 poll(
[{fd=9, events=POLLIN|POLLOUT}], 1, 10000)
= 1 ([{fd=9, revents=POLLOUT}])
Good, it's connected. Now following the read/write (usually recv/send or a similar syscall, but not this time) on fd 9 shows us:
[pid 130553] 08:50:30.684609 read(
9, "\25\3\3\0\2", 5) = 5
[pid 130553] 08:50:30.684664 read(
9, "\2*", 2) = 2
[pid 130553] 08:50:30.684775 sendto(
6, "<31>Jun 17 08:50:30 stunnel: LOG7[130546:139728830977792]:
SSL alert (read): fatal: bad certificate",
99, MSG_NOSIGNAL, NULL, 0) = 99
...
[pid 130553] 08:50:30.685333 close(9) = 0
So, the client is the one closing the connection, after receiving a 2-byte TLS alert from the server: "\2" (level 2, fatal) and "*" (description 0x2A, 42 decimal).
OpenSSL errors
We can go back in the strace output and look for certificates used:
[pid 130519] 08:50:29.203145 openat(
AT_FDCWD, "/tmp/1623912629200/client.crt",
O_WRONLY|O_CREAT|O_TRUNC, 0666) = 18
[pid 130519] 08:50:29.203497 write(
18, "-----BEGIN CERTIFICATE-----\nMIID7TCCAt...) = 1424
The sidecar setup writes a client.crt, client.key and server.crt.
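Their contents can be fished out of the trace with something along these lines (a rough sketch; it assumes each PEM blob sits in a single write() line with escaped newlines, and the blobs still need to be split into separate files by hand):
$ grep -o -- '-----BEGIN [A-Z ]*-----.*-----END [A-Z ]*-----' tmp.log |
    sed -e 's/\\n/\n/g' > pem-blobs.txt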
With the contents of those files extracted from the strace output and written to local files, we can use openssl s_client to connect directly and get additional debug info:
$ openssl s_client -connect 10.1.2.3:5900 -servername 10.1.2.3 \
-showcerts -debug
...
read from 0x55bbada227a0 [0x55bbada2a708]
(2 bytes => 2 (0x2))
0000 - 02 28 .(
140555689141568:error:14094410:SSL routines:ssl3_read_bytes:
sslv3 alert handshake failure:../ssl/record/rec_layer_s3.c:1543:
SSL alert number 40
So, not supplying a client certificate gets us an error 40 (0x28), followed by a disconnect from the server (read returns -1). This is fine: error 40 (handshake_failure) means that one or more security parameters were bad, in this case because we didn't supply the client certificate.
What happens if we send a self-generated client certificate?
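(A throwaway certificate for this experiment can be generated with something like the following; the subject is arbitrary:)
$ openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj '/CN=not-supermicro' \
    -keyout CUSTOMCERT.key -out CUSTOMCERT.crt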
$ openssl s_client -connect 10.1.2.3:5900 -servername 10.1.2.3 \
-showcerts -debug -cert CUSTOMCERT.crt -key CUSTOMCERT.key
...
read from 0x5604603d7750 [0x5604603dcd68]
(2 bytes => 2 (0x2))
0000 - 02 30 .0
139773856281920:error:14094418:SSL routines:ssl3_read_bytes:
tlsv1 alert unknown ca:../ssl/record/rec_layer_s3.c:1543:
SSL alert number 48
Error 48 (unknown_ca). That makes sense, as the server does not know the CA of our custom generated certificate.
Lastly, with the correct certificate, we get error 42 (0x2A):
$ openssl s_client -connect 10.1.2.3:5900 -servername 10.1.2.3 \
-showcerts -debug -cert client.crt -key client.key
...
read from 0x556b27ca7cd0 [0x556b27cad2e8]
(2 bytes => 2 (0x2))
0000 - 02 2a .*
140701791647040:error:14094412:SSL routines:ssl3_read_bytes:
sslv3 alert bad certificate:../ssl/record/rec_layer_s3.c:1543:
SSL alert number 42
Error 42 is bad_certificate, with this description from RFC 5246 (7.2.2):
bad_certificate A certificate was corrupt, contained signatures that did not verify correctly, etc.
We're now quite certain it's our client certificate that is being rejected. But we're no closer to the reason why. If we openssl-verify client.crt locally, it appears to be just fine.
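(That local check was something along these lines; treating server.crt as the -CAfile is an assumption on my part, but the verification raised no complaints:)
$ openssl verify -CAfile server.crt client.crt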
Upgrading and inspecting the BMC firmware
This particular motherboard (X10SRD-F) already had the latest firmware according to the SuperMicro BIOS IPMI downloads:
REDFISH_X10_388_20200221_unsigned.zip
As a last-ditch effort, we checked whether we could upgrade to a newer version. After all, the changelog for version 3.90 (and similarly for 3.89) said:
7. Corrected issues with KVM console.
Ignoring the fact that version 3.89 was not listed for our hardware, we went ahead and upgraded to 3.89. That went smoothly, but the problem persisted.
Upgrade to 3.90 then? Or maybe there is something else we're overlooking. Let's see if we can dissect the firmware:
$ unzip ../REDFISH_X10_390_20200717_unsigned.zip
Archive: ../REDFISH_X10_390_20200717_unsigned.zip
inflating: REDFISH_X10_390_20200717_unsigned.bin
...
$ binwalk REDFISH_X10_390_20200717_unsigned.bin
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
103360 0x193C0 CRC32 polynomial table, little endian
4194304 0x400000 CramFS filesystem, little endian, size: 15253504, version 2, sorted_dirs, CRC 0x4148A5CC, edition 0, 8631 blocks, 1100 files
20971520 0x1400000 uImage header, header size: 64 bytes, header CRC: 0xC3B2AF42, created: 2020-07-17 09:02:52, image size: 1537698 bytes, Data Address: 0x40008000, Entry Point: 0x40008000, data CRC: 0x4ACB7660, OS: Linux, CPU: ARM, image type: OS Kernel Image, compression type: gzip, image name: "21400000"
20971584 0x1400040 gzip compressed data, maximum compression, has original file name: "linux.bin", from Unix, last modified: 2020-07-17 08:56:49
24117248 0x1700000 CramFS filesystem, little endian, size: 7446528, version 2, sorted_dirs, CRC 0x1D3A953F, edition 0, 3089 blocks, 456 files
4194304 and 24117248 are both multiples of 4096 (0x1000), which is obvious from the zeroes in the hexadecimal column; using 0x1000 as the block size speeds up the dd step a bit:
$ dd if=REDFISH_X10_390_20200717_unsigned.bin \
bs=$(( 0x1000 )) skip=$(( 0x400000 / 0x1000 )) \
count=$(( (0x1400000 - 0x400000) / 0x1000 )) \
of=cramfs1.bin
$ dd if=REDFISH_X10_390_20200717_unsigned.bin \
bs=$(( 0x1000 )) skip=$(( 0x1700000 / 0x1000 )) \
of=cramfs2.bin
$ du -b cramfs*
16777216 cramfs1.bin
9437184 cramfs2.bin
We can mount these and inspect their contents:
$ sudo mkdir /mnt/cramfs{1,2}
$ sudo mount -t cramfs ./cramfs1.bin /mnt/cramfs1
$ sudo mount -t cramfs ./cramfs2.bin /mnt/cramfs2
cramfs1.bin contains a Linux filesystem with an stunnel configuration:
$ ls /mnt/cramfs1/
bin linuxrc proc SMASH var
dev lost+found run sys web
etc mnt sbin tmp
lib nv share usr
$ ls /mnt/cramfs1/etc/stunnel/
client.crt server.key
server.crt stunnel.conf
This also looks sane. The server.key matches the server.crt we already had, and the client.crt also matches what we had. Any and all validation on these just succeeds.
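(One way to confirm that a key and certificate belong together is comparing their moduli; this assumes RSA keys, which these appear to be:)
$ openssl x509 -noout -modulus -in /mnt/cramfs1/etc/stunnel/server.crt | openssl md5
$ openssl rsa  -noout -modulus -in /mnt/cramfs1/etc/stunnel/server.key | openssl md5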
cramfs2.bin then?
$ ls /mnt/cramfs2/
cgi
cgi-bin
css
extract.in
iKVM__V1.69.38.0x0.jar.pack.gz
...
This looks like the webserver contents on https://10.1.2.3. (Interestingly, the iKVM__V1.69.38.0x0.jar.pack.gz file differs between 3.88, 3.89 and 3.90, but that turned out to be of no significance.)
Peering into the jar yielded no additional clues unfortunately:
$ unpack200 /mnt/cramfs2/iKVM__V1.69.38.0x0.jar.pack.gz iKVM__V1.69.38.0x0.jar
$ unzip iKVM__V1.69.38.0x0.jar
Archive: iKVM__V1.69.38.0x0.jar
PACK200
inflating: META-INF/MANIFEST.MF
...
$ ls res/*.crt res/*.key
res/client.crt
res/client.key
res/server.crt
Same certificates. Everything matches.
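(A quick way to convince yourself of that is to compare checksums across the copies we have by now: the one extracted from the strace output, the one in the cramfs1 filesystem and the one from the jar:)
$ md5sum client.crt /mnt/cramfs1/etc/stunnel/client.crt res/client.crt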
Date and Time
At this point I'm about ready to give up. I had tried the Syslog option in the BMC, which had given me zero output thus far. I had tried replacing the webserver certificate. Upgrading the BMC...
Out of ideas, I mindlessly clicked around in the web interface, which landed me at Configuration -> Date and Time. Apparently the local date was set to somewhere in the year 2015.
We might as well fix that and try one last time.
Yes! After fixing the date, connecting suddenly worked.
Immediately all pieces fit together:
$ openssl x509 -in client.crt -noout -text |
grep Validity -A3
Validity
Not Before: May 19 09:46:36 2016 GMT
Not After : May 17 09:46:36 2026 GMT
Subject: C = US, ST = California, L = San Jose,
O = Super Micro Computer, OU = Software,
CN = IPMI, emailAddress = Email
Crap. The server had been looking at a “not yet valid” certificate the whole time. The certificate would be valid between 2016 and 2026, but the server was still living in the year 2015.
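(In hindsight, openssl verify could even have told us this, had we thought of pretending it was 2015: the -attime flag takes an epoch timestamp. The same -CAfile assumption as before applies:)
$ openssl verify -attime $(date -d 2015-06-01 +%s) \
    -CAfile server.crt client.crt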
I wonder why the stunnel/OpenSSL on the BMC did not send error 45 (certificate_expired) instead. After all, the RFC describes that alert as "a certificate has expired or is not currently valid" (my emphasis). That would've pointed us to the cause immediately.
This problem was one giant time sink. But we did learn a few things about the structure of the BMC firmware. And, also important: after the 17th of May in the year 2026, the iKVM connections will stop working unless we upgrade the firmware or fiddle with the time.
Maybe set a reminder for that event, in case these servers are still around by then...