Pwn MBedTLS on ESP32: DFA Warm-up

Introduction

ESP32 is a System-on-Chip (SoC) from Espressif Systems, launched in 2016. This SoC will be supported until 2028 (12-year longevity commitment), and more than 100 million units have already shipped around the world.

ARM MbedTLS is the open-source crypto library from ARM, used in IoT devices.

In my opinion, both are quite valuable targets to pwn.

The Targets

ARM MbedTLS v2.13.1

MbedTLS version 2.13.1 is an open-source crypto library developed by ARM. It was designed to fit into embedded devices, and it works on most operating systems and architectures without any trouble.

More info on the ARM website here.

Finding vulnerabilities in this crypto library allows an attacker to exploit them on every platform where the library is deployed.

In this work, I will focus on its implementation of AES-128 (FIPS 197) as a proof of concept.

ESP32 platform

ESP32 is a SoC designed by Espressif Systems and manufactured by TSMC (40 nm technology node). The datasheet is available here, and the Reference Manual here.

ESP32

It generally comes in its dedicated module, the ESP-WROOM-32 (more info here).

ESP-WROOM-32 module

ESP32 dev-kits are available everywhere (Amazon, eBay, AliExpress…). This one should be good: the LOLIN32, here for 10 euros.

LOLIN32 development-kit.

The LOLIN32 schematics are available here.

Software preparation

ESP-IDF

The development framework ESP-IDF can be found here. It also provides a lot of examples, a complete Xtensa toolchain and a good programming guide.

My way to run MbedTLS (dirty)

To run an AES encryption (ECB mode) using the crypto library MbedTLS on ESP32, I simply copy aes.c from the MbedTLS folder here. I include some UART functions from here and I write the code in a main function.

Below are the calls needed to encrypt a data block:

void app_main(){
    ...
    mbedtls_aes_context aes;
    // Hardcoded 128-bit key (16 bytes of 0x61)
    char key[]= "\x61\x61\x61\x61\x61\x61\x61\x61\x61\x61\x61\x61\x61\x61\x61\x61";
    unsigned char input[16];
    unsigned char output[16];
    // GPIO26 is used as a trigger for the scope
    gpio_pad_select_gpio(GPIO_NUM_26);
    gpio_set_direction(GPIO_NUM_26, GPIO_MODE_OUTPUT);
    ...
    mbedtls_aes_init(&aes);
    // strlen() gives 16 here only because the key contains no zero byte
    mbedtls_aes_setkey_enc( &aes, (const unsigned char*) key, strlen(key) * 8 );
    gpio_set_level(GPIO_NUM_26, 1); // GPIO high: AES starts
    mbedtls_aes_crypt_ecb(&aes, MBEDTLS_AES_ENCRYPT, (const unsigned char*)input, output);
    gpio_set_level(GPIO_NUM_26, 0); // GPIO low: AES done
    mbedtls_aes_free(&aes);
    ...
}

That’s all: a quite simple piece of code with a hardcoded key.

I disable the HW crypto acceleration using make menuconfig (or just by editing the line CONFIG_MBEDTLS_HARDWARE_AES= in the sdkconfig file). For this first investigation, I want to be sure that only the CPU processes the crypto computation (and not the ESP32 HW accelerator).

Teaser: Don’t worry, the crypto-core will be pwned in the next post.

Then I compile and flash the board using the make flash command.

In the code above, I also use GPIO26 to mark the AES encryption in the main function. The encryption takes 440us, according to the scope screenshot:

mbedtls_aes_crypt_ecb() defined by GPIO26 (CH4). 100us/div.

This duration only holds the first time mbedtls_aes_crypt_ecb is called.

The funny thing here: if the function is called again (whatever the input data), the AES encryption executes significantly faster (approximately 50us). This is due to the AES S-box initialisation and CPU cache optimisation.

CPU Frequency

The ESP32 CPU frequency is 160MHz by default. I do not modify this value in the config file.
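To put the timings above in perspective, here is a quick back-of-the-envelope conversion from durations to CPU cycles at 160MHz. The two AES durations are the scope measurements quoted above; the 100 ns value is only an order of magnitude for a short glitch, and the helper name is illustrative.

```python
CPU_HZ = 160_000_000  # default ESP32 CPU clock, as stated in this post

def cycles(duration_s, cpu_hz=CPU_HZ):
    """Convert a duration in seconds into a whole number of CPU cycles."""
    return round(duration_s * cpu_hz)

print(cycles(440e-6))  # first AES call (~440 us)  -> 70400
print(cycles(50e-6))   # later AES calls (~50 us)  -> 8000
print(cycles(100e-9))  # a 100 ns perturbation     -> 16
```

In other words, a glitch that lasts about a hundred nanoseconds only covers a handful of instructions, which is why the delay has to be swept in small steps.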

Hardware preparation

Voltage glitching intro

Voltage glitching, also called voltage fault injection, is a well-known technique to hack embedded devices.

The goal is to influence the target power supply during the desired SW/HW operations to inject a ‘fault’ into the process and modify the code or data executed by the CPU.

Generally, the under/over-voltage perturbations on VDD have to be very short (in the order of a hundred nanoseconds). Otherwise, the chip will reset due to CPU interrupts, memory crashes or even built-in sensors sending shutdown signals.

Datasheet reverse (RTFO)

Voltage glitching begins by digging into the datasheet and the TRM. They are full of useful information.

Power domains. ESP32 is a SoC with three different power domains:

SoC power domains (ESP32 datasheet)

It is important to choose the right one. Here, the ESP32 design is a little bit ‘exotic’ because the CPU shares two input power supplies (VDD3P3_RTC and VDD3P3_CPU).

Low Drop-Out regulators (LDOs) are present on each power domain. With the current level of information, it is impossible to say whether they will ‘filter’ or protect against the voltage glitches. Let’s see.

Only one Brownout Detector (BOD) checks the voltage of pin VDD3P3_RTC. According to chapter 30.3.5 of the TRM, if the BOD detects a voltage drop, it will trigger a shutdown signal and even send a ‘message’ on UART0. It looks like this (I did the test of course):

Brownout Detector was triggered 

ets Jun  8 2016 00:22:57
...

The ESP32 resets when the BOD is triggered. It is clear now. For this reason, I finally decide to focus on VDD3P3_CPU and to work only on this power line.

PCB Reverse

Perhaps reversing a PCB with open source schematics cannot be called PCB ‘reverse’…;)

Here is the schematic of the WROOM-32 module:

ESP-WROOM32 schematic

The first thing to do is to remove the shield on top of the module.

The components (resistors, capacitors, ICs…) are not silkscreened, but no big deal. The capacitors between VDD and GND are removed one by one, except the one on the VDDSDIO line.

Let’s modify the PCB now.

I am particularly interested in VDD3P3_CPU, because this VDD line feeds the CPU. I start by scratching a small area using a scalpel:

Vdd line metal is now exposed

Then, I cut this line carefully:

No CTRL-Z here

Finally, a magnet wire (AWG36) is soldered to VDD3P3_CPU (chip side), and a second wire (the black one) to GND.

It should be good…

These two wires are first connected to a lab power supply providing 3.3V, to confirm this little “surgery” did not affect the normal behaviour.

Of course, the USB cable has to be plugged in too (to supply the other power lines VDD3P3_RTC, VDD_A and VDD_SDIO).

The ESP32 boots normally. It’s alive:

ets Jun  8 2016 00:22:57
rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT) 
configsip: 0, SPIWP:0xee clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00 
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:592
ho 0 tail 12 room 4 
load:0x40078000,len:7912 
ho 0 tail 12 room 4 
load:0x40080400,len:5620 
entry 0x40080670

My low-cost glitcher

It is now time to connect the VDD3P3_CPU signal to a glitch board, aka glitcher.

Of course, some commercial solutions exist but I cannot use this kind of equipment. Too expensive, too complex… Do not forget I am Limited.

So, I built a board based on a MAX4619 to simply switch from the normal voltage reference to GND, similar for example to this presentation from @akacastor here.

The glitcher is then supplied by a laboratory power supply. It can produce under-voltage or over-voltage perturbations. The amplitude depends on the two input voltages, which are set beforehand.

The glitch parameters, Delay and Width, are set by a Rigol DG1022Z signal generator.

A Rigol DS1054 scope is used to synchronise the glitcher with the ESP32 activity.

Looking at the effect, it works as intended:

Glitch effect on VDD3P3_CPU (yellow). 200ns/div.

The final setup

Python scripting

All the equipment (scope, power supply and glitcher) and the serial link between the PC and the ESP32 are managed via a Python script.

The script performs the following steps:

  • set the glitch delay and width (via the pulse generator)
  • arm the scope
  • send the plaintext data
  • receive the ciphertext data (and analyse it)
  • reset the ESP32 in case of crash
  • store log files and data for crypto analysis
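The steps above can be sketched as a simple loop. This is not the script used for this post: the bench is passed in as plain callables (set_glitch, arm_scope, encrypt, reset_target are hypothetical names), so the control flow can be read and exercised without any instrument attached. Here the demo replaces them with canned replies.

```python
def run_campaign(settings, set_glitch, arm_scope, encrypt, reset_target, expected):
    """Try one glitch per (delay, width) setting and record what came back.

    All callables are hypothetical stand-ins for the real bench drivers:
      set_glitch(delay_s, width_s) programs the pulse generator,
      arm_scope() arms the trigger,
      encrypt() sends the plaintext and returns the ciphertext (None on timeout),
      reset_target() power-cycles or resets the ESP32.
    """
    log = []
    for delay_s, width_s in settings:
        set_glitch(delay_s, width_s)
        arm_scope()
        cipher = encrypt()
        if cipher is None:          # no answer: the target crashed or hung
            status = "crash"
            reset_target()
        elif cipher == expected:    # glitch had no visible effect
            status = "ok"
        else:                       # wrong ciphertext: candidate for DFA
            status = "fault"
        log.append({"delay_s": delay_s, "width_s": width_s,
                    "cipher": cipher, "status": status})
    return log


# Demo with a simulated bench: three attempts, canned replies instead of a UART.
good = bytes.fromhex("c7fa6283f707ec9e55b6dd900bdb0bc1")
bad = bytes.fromhex("c7176283e407ec9e55b6dd990bdb32c1")
replies = iter([good, None, bad])
demo_log = run_campaign(
    settings=[(347.45e-6, 100e-9), (347.50e-6, 100e-9), (347.55e-6, 100e-9)],
    set_glitch=lambda delay_s, width_s: None,
    arm_scope=lambda: None,
    encrypt=lambda: next(replies),
    reset_target=lambda: None,
    expected=good,
)
for entry in demo_log:
    print(entry["delay_s"], entry["status"])
```

Keeping the instrument access behind callables also makes it easy to swap the pulse generator or the scope without touching the loop.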

In the end, the setup is fully automated and can run for days.

Here is a diagram to understand the complete setup:

Setup diagram

For real:

Let’s pwn this crypto-lib…

It is time to obtain results, I would say.

Note: I should create a repo on my GitHub with all the files and resources.

Fault Sessions

Several ‘campaigns’ are launched to adjust the glitch parameters.
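A campaign can be pictured as a sweep over the delay, with the width held fixed or stepped slowly. Below is a minimal sketch, assuming integer nanoseconds to avoid floating-point drift. The 50 ns step and the 347.45 to 347.55 us window match the pulse delays visible in the session logs of this post; the 100 ns width is a placeholder, not the value actually used.

```python
def delay_sweep(start_ns, stop_ns, step_ns, widths_ns):
    """Yield (delay_ns, width_ns) pairs covering [start_ns, stop_ns] inclusive."""
    for width in widths_ns:
        for delay in range(start_ns, stop_ns + 1, step_ns):
            yield delay, width

points = list(delay_sweep(347_450, 347_550, 50, [100]))
print(points)  # [(347450, 100), (347500, 100), (347550, 100)]
```

Dividing by 1e9 just before talking to the pulse generator keeps the sweep itself exact.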

A normal AES encryption looks like this from the shell:

----- Cipher 2149 ----- Pulse delay = 0.000347500
key   : 61616161616161616161616161616161  
plain : 30303030303030303030303030303030  
cipher: c7fa6283f707ec9e55b6dd900bdb0bc1  
verif AES OK 
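Since every attempt is logged in this fixed layout, the log can be turned back into records for later analysis. The parser below is a small sketch written against the layout shown in this post only; the function and field names are illustrative.

```python
import re

HEADER = re.compile(r"-+\s*Cipher\s+(\d+)\s*-+\s*Pulse delay\s*=\s*([0-9.]+)")
FIELD = re.compile(r"^\s*(key|plain|cipher)\s*:\s*([0-9a-fA-F]{32})\s*$")

def parse_log(text):
    """Return one dict per '----- Cipher N -----' entry found in the text."""
    records, current = [], None
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            current = {"index": int(header.group(1)),
                       "delay_s": float(header.group(2))}
            records.append(current)
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = bytes.fromhex(field.group(2))
    return records

sample = """----- Cipher 2149 ----- Pulse delay = 0.000347500
key   : 61616161616161616161616161616161
plain : 30303030303030303030303030303030
cipher: c7fa6283f707ec9e55b6dd900bdb0bc1
verif AES OK
"""
for record in parse_log(sample):
    print(record["index"], record["delay_s"], record["cipher"].hex())
```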

Sometimes, the ESP32 reacts to voltage glitching by crashing and producing error dumps like:

Interrupt wdt timeout on CPU0Core0 register dump:
 : 0x4012PS      06f4  : 0x8023A1      fb00  A2      a4ac  : 0x3428A4      fb19  : 0x3428
 : 0x3428A7      f030  : 0x3f4bA9      f050  A10     000c  : 0x9b4aA12     f0c0  : 0x3428
 : 0x25f1A15     d852  : 0x0000EXCCAUSE0005  EXCVADDR0000  : 0x401fLEND    005d  : 0xffff
  0x4012fb000d4d:0x3f44 0x4003fb10
 PC      0d30  : 0x0003A0      0898  : 0x3f56
 : 0x0002A3      0001  : 0x0002A5      fbe8  A6      fb10  : 0x3f1eA8      fbe4  : 0x3f54
 : 0x0000A11     0623  : 0x0002A13     0001  A14     00bb  : 0x0000SAR     0000  : 0x0000
 : 0x0000LBEG    0000  : 0x0000LCOUNT  0000  
 Backtrace:0d30:0x3f56 0x4049fb00
 Rebooting…

By the way, as a black-box hacker, these dumps are very good hints when you perform fault injection. They mean the glitch is affecting the CPU very ‘badly’.

And finally, it is just a matter of time before some faulted ciphertexts show up:

...
----- Cipher 2148 ----- Pulse delay = 0.000347450 
 key   : 61616161616161616161616161616161
 plain : 30303030303030303030303030303030
 cipher: adcc95b4fd48a813f218caaf5251c6b8
!!!! AES core Pwned !!!!
----- Cipher 2150 ----- Pulse delay = 0.000347550 
 key   : 61616161616161616161616161616161
 plain : 30303030303030303030303030303030
 cipher: d4e88b9f84182540e2be887f124ac836
 ----- Cipher 2159 ----- Pulse delay = 0.000348000 
 key   : 61616161616161616161616161616161
 plain : 30303030303030303030303030303030
 cipher: c69b5e68494ed72c5f64df5a424f3712
!!!! AES core Pwned !!!!
 ----- Cipher 2214 ----- Pulse delay = 0.000350750 
 key   : 61616161616161616161616161616161
 plain : 30303030303030303030303030303030
 cipher: c7176283e407ec9e55b6dd990bdb32c1
!!!! AES core Pwned !!!!
 ----- Cipher 2288 ----- Pulse delay = 0.000354450 
 key   : 61616161616161616161616161616161
 plain : 30303030303030303030303030303030
 cipher: 90fa6283f707ec2e55b6da900bf60bc1
!!!! AES core Pwned !!!!
----- Cipher 2294 ----- Pulse delay = 0.000354750 
 key   : 61616161616161616161616161616161
 plain : 30303030303030303030303030303030
 cipher: c7976283ec07ec9e55b6dd030bdbeec1
!!!! AES core Pwned !!!! 
...

All faulted outputs are sorted and stored into different result files by the script, in order to be analysed by the DFA tools.

Typically here, the last three faulted outputs were faulted at Round 9: each one differs from the correct ciphertext in exactly four bytes. These faults can be used to apply DFA (but maybe I can find more valuable results).

And indeed, the first three faulted outputs were faulted before Round 9, probably at Round 8 (according to the close timing and the full error diffusion). These outputs are preferred for DFA, because only two faulted outputs are needed to recover the key with this fault model.
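This sorting can be sketched by looking at which ciphertext bytes differ from the correct output. A fault confined to one state column before the last MixColumns ends up, after the final ShiftRows, on one of four fixed sets of four byte positions; a fault one round earlier spreads over all 16 bytes. The classifier below is a minimal illustration of that idea, not the script used here, and the labels are arbitrary. A 16-byte difference is only a hint: it does not prove the fault happened at Round 8 rather than earlier.

```python
def diff_positions(reference, faulted):
    """Indices of the bytes that differ between two 16-byte blocks."""
    return frozenset(i for i, (a, b) in enumerate(zip(reference, faulted)) if a != b)

# Byte index = 4*column + row. After the last ShiftRows, the byte of row r
# coming from column c lands in column (c - r) mod 4.
ROUND9_PATTERNS = [frozenset(4 * ((c - r) % 4) + r for r in range(4)) for c in range(4)]

def classify(reference, faulted):
    diff = diff_positions(reference, faulted)
    if not diff:
        return "correct"
    if diff in ROUND9_PATTERNS:
        return "round9"      # four bytes on one diagonal pattern
    if len(diff) == 16:
        return "full"        # full diffusion: Round 8 (or earlier) candidate
    return "other"

correct = bytes.fromhex("c7fa6283f707ec9e55b6dd900bdb0bc1")
outputs = [
    "adcc95b4fd48a813f218caaf5251c6b8",
    "d4e88b9f84182540e2be887f124ac836",
    "c69b5e68494ed72c5f64df5a424f3712",
    "c7176283e407ec9e55b6dd990bdb32c1",
    "90fa6283f707ec2e55b6da900bf60bc1",
    "c7976283ec07ec9e55b6dd030bdbeec1",
]
for hexstr in outputs:
    print(hexstr, classify(correct, bytes.fromhex(hexstr)))
```

On the six faulted outputs listed above, the first three come out as "full" and the last three as "round9".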

DFA results

Theory

Here, I will not explain the DFA principle. It is a well-known attack in the infosec community. The best is to refer to academic papers or previous articles, such as those listed below:

On the Importance of Checking Cryptographic Protocols for Faults, D. Boneh, R. A. DeMillo and R. J. Lipton, 1997

DFA on AES, P. Dusart, G. Letourneux, O. Vivolo, 2002

A Differential Fault Attack Technique against SPN Structures, with Application to the AES and KHAZAD, G. Piret and J.-J. Quisquater, 2003

Attacking White-Box implementations using DFA, Quarkslab, 2016

Standard implementations of AES, DES, RSA, ECC… are DFA-vulnerable unless they include fault countermeasures. Powerful technique, indeed.

DFA in practice

A nice and useful project is doegox’s, called Side-Channel Marvels.

The Python lib phoenixAES (in the JeanGrey repo) is used to perform DFA. In case of a successful attack, it displays the last round sub-key K10:

$ python3 test-dfa-esp.py 
Last round key #N found:
DE95406B42B516C29392CBC111E47369

with the content of test-dfa-esp.py:

#!/usr/bin/env python3
import phoenixAES

# Write the correct output first, then the Round 8 faulted outputs
with open("r8faults", "w") as f:
    f.write("c7fa6283f707ec9e55b6dd900bdb0bc1\n") #correct
    f.write("adcc95b4fd48a813f218caaf5251c6b8\n") #faulted 1
    f.write("d4e88b9f84182540e2be887f124ac836\n") #faulted 2

# Convert the Round 8 faults into Round 9 faults, then run the DFA
phoenixAES.convert_r8faults_file("r8faults", "r9faults")
phoenixAES.crack_file("r9faults")

The AES key schedule is fully invertible. It means that once you have one of the sub-keys (for AES-128), you can retrieve the user secret key. The aes_keyschedule tool is perfect for that:

$ ./Stark/aes_keyschedule DE95406B42B516C29392CBC111E47369 10
K00: 61616161616161616161616161616161
K01: 8F8E8E8EEEEFEFEF8F8E8E8EEEEFEFEF
K02: 525151A6BCBEBE49333030C7DDDFDF28
K03: C8CF65677471DB2E4741EBE99A9E34C1
K04: CBD71DDFBFA6C6F1F8E72D18627919D9
K05: 6D032875D2A5EE842A42C39C483BDA45
K06: AF5446277DF1A8A357B36B3F1F88B17A
K07: 2B9C9CE7566D344401DE5F7B1E56EE01
K08: 1AB4E0954CD9D4D14D078BAA535165AB
K09: D0F982789C2056A9D127DD038276B8A8
K10: DE95406B42B516C29392CBC111E47369
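For readers who do not have the Stark tool at hand, the same inversion can be sketched in a few lines of Python. This is not the aes_keyschedule source, just a minimal re-implementation of the AES-128 key schedule (FIPS 197) run backwards, with illustrative function names. Each word of the previous round key is the XOR of two words that are already known, so the schedule unrolls from K10 down to K00.

```python
def build_sbox():
    """Build the AES S-box: inverse in GF(2^8), then the affine transform."""
    exp, log = [0] * 255, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x ^= (x << 1) ^ (0x11B if x & 0x80 else 0)   # multiply by generator 3
        x &= 0xFF
    sbox = []
    for value in range(256):
        inv = 0 if value == 0 else exp[(255 - log[value]) % 255]
        out = inv
        for shift in range(1, 5):
            out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(out ^ 0x63)
    return sbox

SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def invert_aes128_schedule(last_round_key):
    """Return [K00, ..., K10] given the 16-byte round key K10."""
    w = [None] * 44
    for j in range(4):
        w[40 + j] = last_round_key[4 * j:4 * j + 4]
    for i in range(43, 3, -1):
        temp = w[i - 1]
        if i % 4 == 0:
            temp = bytes(SBOX[b] for b in temp[1:] + temp[:1])   # RotWord + SubWord
            temp = bytes([temp[0] ^ RCON[i // 4 - 1]]) + temp[1:]
        w[i - 4] = xor(w[i], temp)        # forward rule: w[i] = w[i-4] ^ temp
    return [b"".join(w[4 * r:4 * r + 4]) for r in range(11)]

round_keys = invert_aes128_schedule(bytes.fromhex("DE95406B42B516C29392CBC111E47369"))
for number, key in enumerate(round_keys):
    print("K%02d: %s" % (number, key.hex().upper()))
```

Fed with the K10 above, the first line printed is the user key K00.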

I verify with a Python script that prints the AES intermediate values:

$ python myaes.py
round[ 0].key  : 61616161616161616161616161616161
round[ 0].in   : 30303030303030303030303030303030
round[ 1].k_sch: 8f8e8e8eeeefefef8f8e8e8eeeefefef
round[ 2].k_sch: 525151a6bcbebe49333030c7dddfdf28
round[ 3].k_sch: c8cf65677471db2e4741ebe99a9e34c1
round[ 4].k_sch: cbd71ddfbfa6c6f1f8e72d18627919d9
round[ 5].k_sch: 6d032875d2a5ee842a42c39c483bda45
round[ 6].k_sch: af5446277df1a8a357b36b3f1f88b17a
round[ 7].k_sch: 2b9c9ce7566d344401de5f7b1e56ee01
round[ 8].k_sch: 1ab4e0954cd9d4d14d078baa535165ab
round[ 9].k_sch: d0f982789c2056a9d127dd038276b8a8
round[10].k_sch: de95406b42b516c29392cbc111e47369
output         : c7fa6283f707ec9e55b6dd900bdb0bc1

This confirms the key is recovered with only two faulted cipher outputs.
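The same check can be reproduced without the board: encrypt the known plaintext with the recovered key and compare against the correct ciphertext. The block below is a compact, unoptimised AES-128 written for readability only. It is not myaes.py and not the MbedTLS code, and its function names are illustrative.

```python
def build_sbox():
    """Build the AES S-box: inverse in GF(2^8), then the affine transform."""
    exp, log = [0] * 255, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x ^= (x << 1) ^ (0x11B if x & 0x80 else 0)   # multiply by generator 3
        x &= 0xFF
    sbox = []
    for value in range(256):
        inv = 0 if value == 0 else exp[(255 - log[value]) % 255]
        out = inv
        for shift in range(1, 5):
            out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(out ^ 0x63)
    return sbox

SBOX = build_sbox()

def xtime(a):
    """Multiply by 2 in GF(2^8)."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a

def expand_key(key):
    """AES-128 key expansion: returns the 11 round keys."""
    w = [key[4 * i:4 * i + 4] for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        temp = w[i - 1]
        if i % 4 == 0:
            temp = bytes(SBOX[b] for b in temp[1:] + temp[:1])
            temp = bytes([temp[0] ^ rcon]) + temp[1:]
            rcon = xtime(rcon)
        w.append(bytes(a ^ b for a, b in zip(w[i - 4], temp)))
    return [b"".join(w[4 * r:4 * r + 4]) for r in range(11)]

def mix_column(col):
    a0, a1, a2, a3 = col
    t = a0 ^ a1 ^ a2 ^ a3
    return [a0 ^ t ^ xtime(a0 ^ a1), a1 ^ t ^ xtime(a1 ^ a2),
            a2 ^ t ^ xtime(a2 ^ a3), a3 ^ t ^ xtime(a3 ^ a0)]

def encrypt_block(key, block):
    """Encrypt one 16-byte block; the state is indexed as 4*column + row."""
    keys = expand_key(key)
    state = [a ^ b for a, b in zip(block, keys[0])]
    for rnd in range(1, 11):
        state = [SBOX[b] for b in state]                                   # SubBytes
        state = [state[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]  # ShiftRows
        if rnd != 10:                                                      # no MixColumns in the last round
            state = [b for c in range(4) for b in mix_column(state[4 * c:4 * c + 4])]
        state = [a ^ b for a, b in zip(state, keys[rnd])]                  # AddRoundKey
    return bytes(state)

recovered_key = bytes.fromhex("61616161616161616161616161616161")
plaintext = bytes.fromhex("30303030303030303030303030303030")
expected = bytes.fromhex("c7fa6283f707ec9e55b6dd900bdb0bc1")
print(encrypt_block(recovered_key, plaintext).hex())
print("key confirmed:", encrypt_block(recovered_key, plaintext) == expected)
```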

Side-Channel Marvels in the field!

Quite magic 🙂

Conclusion

In this post, a complete voltage glitching attack has been presented on ESP32 running the MbedTLS AES encryption.

Consequently, the AES secret key can be retrieved with only 2 faulted cipher outputs, using Differential Fault Analysis.

This attack is more than 17 years old but still very effective. The lib implementation is completely unprotected, and so is the platform.

Even if this PoC was achieved on ESP32, this attack can be reproduced easily on other platforms running MbedTLS.

The next post will focus on the ESP32’s crypto accelerator vulnerabilities, stay tuned.

Timeline Disclosure

20/04/2019: E-mail sent to Espressif Systems and ARM MbedTLS.

13/05/2019: No answer. Posted.

14/05/2019: E-mail from Espressif. They first processed the report as spam but now, as if by magic, they have forwarded it to their engineers.

21/05/2019: ARM finally answered “We will have a look into this”.

23/05/2019: Espressif answered “We do not consider bugs which have already been published to be eligible. As a result, we do not consider software DFA findings eligible”.

23/05/2019: I respect their choice. I don’t work for them. Let’s pwn harder next time 🙂

6 Replies to “Pwn MBedTLS on ESP32: DFA Warm-up”

  1. I’m gonna try the vulns you’ve discovered. I don’t have an oscilloscope, but I do have a ChipWhisperer kit 1 (https://www.mouser.in/new/newae-technology/newae-chipwhisperer-lite-l1-kit/). I’m about to order an ESP32 and was thinking of this one (https://robu.in/product/esp-wroom-32-esp32-wifi-bt-ble-mcu-module/).

    Can you confirm that the above esp32s dev kit will work? As it uses the same wroom32 SoC.

    Also, what oscilloscope, signal generator & voltage glitch would you recommend? As I have the chipwhisperer kit will it suffice all of the needs?

    Waiting for your reply.
    Regards.

    1. The sampling frequency of the CW oscilloscope is very low. It is suggested to use a better oscilloscope for monitoring. A PicoScope is a good choice.

  2. Hello,
    I can’t find anything on your Github. as you say:”Note: I should create a repo on my GitHub with all the files and ressources.”

    Can you show me your script?

    Waiting for your reply.
    Regards.
