This report describes my work for the Computer Security Project at the Technical University of Berlin. The results of my experiments were used in the paper Strong Machine Learning Attack against PUFs with No Mathematical Model and presented at CHES 2016.
Motivation and background for this field of research
Building a hardware product that cannot be copied is hard. Especially small integrated chips make it hard to distinguishing between a knockoff device and a real one. But this is not only a copyright concern, but also important to ensure trust in a device's origin. For example a chip could be replaced in the manufacturing chain with a backdoored version. A good example to understand the problem is to look at smart cards, especially the ones used for decrypting premium TV channels. The whole business model relies on a shared secret key, embedded inside of the chips. It's obviously in the interest of the company, that nobody can crate a working copy of such a card. But once the secret key is extracted through various hardware attack techniques, creating a copy of a smart card is trivial.
Is there a way how hardware could be manufactured, such that it's practically impossible to copy - and not just very expensive to copy?
Physically Unclonable Function
Physically Unclonable Function (short: PUF) is a concept that attempts to exploit (utilize) physical impurities, which are different for each device, to make exact physical copies impossible to manufacture. In practice this is often used to verify, that a particular hardware (a chip) is not counterfeit. This is usually implemented with a challenge and response protocol. A vendor can collect valid responses for random challenges of a chip, and the customer can verify later, that the device bought, was really made by that manufacturer.
While creating an exact copy of the chip might be impossible, one could try to understand the mathematical model underlaying the behavior and therefor is able to create a device that emulates the behavior of the original chip. Will every PUF have this flaw, that math can describe it's behavior, or are there PUFs that are truly random and thus unpredictable? - that is an unsolved question.
With the experiments I conducted, we tried to understand a certain PUF family better, whose underlaying mathematical model is unknown. But Fatemeh Ganji and Shahin Tajik were able, with the data I collected, to construct a machine learning algorithm that can learn the behavior of this PUF family.
Bistable Ring PUF
The Bistable Ring PUF (short: BR-PUF) exploits the behavior of inverters in a ring configuration. A digital inverter could for example output 0V if the input was 5V, and output 5V if the input was 0V. Connecting the output of a digital inverter back to it's input will result in an oscillator, which constantly tries to correct the output based on the new input. Connecting two inverters, like shown in the picture below, should result in a stable configuration.
But theoretical it's not possible predict if the the right wire will outputting a logical
1, or if the left wire will output a
1. While this is unpredictable in theory, in practice manufacturing can cause a device to always show the same result when powered on.
This fact can then be used in a configuration like shown below to create a unique challenge and response behavior.
The BR-PUF contains an even number of stages.
Each stage contains two inverters in parallel. Only one is used at a time. Which one depends on the challenge input bit
c[i]. for example if the 1st challenge bit is set
c=1, then the first inverter in the first stage would be used. If it's not set
c=0, then the second inverter would be used.
So based on the challenge bit string
011..110), a unique inverter ring can be configured. Ideally when a single challenge bit is changed, the outcome should be unpredictable. Similar to a cryptographic hash function - change one bit in the input, observe a basically random change in the output.
The question is, can a mathematical model describe the behavior, or can the behavior be emulated by a machine learning algorithm?
To evaluate a big number of BR-PUFs, we used an FPGA-based implementation. This way we can automatically generate new PUF implementations, program the board, run our tests, and continue with the next one. This way we can gather huge amounts of data, for a lot of different PUF configurations.
For the experiments we used a DE0-Nano FPGA Board with an ALTERA Cyclone IV. The following setup was used to perform the tests:
Windows Host (1) was connected via USB to the FPGA (3), in order to program new PUF configurations via Quartus Programmer (Quartus II 15.0 64-bit Web Edition). After successfully programming the FPGA, it would notify the OSX Host (2) via an Ethernet connection.
The OSX Host (2) implemented a simple server which is waiting for the signal that the FPGA is ready. A script would then start which runs a large amount of challenges via the UART connection and collects the responses. Once all challenges were executed, the Server would respond back to the Windows Host (1), which then can generate another PUF configuration, and start the whole cycle over.
Based on the collected challenges some basic analysis were performed and saved in a github repository for further analysis.
Automating PUF generation with Quartus
The first big challenge was to figure out how we could automate programming the FPGA with Quartus. This was not straightforward, because we didn't want to generate verilog code and then recompile it, we wanted to control where the PUF stages are placed inside of the chip. Basically control which logic-cells will be used. This can be achieved using the GUI Assignment Editor. To figure out how this can be done without the GUI, I have observed what kind of files are generated and what kind of programs Quartus invokes upon compilation. This way I learned that the configurations from the Assignment Editor, which can be used to choose the physical location on the FPGA for certain pieces, are written into a Quartus settings file
An example assignment configuration in the GUI could look like this:
Which corresponds to the following lines in the
set_location_assignment FF_X30_Y1_N5 -to "puf_c" set_location_assignment LCCOMB_X30_Y1_N4 -to "br_puf:pufi|br_stage:br_stage_0|x" set_location_assignment LCCOMB_X30_Y1_N6 -to "br_puf:pufi|br_stage:br_stage_0|x" set_location_assignment LCCOMB_X30_Y1_N20 -to "br_puf:pufi|br_stage:br_stage_0|y" set_location_assignment LCCOMB_X30_Y1_N22 -to "br_puf:pufi|br_stage:br_stage_0|y" set_location_assignment LCCOMB_X30_Y1_N30 -to "br_puf:pufi|br_stage_io" set_location_assignment FF_X30_Y2_N5 -to "puf_c" set_location_assignment LCCOMB_X30_Y2_N4 -to "br_puf:pufi|br_stage:br_stage_1|x" set_location_assignment LCCOMB_X30_Y2_N6 -to "br_puf:pufi|br_stage:br_stage_1|x" set_location_assignment LCCOMB_X30_Y2_N20 -to "br_puf:pufi|br_stage:br_stage_1|y" set_location_assignment LCCOMB_X30_Y2_N22 -to "br_puf:pufi|br_stage:br_stage_1|y" set_location_assignment LCCOMB_X30_Y2_N30 -to "br_puf:pufi|br_stage_io" set_location_assignment FF_X30_Y3_N5 -to "puf_c"
With the generated
.qsf file, we can then compile this specific PUF configuration and program the FPGA:
quartus_map --read_settings_files=on --write_settings_files=off puf_top -c puf_top quartus_fit --read_settings_files=off --write_settings_files=off puf_top -c puf_top quartus_asm --read_settings_files=off --write_settings_files=off puf_top -c puf_top quartus_cdb puf_top -c puf_top --write_eqn_file --netlist_type=cmp quartus_sta puf_top -c puf_top quartus_asm puf_top quartus_pgm -c usb-blaster chain_description.cdf
This setup can be used to automatically configure the PUF in various places on the FPGA. For example in a row or column of logic-cells, and then move this line around - first implement a PUF in column 1/2, then in 2/3 and so forth.
Visualizing the FPGA usage
In the Quartus Chip Planber we can see where our FPGA configuration will be placed. This is great to visualize the kind of PUF we have configured. Notice the two colored straight columns, those are the 32 BR-Puf stages - the ring configuration.
To verify that the PUF was really configured how we wanted to, and also to understand the data we collect better, I visualized the FGPA configuration after compilation. Quartus generates an output file
puf_top.fit.eqn, which contains the information which component gets placed where. I wrote a script to parse this file and generate an image like this one:
The implementation is horrible but "works for me". I generated an
.html file with a huge
<div> grid and colored it accordingly with CSS, then rendered it with the
selenium webdriver to get the image.
Like mentioned above, we used UART to set a challenge and then read the response. We used several different ways to generate challenges throughout the research. For most of it we used randomly generated challenges. But for example for small PUFs with only 8 or 16 bit, we could run all possible challenges.
We also looked at the impact of a single bit change in a challenge. For example take a random challenge, then always look at the response when only one bit is changed. This way we can learn if a single bit change truly changes the output randomly, or if a single bit has no great effect in a particular configuration.
Very quickly it's clear that certain challenges produce unstable outputs. This means if you apply the same challenge to the PUF (selecting always the same path of inverters) the result might be a
0 or a
1. So we ran every challenge multiple times and stored each response. This way we can see which challenges are very stable, and always have the same response, or which ones are unstable and switch sometimes.
For the collected data I then created a report like the one below. In this case you can see several different PUF configurations. This table shows simple information like the amount of challenges, and what the ratio between
1 responses was. For example the PUF implemented in columns 16/17 (
LAB1_X16_X17) has a fairly balanced output. While the PUF in columns 23/24 (
LAB1_X23_X24) is extremely biased to return a 0.
For each PUF I generated a more detailed report. This includes the PUF configuration, simple pie charts, but also more comlex diagrams. For example visualizing in how many challenges that returned a
0 a bit was set to
1. This way you could identify "influential bits" in a challenge, which have a huge effect on the result.
Analyzing the data and interesting observations
When I started with my experiments and got the first data, we identified problems, weird behaviors and other things that changed the strategy and setup throughout the research. Here are some examples:
Most of the PUF configurations we tested turned out to be hugely biased. Meaning they will most of the time return a
1. Which is quite bad if you want to implement a strong PUF with unpredictable responses. If a PUF returns mostly
1 for all challenges, it's not hard to guess the response for different challenge. A strong PUF would basically have an unbiased 50:50 outcome with a big number of random challenges. Because of this, we focused most analysis on PUFs that we considered to be quite strong - meaning their response can not be predicted with high accuracy based on their bias.
Unstable vs. stable challenges
Another variable we wanted to control is, at what point of time do we read the response from the inverter ring. At the beginning the state of the PUF was read when the response was requested via UART. But at some point I implemented a counter setting. This counter would starts at 0 when turned on, and once it reaches a configured number, it would read the current state of the PUF and save it. Which can then be extracted via UART at some later point in time.
This way we were able to answer questions such as, if a BR-PUF gets more stable over time. And make sure that a collected dataset has more controlled variables.
Here is an example stable challenge with a counter. Yellow is the output of the PUF. Blue indicates when the counter is done and the current logical value of the PUF (yellow) is stored.
(1) The PUF is turned on. The inverter ring starts to power on and starts to oscillate slightly, but it fairly quickly moves towards a stable logical output of
(2) The counter, which started at 0 when the device turned on, reached the configured value. At this point in time the logical value of the PUF (yellow) is stored as a response for this particular challenge.
(3) Then we can see how the PUF output finally stabilizes on the logical
1, a little bit after the response was already saved.
The next image will show an unstable challenge.
(1) The PUF is turned on. The inverter ring starts oscillating heavily between logical
(2) The counter reached the configured value and reads the current state of the PUF (yellow). Because of the heavily oscillating PUF, the response is either a
1 by chance.
(3) It's extremely interesting that even after a longer period of time, the PUF never becomes stable. We would expect that an even number of inverters would be stable at some point, but this is apparently not the case.
Results influence each-other
While testing, I noticed that some challenges are heavily influenced by previous results. So for example if challenge X returned a
1 and is followed by challenge Y, then challenge Y also returns a
1. But if Z returned a
0 and is followed by a Y, then challenge Y returns a
0. This was quite a big shock, but glad we caught it.
A BR-PUF implemented in an FPGA is not a "perfect" implementation - in the sense that logic-cells are configured with a lookup table to make it behave like an inverter. Here is a picture from Quartus, showing how one particular inverter cell is connected.
Not every possible input into the logical cell is connected. And this causes some analog electrical circuit magic interference that I don't quite understand. We connected these inputs in a specific way and have not observed challenges that influence each other anymore. Unfortunately I haven't figured out how to fix the PUF configurations automatically after creating a new configuration, so I had to do the post fitting assignment by hand.
Stable but oscillating challenges
Another interesting observation I made while looking at the PUF with an oscilloscope were challenges that were basically stable, but not really. What I mean by that is, that certain challenges cause the ring of inverters to oscillate not in chaos, but create a spike that travels around the ring with a certain frequency.
The yellow line shows again the state of the PUF, while the blue one indicates when a response was stored. The interesting part here is that the yellow line is logical
0 for most of the time, but once in a while shows a peak - a wave is traveling around the inverter ring. So even if we humans can read this as generally a stable response
0, the PUF could by accident read the logical state right at the point of a spike, and read a
This traveling wave is especially interesting when looking at different points in time. The top trace is captured when the counter reached
0x1ff, so a fairly short amount of time, and the bottom trace shows the state of the PUF after the counter has counted to
Applying the same challenge over and over again shows that the spikes appear at basically the same place every time with a short counter. But when more time passes (bottom trace), then spikes appear in more chaotic places. Which indicates that the waves don't travel with the same frequency every time, but slowly drift.
While I took part in many meetings where we discussed results and weird behaviors, most of the real analysis and interpretation of data was done by Fatemeh Ganji and Shahin Tajik. So I suggest you to read the paper: Strong Machine Learning Attack against PUFs with No Mathematical Model.