Leveraging Aging Effect to Improve SRAM-based True Random Number Generators

Saman Kiamehr  Mohammad Saber Golanbari  Mehdi B. Tahoori
Karlsruhe Institute of Technology, Karlsruhe, Germany
e-mails: {kiamehr, mohammad.golanbari, mehdi.tahoori}@kit.edu

Abstract—The start-up value of SRAM cells can be used as the random number vector or a seed for the generation of a pseudo random number. However, the randomness of the generated number is pretty low since many of the cells are largely skewed due to process variation and their start-up value leans toward zero or one. In this paper, we propose an approach to increase the randomness of SRAM-based True Random Number Generators (TRNGs) by leveraging transistor aging impact. The idea is to iteratively power-up the SRAM cells and put them under accelerated aging to make the cells less skewed and hence obtaining a more random vector. The simulation results show that the min-entropy of SRAM-based TRNG increases by 10X with this approach.

I. INTRODUCTION

A True Random Number Generator (TRNG) is a key element of cryptographic systems, providing secret keys and tokens. For ultra-low-power secure applications, e.g. the Internet of Things (IoT), designing a low-power and reliable TRNG is crucial. There are different types of TRNGs, which are based on non-deterministic physical randomness in circuits [1]. One common way to implement a TRNG is based on SRAM cells. In the hardware security context, SRAM cells can also be used as a Physical Unclonable Function (PUF). A PUF is normally used for identification or authentication of an IC by generating a unique fingerprint [2–5].

However, there are some cells in the SRAM array which are less skewed and therefore, the start-up values of these cells exhibit more randomness dependent on the noise of each power-up. These random power-up bits can be used as a source of random number generation [6]. However, only a small portion of SRAM cells show the noisy behaviour and therefore the entropy of the SRAM power-up pattern has to be condensed into a full entropy random seed [1]. This is done by some compression approaches such as using a hash function. It is shown that in order to have 256 bits with full entropy, the size of the SRAM array should be at least 1600 bytes (50 times larger) [1].

In this paper, we propose an approach to improve the entropy of SRAM-based TRNGs by leveraging the impact of transistor aging. The idea behind our method is that powering up a skewed SRAM cell and applying elevated stress (for accelerated aging) under this “preferred” state reduces its skewness. Based on this fact, we propose an approach in which the SRAM-based TRNG is powered up iteratively and, at each iteration, the chip is aged at a high voltage and temperature (accelerated aging) during the burn-in phase in order to make the SRAM cells less skewed and in turn increase randomness. Simulation results show that, using our approach, the min-entropy of the SRAM-based TRNG can be improved by 10X. For the same min-entropy, this translates to a significant reduction (~10X) in the area and power overhead of the TRNG, which makes it very practical for IoT applications.

II. MOTIVATION

The simplest structure of an SRAM cell is the 6T structure consisting of two back-to-back inverters and two access transistors (see Fig. 1(a)). If there is no variation, the SRAM cell is non-skewed and the back-to-back inverter structure is symmetrical. When such a cell is powered up, its non-skewed symmetrical structure means that it randomly settles to either “one” or “zero” depending on noise, e.g. thermal noise. Here, we define the probability of powering up to “one” (\(POP_1\)) or “zero” (\(POP_0\)) as a measure of the skewness of the cell:

\[
POP_i = \frac{\#\text{ of trials with start-up value} = i}{\text{Total number of trials}} = 1 - POP_{\bar{i}}, \quad i \in \{0, 1\} \quad (1)
\]

According to this definition, if the SRAM cell is not skewed, \(POP_1\) and \(POP_0\) are close to 0.5. However, in the presence of process variation, some of the transistors might become stronger than others and the SRAM cell becomes asymmetrical, which we call skewed here. When a skewed SRAM cell is powered up, the probability of settling at “one” can be higher or lower than that of settling at “zero”, depending on the strength of the internal transistors as affected by process variation. Fig. 1(b) shows \(POP_1\) for an SRAM array with 256 bits. As can be seen in this figure, most of the cells are skewed such that their start-up values are mostly “one” or mostly “zero”, showing low randomness.
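As an illustration of Eq. (1), the following minimal sketch estimates the probability of powering up to “one” from a list of observed start-up values. The Bernoulli generator `simulate_startups` and the values 0.98 and 0.5 are hypothetical stand-ins for measured power-ups and are not taken from the paper.

```python
import random


def estimate_pop1(startup_values):
    """Fraction of trials whose start-up value is 1 (Eq. (1) with i = 1)."""
    if not startup_values:
        raise ValueError("need at least one trial")
    return sum(startup_values) / len(startup_values)


def simulate_startups(true_pop1, trials, rng):
    """Hypothetical stand-in for measured power-ups: independent Bernoulli draws."""
    return [1 if rng.random() < true_pop1 else 0 for _ in range(trials)]


rng = random.Random(0)
skewed_cell = simulate_startups(0.98, 100, rng)    # strongly skewed toward "one"
balanced_cell = simulate_startups(0.50, 100, rng)  # non-skewed cell

pop1_skewed = estimate_pop1(skewed_cell)
pop0_skewed = 1.0 - pop1_skewed                    # complement, as in Eq. (1)
pop1_balanced = estimate_pop1(balanced_cell)

print(f"skewed cell:   POP_1 = {pop1_skewed:.2f}, POP_0 = {pop0_skewed:.2f}")
print(f"balanced cell: POP_1 = {pop1_balanced:.2f}")
```

With 100 trials, as used later in the simulation setup, the estimate has a resolution of 0.01.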

Here, for the sake of simplicity, we explain the impact of process variation for a case in which only the PMOS transistor MP1 in Fig. 1(a) is affected; its \(\Delta V_{th}\) shift from the nominal value is shown in Fig. 2(a). If this transistor becomes stronger (weaker) due to process variation, the probability that node \(Q\) becomes equal to “one” is higher (lower) (see Fig. 2(b)). When MP1 is strong enough (large negative \(\Delta V_{th}\)), the start-up states of \(Q\) and \(\bar{Q}\) are always “one” and “zero”, respectively. In this case, if the SRAM cell stays ON in this power-up state, the transistor MP1 is under NBTI stress and therefore becomes weaker, which means the SRAM cell becomes less skewed. However, if we continue aging the cell, MP1 might become so weak (even weaker than MP2) that the start-up states of \(Q\) and \(\bar{Q}\) flip.

For the case in which MP1 is not skewed, the start-up value is random and, if the cell is aged according to this value, the cell might become more skewed. Although the above explanation considers process variation and NBTI only for the MP1 transistor, the observation is valid for all transistors when the PBTI effect is also considered. In general, from the above observation we can conclude:

• Aging a skewed cell with its start-up state leads to a less skewed cell, which is good for a TRNG.
• Over-aging a skewed cell with the same start-up state might make the cell skewed again in the other direction.
• Aging a non-skewed cell with its start-up state might lead to a more skewed cell.

Fig. 1: a) Structure of 6T SRAM cell b) Probability of start-up value being “one” (\(POP_1\)) for a 256-bit SRAM-based TRNG

III. PROPOSED APPROACH

Here, we propose an approach to improve the randomness of TRNGs by leveraging the BTI effect. The overall flow of our approach is shown in Fig. 3. In this approach, the SRAM array is powered up iteratively and, at each iteration, the cells are aged in an accelerated manner according to their start-up value. At each iteration, the power-up value of a non-skewed cell is randomly “zero” or “one”. If we age such a cell iteratively, it is aged according to the value “zero” in almost half of the iterations and according to the value “one” in the other half, leading to symmetrical BTI aging, and therefore the cell remains non-skewed. In the case of a skewed cell, however, the iterative aging of the cell by its power-up value makes it less skewed. For example, if the cell is skewed to power up to “one” (the left side of Fig. 2(a), with a large negative $\Delta V_{th}$) with a probability of $POP_1 = 0.9$, then in 9 iterations out of 10 it ages in a way that $|\Delta V_{th}|$ becomes smaller (a shift to the right and closer to zero in Fig. 2(a)), which makes the cell less skewed.

The accelerated aging can be performed during the burn-in phase of chip production by increasing the temperature and the supply voltage. For simplification, we assume a single skewness value for each memory cell instead of considering the full SRAM cell.
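The iterative power-up and aging loop can be sketched as a small Monte Carlo experiment. This is an illustration under stated assumptions, not the HSPICE flow used in the paper: each cell is reduced to a single skewness value in mV, the logistic `pop1` curve and its `noise_mv` width are hypothetical, and the initial spread of 20 mV and the 1 mV step are example values.

```python
import math
import random
import statistics


def pop1(skew_mv, noise_mv=2.0):
    """Hypothetical logistic model: negative skew favors a start-up value of one."""
    return 1.0 / (1.0 + math.exp(skew_mv / noise_mv))


def age_iteratively(skews_mv, step_mv, iterations, rng):
    """Power up every cell, then age it according to the observed start-up value.

    Powering up to one shifts the skew to the right (+step), powering up to
    zero shifts it to the left (-step), mirroring the shift in Fig. 2(a).
    """
    skews = list(skews_mv)
    for _ in range(iterations):
        for k, skew in enumerate(skews):
            powered_up_to_one = rng.random() < pop1(skew)
            skews[k] = skew + step_mv if powered_up_to_one else skew - step_mv
    return skews


rng = random.Random(1)
initial = [rng.gauss(0.0, 20.0) for _ in range(256)]  # assumed process spread
aged = age_iteratively(initial, step_mv=1.0, iterations=100, rng=rng)

spread_before = statistics.pstdev(initial)
spread_after = statistics.pstdev(aged)
balanced_before = sum(0.25 < pop1(s) < 0.75 for s in initial)
balanced_after = sum(0.25 < pop1(s) < 0.75 for s in aged)

print(f"skew spread: {spread_before:.1f} mV -> {spread_after:.1f} mV")
print(f"cells with 0.25 < POP_1 < 0.75: {balanced_before} -> {balanced_after}")
```

In this toy model, strongly skewed cells drift toward zero skew one step per iteration, while cells that are already balanced wander around zero.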

Fig. 4 illustrates the Markov chain of the proposed iterative approach with the aforementioned assumptions. The nodes of the graph are the states and the edges are the probabilities to switch from one state to another, where the transition probability from state $x$ to state $y$ is

$$p_{x,y} = P\{X_{i+1} = y \mid X_i = x\} = P\{X_1 = y \mid X_0 = x\}. \quad (2)$$

The elements of the transition matrix can be extracted from the $POP_0$ and $POP_1$ plots in Fig. 2(b). For example, the probabilities to fall to states -1 and 1 are equal when the previous state is zero ($\Delta V_{th} = 0$):

$$p_{0,-1} = p_{0,1} = 0.5$$

This is because at state 0 the SRAM cell is symmetric (no skew). However, when the state is 1, or equivalently $\Delta V_{th} = 1\text{mV}$, the probabilities to fall to states 0 and 2 after aging are 0.8 and 0.2, respectively, due to the positive skewness of the SRAM cell:

$$p_{1,0} = 0.8 \quad p_{1,2} = 0.2$$

Therefore, the transition matrix is built accordingly:

$$P = \begin{bmatrix}
p_{-60, -60} & p_{-60, -59} & \cdots & p_{-60, 59} & p_{-60, 60} \\
p_{-59, -60} & p_{-59, -59} & \cdots & p_{-59, 59} & p_{-59, 60} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
p_{60, -60} & p_{60, -59} & \cdots & p_{60, 59} & p_{60, 60} \\
\end{bmatrix} \quad (3)$$

In fact, the superdiagonal and subdiagonal entries of the transition matrix are the $POP_1$ and $POP_0$ probabilities (Fig. 2(b)), respectively.
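A minimal sketch of how such a transition matrix could be assembled with NumPy is given below. The logistic shape of $POP_1$ is a hypothetical stand-in for the simulated curve of Fig. 2(b); its width is chosen only so that the example values $p_{1,0} = 0.8$ and $p_{1,2} = 0.2$ from the text are reproduced. Keeping the out-of-range move at the two end states as a self-loop is also an assumption of this sketch.

```python
import math

import numpy as np

# Assumed logistic width, chosen so that POP_1 at state 1 equals 0.2.
NOISE_MV = 1.0 / math.log(4.0)


def build_transition_matrix(max_state=60, noise_mv=NOISE_MV):
    """Tridiagonal transition matrix over the states -max_state..max_state (mV)."""
    states = np.arange(-max_state, max_state + 1)
    up = 1.0 / (1.0 + np.exp(states / noise_mv))     # POP_1(x): x -> x + 1
    down = 1.0 / (1.0 + np.exp(-states / noise_mv))  # POP_0(x): x -> x - 1
    P = np.diag(up[:-1], k=1) + np.diag(down[1:], k=-1)
    # Assumption: a move that would leave the state space stays in place.
    P[0, 0] = down[0]
    P[-1, -1] = up[-1]
    return states, P


states, P = build_transition_matrix()
zero = int(np.where(states == 0)[0][0])  # row index of state 0

print("p(0,-1), p(0,1):", P[zero, zero - 1], P[zero, zero + 1])
print("p(1,0),  p(1,2):", P[zero + 1, zero], P[zero + 1, zero + 2])
```

Each row sums to one, so `P` is a valid stochastic matrix for the chain of Fig. 4.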

Assuming that the probabilities of the states at iteration $n$ are given by $f_n$, a vector of length $|\Omega|$, the probabilities of the states at iteration $n + 1$ are $f_{n+1} = f_n P$; in other words, $f_n = f_0 P^n$.

According to the fundamental theorem of Markov chains [8], there is a unique probability vector $\pi$, called the stationary distribution, which satisfies:

$$\pi = \pi P. \quad (4)$$

Fig. 4: Simple Markov chain structure of the proposed iterative approach with skewness $x \in \Omega = \{-60, \ldots, 60\}$.

After a number of iterations the Markov chain converges to the stationary distribution $\pi$, i.e. additional iterations have no impact on the distribution. It is evident from Equation (4) that the resulting distribution $\pi$ is solely dependent on the transition matrix. Intuitively, we can say that the initial distribution is limited by the $POP_1$ and $POP_0$ distributions in each iteration, and after a number of iterations the final distribution spans only the region where both the $POP_1$ and $POP_0$ distributions are non-zero.

Therefore, \( \pi \) should have a much smaller standard deviation than the original distribution of states, leading to a narrower distribution of the threshold voltage of the transistors after applying the proposed iterative approach. This is shown in Fig. 5(a), where after \( N \) iterations \( \Delta V_{th} \) has a narrower distribution than the original one, which in turn leads to cells with lower skewness, i.e. \( POP \) values closer to 0.5 (as shown in Fig. 5(b)).
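The narrowing can be checked numerically on a toy version of the chain. The sketch below assumes a hypothetical logistic $POP_1$ curve (width chosen so that $POP_1 = 0.2$ at state 1), self-loops at the two end states, and a Gaussian initial distribution with a 20 mV spread; none of these come from the paper. Because the chain only moves between neighboring states, $\pi$ is obtained here from the balance relation $\pi_x \, p_{x,x+1} = \pi_{x+1} \, p_{x+1,x}$ and then verified against Equation (4).

```python
import math

import numpy as np

noise_mv = 1.0 / math.log(4.0)                   # assumed logistic width
states = np.arange(-60, 61)
up = 1.0 / (1.0 + np.exp(states / noise_mv))     # POP_1(x): x -> x + 1
down = 1.0 / (1.0 + np.exp(-states / noise_mv))  # POP_0(x): x -> x - 1

P = np.diag(up[:-1], k=1) + np.diag(down[1:], k=-1)
P[0, 0], P[-1, -1] = down[0], up[-1]             # assumed end-state self-loops

# Balance relation for neighbor-only moves, accumulated in log space
# so that the very small tail probabilities do not cause trouble.
log_ratio = np.log(up[:-1]) - np.log(down[1:])
log_pi = np.concatenate(([0.0], np.cumsum(log_ratio)))
pi = np.exp(log_pi - log_pi.max())
pi /= pi.sum()


def std_mv(dist):
    """Standard deviation (mV) of a probability vector over the states."""
    mean = float(np.sum(dist * states))
    return math.sqrt(float(np.sum(dist * (states - mean) ** 2)))


f0 = np.exp(-0.5 * (states / 20.0) ** 2)         # assumed initial distribution
f0 /= f0.sum()

print("pi satisfies pi = pi P:", np.allclose(pi @ P, pi))
print(f"std before: {std_mv(f0):.2f} mV, std of pi: {std_mv(pi):.2f} mV")
```

Computing $\pi$ from the balance relation keeps the sketch independent of how many multiplications by $P$ would be needed in practice.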

IV. SIMULATION RESULTS

A. Simulation setup and flow

1) TRNG setup: We assume that our SRAM-based TRNG consists of 256 SRAM cells (256 bits). We obtain the start-up pattern of the SRAM array for 100 trials and use these patterns to evaluate the min-entropy of the TRNG.

2) SRAM simulation flow: For each SRAM cell, we consider a 6T structure as shown in Fig. 1(a). Process variation is modeled as a shift in the threshold voltage (\( \Delta V_{th} \)) of the internal transistors for all 256 cells according to the Pelgrom model [10]. For each cell, the start-up value is obtained using HSPICE simulation with the 32nm SAED SPICE model. For this purpose, the word-line node \( WL \) is connected to ground, the nodes \( Q \) and \( \bar{Q} \) are initialized to “zero”, and a transient simulation is performed to obtain the values of \( Q \) and \( \bar{Q} \) after a given time. To account for noise, a voltage source with a value of \( \pm V_{noise} \) is connected to node \( Q \) (see Fig. 1(a)). The sign of this voltage source is changed randomly for each trial. The value of \( V_{noise} \) is set such that the min-entropy of the original 256-bit SRAM-based TRNG is approximately equal to 5%, which is consistent with the results provided for real SRAM-based TRNGs [1].
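For readers who want to reproduce the metric, the following sketch computes a min-entropy percentage from per-cell probabilities of powering up to “one”. It assumes the standard per-bit definition \( H_{min} = -\log_2 \max(p, 1 - p) \) and assumes that the reported percentage is the average over all cells; the text does not spell out the aggregation, so this is one plausible reading rather than the authors' exact procedure. The example array is hypothetical.

```python
import math


def cell_min_entropy(pop1):
    """Min-entropy (in bits) of one cell with probability pop1 of starting at one."""
    return -math.log2(max(pop1, 1.0 - pop1))


def array_min_entropy_percent(pop1_values):
    """Average per-cell min-entropy, expressed as a percentage of one bit."""
    total = sum(cell_min_entropy(p) for p in pop1_values)
    return 100.0 * total / len(pop1_values)


# Hypothetical array: mostly fully skewed cells plus a few noisy ones.
example_pop1 = [0.0] * 120 + [1.0] * 120 + [0.5] * 8 + [0.9] * 8
print(f"min-entropy: {array_min_entropy_percent(example_pop1):.1f}%")
```

A fully skewed cell contributes nothing to the total, while a perfectly balanced cell contributes one full bit.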

3) Accelerated BTI aging: As shown in Fig. 3, at each iteration of our proposed approach, the SRAM cell is aged under BTI stress in an accelerated manner according to its start-up value in order to decrease the skewness of the cell. The accelerated BTI aging can be performed by increasing the supply voltage and the temperature during BTI stress [11]. If the start-up value of \( Q \) is “zero” (“one”), the MP2 (MP1) and MN1 (MN2) transistors are under NBTI and PBTI aging stress, respectively, since these two transistors are ON. For this set of simulations, we model the BTI impact as a threshold voltage shift of the transistors and obtain results for three different accelerated aging scenarios, as explained in the following. For each scenario the number of iterations is 100.

- Uniform BTI aging: In this scenario, we age the SRAM cells such that the \( \Delta V_{th} \) shift of the transistors under stress is equal at each iteration. The amount of aging (\( \Delta V_{th} \)) should be carefully selected because it sets a trade-off between the efficiency of the approach and the required number of iterations. More BTI aging at each iteration (a larger \( \Delta V_{th} \) step) makes the strongly skewed cells less skewed faster, i.e. fewer iterations are needed; however, it reduces the final efficiency of our approach in terms of the randomness of the TRNG. To see this, consider Fig. 2(a). At each iteration, \( \Delta V_{th} \) shifts to the right or to the left. A very large \( \Delta V_{th} \) step, e.g. 30mV, ages a skewed cell in a way that it becomes skewed in the other direction. Smaller \( \Delta V_{th} \) steps cause more cells to become non-skewed, given a sufficient number of iterations. For this set of simulations, we picked two values, 1mV and 0.5mV, for the BTI aging-induced \( \Delta V_{th} \) at each iteration.

- Non-uniform BTI aging: In order to obtain a good trade-off between the efficiency of the approach (the randomness obtained at the end) and the required number of iterations, we also propose a non-uniform BTI aging scenario. In this scenario, the SRAM cells are aged more in the beginning (the earlier iterations) to improve the strongly skewed cells faster, i.e. better efficiency is obtained with fewer iterations. Then, at each iteration, we reduce the amount of BTI aging in order to reach better randomness by fine-tuning the final \( V_{th} \) of the transistors at the end of all iterations. Here we chose a non-uniform harmonic BTI aging scenario in which a harmonic series is used for the BTI-induced \( \Delta V_{th} \) at each iteration:

\[
\Delta V_{th_i} = \frac{12mV}{i + 1}, \quad i = 0, 1, \ldots, N - 1
\]

where \( N \) is the total number of iterations, so that the first iteration applies \( \Delta V_{th_0} = 12mV \). \( \Delta V_{th_0} \) for the non-uniform scenario is chosen according to the standard deviation of the process variation-induced \( V_{th} \) shift (\( \sigma_{\Delta V_{th}} \)).
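The uniform and harmonic step schedules can be compared with a few lines of Python. This sketch assumes that the first iteration applies the full 12 mV (as stated for the first iteration in Section IV-D) and that each later step is that value divided by the iteration count; the function names are illustrative only.

```python
def harmonic_steps_mv(first_step_mv, iterations):
    """Harmonic schedule: first_step, first_step/2, first_step/3, ..."""
    return [first_step_mv / (i + 1) for i in range(iterations)]


def uniform_steps_mv(step_mv, iterations):
    """Uniform schedule: the same step at every iteration."""
    return [step_mv] * iterations


harmonic = harmonic_steps_mv(12.0, 100)
uniform = uniform_steps_mv(1.0, 100)

# First iteration (1-based) at which the harmonic step is finer than 1 mV.
crossover = next(i + 1 for i, step in enumerate(harmonic) if step < uniform[i])

print("first harmonic steps (mV):", [round(s, 2) for s in harmonic[:5]])
print("harmonic step drops below 1 mV at iteration", crossover)
print("last harmonic step (mV):", round(harmonic[-1], 3))
```

The early steps are coarse enough to move strongly skewed cells quickly, while the late steps are finer than either uniform setting.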

B. Min-entropy over iterations

Fig. 6(a) shows the min-entropy of the 256-bit SRAM-based TRNG over iterations using the different iterative aging scenarios (introduced in Section IV-A3). As shown in Fig. 6(a), the entropy under all these scenarios improves significantly over iterations. In the case of uniform BTI aging, the lower aging rate per iteration (\( \Delta V_{th} = 0.5mV \)) results in a slightly better min-entropy at the end of the iterations, whereas the larger aging rate (\( \Delta V_{th} = 1.0mV \)) converges faster to better min-entropy ranges.

Compared to the uniform aging scenarios, the non-uniform aging rate per iteration results not only in better final min-entropy values but also in faster convergence. This is because, with a non-uniform aging rate, the BTI aging and hence the induced \( \Delta V_{th} \) is large in the first iterations, leading to a faster improvement of strongly skewed cells. Over the iterations, the amount of BTI-induced \( \Delta V_{th} \) is reduced, leading to a more finely tuned improvement in the skewness of the SRAM cells.

According to the results of Fig. 6(a), the choice of non-uniform harmonic BTI aging provides the best trade-off between the obtained randomness and the number of required iterations. In general, the proposed approach can increase the min-entropy from 5% to more than 55% which means around 10X improvement in the min-entropy value. This is translated to a huge area and power saving because the same amount of min-entropy can be obtained with a much smaller TRNG size (10 times smaller).
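As a rough worked example of this scaling, assume (hypothetically) an ideal entropy extractor, so that an array of \( N \) cells with average per-cell min-entropy \( H_{min} \) yields \( N \cdot H_{min} \) full-entropy bits. For a 256-bit output, the 5% and 55% figures above then give:

\[
N_{5\%} = \frac{256}{0.05} = 5120 \text{ cells}, \qquad N_{55\%} = \frac{256}{0.55} \approx 466 \text{ cells}, \qquad \frac{N_{5\%}}{N_{55\%}} = 11
\]

Real designs add margin on top of this idealized count, but the ratio illustrates where the roughly 10X area saving comes from.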

C. POP results

Fig. 7 depicts the probability of powering up to “one” ($POP_1$) of the 256-bit TRNG at different iterations of the proposed approach for the non-uniform harmonic BTI aging scenario (the best choice according to the previous sub-section). According to the figure, before applying the proposed approach, the $POP_1$ value of most of the cells is either 0 or 1, meaning that the start-up value is almost always “zero” or almost always “one” across the power-up trials, so the number generated by the TRNG is not random. However, after applying our approach, the $POP_1$ of most of the cells becomes close to 0.5, meaning that in half of the trials the start-up value is “zero” and in the other half it is “one”, which is essential for a desirable TRNG.

D. Accelerated aging approach

As shown previously, non-uniform BTI aging provides the best trade-off between the efficiency of the approach and the number of required iterations. In this sub-section we explain the process of accelerated BTI aging for each iteration. According to the literature, the BTI aging-induced $\Delta V_{th}$ is a function of different factors such as temperature and supply voltage. Here we assume that the delay of a digital circuit degrades by around 15% in three years at room temperature and nominal supply voltage ($V_{dd}=1.0V$). Using HSPICE simulation, this translates to $\Delta V_{th} = 20mV$ in the 32nm technology node.

As discussed in Section IV-A3, the amount of $\Delta V_{th}$ required for the first iteration is around 12mV, which translates to 350 days of BTI aging at room temperature and nominal supply voltage using the equations provided in [12]. However, BTI aging can be accelerated using higher supply voltages and temperatures [11]. If we increase the temperature to 125 °C and the supply voltage to 1.8V, the first iteration takes only 25 minutes.
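For reference, the acceleration implied by these two numbers (both taken from the text; no additional model is assumed) is:

\[
\frac{350 \text{ days}}{25 \text{ min}} = \frac{350 \times 24 \times 60 \text{ min}}{25 \text{ min}} = \frac{504000}{25} = 20160 \approx 2 \times 10^{4}
\]

That is, the elevated temperature and supply voltage compress the stress time for the first iteration by about four orders of magnitude.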

For the next iterations, even if we stress the cells under the same supply voltage, temperature and stress time, the amount of $\Delta V_{th}$ decreases since the BTI-induced degradation has a logarithmic dependency on time (see Fig. 6(b)). This means that the BTI aging-induced $\Delta V_{th}$ decreases over iterations even under the same conditions, which is exactly what we need in the non-uniform scenario. For the stress conditions of $Temp = 125^\circ C$, $V_{dd} = 1.8V$ and stress time = 25min, the $\Delta V_{th}$ values of the first, second and third iterations are equal to 12mV, 2.27mV, and 1.52mV, respectively, satisfying the condition of our non-uniform scenario. According to the results depicted in Fig. 6(a) and Fig. 7, using a non-uniform scenario, 50 iterations are enough to obtain a desirable TRNG. This means that the entire process of our approach takes around $25min \cdot 50 \approx 21hours$ during the burn-in phase. Of course, the numbers provided here are just examples to show the feasibility of our proposed approach; the BTI aging conditions (temperature, supply voltage and stress time) at each iteration can be adapted according to the technology.
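The per-iteration numbers above can be checked with a short calculation. The sketch below assumes that the cumulative shift after $k$ equal stress intervals grows as $12mV \cdot k^{0.25}$; the exponent 0.25 is an inference from the three quoted values and is not stated in the text or taken from [12]. It also assumes uninterrupted stress on the same transistor, with no recovery between iterations.

```python
def increments_mv(first_mv, exponent, iterations):
    """Per-iteration shift when the cumulative shift is first_mv * k**exponent."""
    cumulative = [first_mv * (k ** exponent) for k in range(iterations + 1)]
    return [cumulative[k] - cumulative[k - 1] for k in range(1, iterations + 1)]


steps = increments_mv(first_mv=12.0, exponent=0.25, iterations=50)
first_three = [round(step, 2) for step in steps[:3]]

stress_minutes = 25
total_hours = stress_minutes * len(steps) / 60.0

print("first three increments (mV):", first_three)
print(f"total burn-in time for {len(steps)} iterations: {total_hours:.1f} h")
```

Under this assumed power law the first three increments match the quoted 12mV, 2.27mV and 1.52mV, and 50 iterations of 25 minutes add up to a little under 21 hours.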

It should be noted that performing accelerated BTI aging for the entire memory block during the burn-in phase not only improves the efficiency of the TRNG part, but can also improve the yield and Static Noise Margin (SNM) of the other parts of the memory array that are not used for the TRNG [13].

V. CONCLUSION

In this paper, we propose an approach in which Bias Temperature Instability (BTI) is used to improve the randomness of SRAM-based TRNGs. The idea behind the proposed approach is to iteratively power-up the SRAM cells and age them in an accelerated manner to make the cells less skewed. Simulation results show that the min-entropy of SRAM-based TRNGs could be improved by 10X using our proposed approach.

REFERENCES