# **Transistor-Specific Delay Modeling for SSTA**

Brian Cline, Kaviraj Chopra, David Blaauw, Andres Torres<sup>†</sup>, Savithri Sundareswaran<sup>‡</sup> {btcline,kaviraj,blaauw@umich.edu} <sup>†</sup>{andres\_torres@mentor.com} <sup>‡</sup>{Savithri.Sundareswaran@freescale.com}

# Abstract

SSTA has received a considerable amount of attention in recent years. However, it is a general rule that any approach can only be as accurate as the underlying models. Thus, variation models are an important research topic, in addition to the development of statistical timing tools. These models attempt to predict fluctuations in parameters like doping concentration, critical dimension (CD), and ILD thickness, as well as their spatial correlations. Modeling CD variation is a difficult problem because it contains a systematic component that is context dependent as well as a probabilistic component that is caused by exposure and defocus variation. Since these variations are dependent on topology, modern-day designs can potentially contain thousands of unique CD distributions. To capture all of the individual CD distributions within statistical timing, a transistor-specific model is required. However, statistical CD models used in industry today do not distinguish between transistors contained within different standard cell types (at the same location in a die), nor do they distinguish between transistors contained within the same standard cell. In this work we verify that the current methodology is error-prone using a 90nm industrial library and lithography recipe (with industrial OPC) and propose a new SSTA delay model that on average reduces error of standard deviation from 11.8% to 4.1% when the total variation ( $\sigma/\mu$ ) is 4.9% – a 2.9X reduction. Our model is compatible with existing SSTA techniques and can easily incorporate other sources of variation such as random dopant fluctuation and line-edge roughness.

## **1. INTRODUCTION**

In modern-day Integrated Circuit (IC) design, process parameter variations are becoming an increasing concern. As we scale below 90nm, manufacturing effects such as lithography exposure and defocus variation, random dopant fluctuation (RDF), line-edge roughness (LER), and dishing are causing increasing amounts of variation in every aspect of design. Considering that the sources of these variations are random processes, a probabilistic methodology could potentially be more accurate than its deterministic counterpart. For this reason, researchers have been exploring statistical approaches in various areas of modern IC design for years. Of these approaches, Statistical Static Timing Analysis (SSTA) is one of the most prominent and has attracted a significant amount of attention over the past decade. SSTA itself is a broad research topic that encompasses everything from the timing analysis algorithm (which includes atomic delay operations like SUM and MAX) to the underlying models that strive to capture IC variation.

While there has been a great deal of work devoted to the SSTA algorithm [1-5], to our knowledge little improvement has been made in the delay models used within SSTA. This poses a potential problem, since the overall SSTA accuracy is fundamentally limited by the accuracy of the underlying models. Without sufficient accuracy, the benefits of switching from deterministic timing to SSTA are uncertain. Of the three main variation parameters, namely Critical Dimension (CD), doping concentration, and Inter-layer dielectric (ILD) thickness, CD variation modeling is particularly difficult because it contains both a systematic component that is context dependent and a probabilistic component that is caused by exposure and defocus variation in the lithography system. These variations in exposure and defocus create unique, transistor-specific distributions. Current SSTA frameworks, however, do not model these differences in device distributions. Instead, CD variation is handled identically across the entire standard cell library. This type of CD model is error-prone for two reasons:



- The model assumes that a single CD distribution applies to all standard cells in the library, regardless of cell type.
- The model assumes that the same, single CD distribution applies to all transistors within a standard cell.

Figure 1. Standard Cell Layout - Poly & Diffusion Layers Only

These two assumptions lead to errors in SSTA because the resulting model does not account for the fact that different transistors (at the same location in a die) can have different CD distributions. For instance, Figure 1 contains a sample standard cell layout with 12 transistors (the drawn and printed-image polysilicon, as well as the diffusion layers, are shown). The current CD model assumes that all 12 transistors vary identically, which means that changes in CD, or $\Delta CD$, for each transistor can be represented by the same random variable (RV). However, in reality, each transistor CD is dependent on its neighboring geometries: the distance from neighboring gates, the distance to poly-to-contact landings (shown in Figure 1.B), and the line-end overhang (shown in Figure 1.A) will all affect an individual CD distribution. These layout characteristics not only modify the nominal CD for each device, but they also impact the variability of CD and its sensitivity to changes in lithography exposure and defocus. Thus, capturing $\Delta CD$ with a single RV is inaccurate. However, modeling each transistor CD as an independent RV is also incorrect, since exposure, defocus, and context similarities lead to similarities (and correlation) between CD distributions. Therefore, to accurately represent CD in a design, we would prefer a separate RV for each transistor that would not only contain the moments ($\mu$, $\sigma$, etc.) of its actual CD distribution, but would also preserve its correlation to other transistors.

To verify the impact of topology on both nominal CD and CD sensitivity to changes in exposure and defocus, Figure 2 plots $CD_i$ (for some transistor, *i*, in the standard cell from Figure 1) as a function of lithography exposure. In Figure 2, four of the twelve $CD_i$'s (T1, T2, T6, and T9) are shown. When the actual distribution of exposure is input into the $CD_i$ function, the resulting CD distribution for transistor *i* has a unique mean and standard deviation, but is highly correlated to the other 11 distributions. The average CD (at each exposure setting) for the cell is also plotted and represents the single distribution CD model. Even though this is a simple example (the only transistors used to compute the average CD come from one standard cell and the only variation included is the lithography exposure variation), the single CD model still incurs an average error in standard deviation ($\sigma$) of ~9% when total variation ($\sigma/\mu$) is ~4%. The zoomed-in portion of Figure 2 emphasizes the difference in nominal CD for the transistors in the cell, as well as the difference in sensitivity (the difference in curvature) to changes in exposure.



Figure 2. Standard Cell Gate CD vs. Exposure

While there has been a significant amount of research on developing new lithography-aware characterization tools and determining how lithography impacts physical and electrical device parameters [6-8], to our knowledge no one has proposed an accurate, Transistor-Specific SSTA delay model. In [6], the authors developed a lithography simulation flow which they used to improve case-based analysis over current timing. While they showed improvement over static timing analysis (STA), it was not clear how their characterization could be extended to SSTA. An improved gate length extraction was proposed in [7] and used to improve timing accuracy in non-uniform device gates. Choi et al. in [8] designed a tool aimed at incorporating numerous sources of variation, such as proximity effects, lens aberrations, and Chemical-Mechanical Polishing (CMP). However, all of these previous approaches have focused on improving STA, and are therefore applicable only in the deterministic sense.

In this work we propose a novel CD and delay model for SSTA that captures the cell-level and transistor-level lithography effects by incorporating all of the systematic and probabilistic components due to exposure and defocus variation. Using a 90nm industrial library and lithography recipe (with industrial OPC), we verify our model against results obtained from a custom lithography-aware simulator. Then we compare the amount of error in our model to the amount of error in the current SSTA delay model. We found that our CD and delay model achieves a ~3X reduction in the error of standard deviation. Our approach uses Principal Component Analysis (PCA) to minimize the number of components used in our CD model so that we can effectively capture any CD distribution in our library with only 2 components. This CD model is then used in our proposed SSTA delay model. By utilizing PCA to reduce the number of components, the characterization runtime is on the same order as the current technique. Additionally, our model is compatible with existing SSTA frameworks [1,2] and we can easily incorporate other variation sources such as RDF and LER.

The remainder of this paper is organized as follows: Section 2 discusses the previous approach in more detail, while Section 3 describes our proposed model. Then Section 4 illustrates the results obtained in our standard cell characterization and delay model generation, and Section 5 concludes the paper.

## **2. PREVIOUS APPROACH**

Current SSTA methodologies perform all statistical operations on propagation delays in order to determine the final distribution for timing [1,2]. However, the propagation delay for a single gate is actually a function of a number of parameters that are affected by variation (e.g. gate length and threshold voltage). In this work, we focus on gate length variation. It is well known that propagation delay can be modeled as a linear or quadratic function of gate length, as shown in (1) and (2), respectively. These models typically provide a simple, but accurate, representation of delay in terms of gate length. From the models in (1) and (2), only  $\alpha$ ,  $\beta$  (and  $\lambda$ ), and the distribution for  $L_g$  are needed to calculate the delay variation.

$$Delay = \alpha + \beta L_{g} \tag{1}$$

$$Delay = \alpha + \beta L_{g} + \lambda L_{g}^{2} \tag{2}$$

In this work we chose to model delay as a quadratic function of gate length, as in (2), since quadratic models are capable of capturing some nonlinearity. Therefore, the delay models mentioned in the remainder of the paper are quadratic.

While (2) seems simplistic at first glance, its actual implementation within timing analysis (TA) is slightly more complicated. A brief description of present-day delay modeling and CD modeling therefore follows.

### 2.1. Delay Model

Equation (2) is a straightforward representation of the dependence of *Delay* on one input parameter,  $L_g$ . However, in reality delay is also dependent on the output loading of the gate and the slope or slew rate of the input signal. Additionally, a gate usually has more than one input-pin, and the time it takes for an input transition to propagate to the output can vary from input-pin to input-pin. Present-day timing analysis is able to manage these dependencies by utilizing data in the form of a lookup table. This lookup table is typically built during library characterization in the early stages of a standard cell library's lifetime. For every combination of output load and input slew, the characterization tool fits the input-to-output propagation delays as a function of gate length. Thus, for some gate in the library that has P input pins and S output-load/input-slew pairs, there will be  $2 \times P \times S$  values of each coefficient,  $\alpha$ ,  $\beta$ , and  $\lambda$  (the factor of two appears because there is a rising and falling transition associated with each pin). Example pseudo-code for delay model characterization is included in Figure 3. The characterization flow is also illustrated in Figure 4.
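The characterization loop described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the pseudo-code of Figure 3: `simulate_delay` is a hypothetical stand-in for a circuit simulation, and the pin indices, load/slew values, and gate length samples are made up for the example.

```python
import numpy as np

def simulate_delay(pin, edge, load, slew, lg):
    """Hypothetical stand-in for a circuit simulation; not a real device model."""
    base = 10.0 + 2.0 * pin + (1.5 if edge == "rise" else 0.0)
    return base + 40.0 * load + 0.3 * slew + 0.8 * (lg - 90.0) + 0.01 * (lg - 90.0) ** 2

def characterize(pins, load_slew_pairs, lg_samples):
    """Fit Delay = alpha + beta*Lg + lambda*Lg^2 for every pin, edge, and load/slew pair."""
    table = {}
    for pin in pins:
        for edge in ("rise", "fall"):
            for load, slew in load_slew_pairs:
                delays = [simulate_delay(pin, edge, load, slew, lg) for lg in lg_samples]
                # np.polyfit returns the highest-order coefficient first.
                lam, beta, alpha = np.polyfit(lg_samples, delays, 2)
                table[(pin, edge, load, slew)] = (alpha, beta, lam)
    return table

# Illustrative inputs (assumed values): P = 2 pins, S = 3 load/slew pairs.
pins = [0, 1]
load_slew_pairs = [(0.01, 20.0), (0.01, 80.0), (0.05, 20.0)]
lg_samples = np.linspace(80.0, 100.0, 7)

table = characterize(pins, load_slew_pairs, lg_samples)
print(len(table))  # number of coefficient sets, i.e. 2 * P * S
```

Each lookup-table entry holds one ($\alpha$, $\beta$, $\lambda$) triple, so the table size matches the $2 \times P \times S$ count given in the text.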



Figure 3. Delay Model Characterization Pseudo-code



Figure 4. Delay Model Characterization

Table 1. Percentage Deviation from Max CD (Nominal Exposure & Defocus)

| Transistor | % Deviation from Max CD (T1) | Transistor | % Deviation from Max CD (T1) |
|------------|------------------------------|------------|------------------------------|
| T1         | 0%                           | T7         | 0%                           |
| T2         | 4%                           | T8         | 2%                           |
| T3         | 4%                           | T9         | 2%                           |
| T4         | 4.4%                         | T10        | 3%                           |
| T5         | 4%                           | T11        | 2%                           |
| T6         | 2%                           | T12        | 3.4%                         |

### 2.2. CD Model

The other component needed to include CD variation within SSTA is a CD model, or a model for  $L_g$  in (2). As stated in Section 1, for any gate in the library at the same location, current SSTA frameworks typically model CD as a single RV and all devices within a standard cell vary identically. Process engineers determine this distribution by fabricating different test structure geometries, and measuring the samples across a number of dies and wafers. These measurements are then treated as the discrete samples that comprise the single distribution of gate length –  $L_g$ . Once  $L_g$  is known, this model can also be extended to include spatial correlation in CD. Our SSTA implementation of this model is referred to as the "Single-CD Library" model and is discussed in more detail in Section 4.1.1.

## **3. PROPOSED TRANSISTOR-SPECIFIC MODEL**

The probabilistic and systematic components of lithography variation due to exposure and defocus exist because of the role they play in the manufacturing process. Exposure and defocus in a lithographic system determine the amount of photoresist that is developed. Therefore, any deviation in exposure or defocus will lead to over- or under-development of the photoresist. This causes geometries to differ in stability and roughness, as well as deviate from the intended size [9–11]. The over- or under-development at a certain area of the die will cause probabilistic shifts in mean CD; however, the direction and magnitude of those shifts are dependent on neighborhood or context, which is systematic in nature. To illustrate this problem, we took the same standard cell (with OPC) in Figure 1 and ran a printed-image simulation at nominal exposure and defocus. The standard cell layout, optical proximity correction (OPC) recipe, and lithography system setup were all obtained from an industrial 90nm process. All geometries began with the same drawn CD; however, even when lithography printed-image simulation was run at nominal exposure and defocus settings, context dependencies arose. Table 1 contains the percentage deviation of each CD from the maximum CD (the CD for the transistor labeled "T1" in Figure 1). From this table it is clear that even at nominal settings where OPC is typically most effective, within-cell context dependencies emerge that lead to deviations in CD of ~4%. As stated in Section 1, these within-cell CD deviations are caused by a number of layout characteristics like geometry-to-geometry distance, line-end overhang, and distance to contact landings. Since there are hundreds of standard cells in a typical library and each cell will have different orientations/spacings of geometries, the need for a lithography-aware CD model is apparent.

Present-day, non-lithography-aware CD models can be viewed as the most rudimentary variation model: only one random variable is needed. The most complex model, on the other hand, would involve having an RV for each transistor in the library. In the 90nm library that we used, this would mean that SSTA would have to keep track of thousands of random variables for CD variation alone, which is unacceptable. However, in our work we hypothesized that since there are two main underlying components of CD variation, exposure and defocus, CD could be modeled as a function of ~2 components. Furthermore, when we performed printed-image simulations (over the entire range of exposure and defocus) on all of the standard cells in our library, we discovered that most of the transistor CD distributions were highly correlated (>0.9), as expected, since the distributions were created by two common variation sources. These experiments suggested that a compression technique, such as Principal Component Analysis (PCA) [12], would allow us to reduce the number of RV's by >3 orders of magnitude, while still preserving the actual correlations that arise due to the common variation sources and layout commonalities.

Figure 5. Normalized CD Distribution PCA Coefficients

To test our theory, we used lithography-aware simulations (discussed in Section 3.3) to generate CD distributions for every device in our library (all transistors within every standard cell). These distributions were then treated as distinct RV's and decomposed using PCA. We determined that ~99.9% of the total variance of each RV could be captured with the first two principal components. This fact is further illustrated in Figure 5, which shows a scatterplot of the first 60 PCA coefficients (out of a total of ~200) for an arbitrary transistor in our library. As can be seen, the first two coefficients dominate the remaining ones. This means that instead of using ~200 RV's to accurately model CD variation for every device in our library, we only need 2.
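To see why two components can suffice, the following sketch applies PCA to synthetic CD samples driven by two shared sources. The data is entirely hypothetical (random sensitivities, a small independent noise term) and stands in for the litho-aware simulation results; it only illustrates the compression idea, not the library used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_transistors = 5000, 200

# Two shared variation sources (standing in for exposure and defocus).
sources = rng.standard_normal((n_samples, 2))

# Hypothetical per-transistor mean CDs and sensitivities to the two sources.
mean_cd = 90.0 + rng.uniform(-2.0, 2.0, n_transistors)
sens = np.column_stack([
    rng.uniform(1.0, 2.0, n_transistors),    # sensitivity to source 1
    rng.uniform(-0.5, 0.5, n_transistors),   # sensitivity to source 2
])

# CD samples: mean + shared sources + a small independent residual (assumed).
cd = mean_cd + sources @ sens.T + 0.02 * rng.standard_normal((n_samples, n_transistors))

# PCA via eigen-decomposition of the correlation matrix of the CD samples.
corr = np.corrcoef(cd, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]     # eigenvalues in descending order
explained = eigvals / eigvals.sum()

print(f"variance captured by first two components: {explained[:2].sum():.4f}")
```

Because every synthetic transistor responds to the same two sources, nearly all of the variance lands in the first two eigenvalues, which mirrors the behavior reported for the actual library.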

The PCA compression technique is used as the basis of our Transistor-Specific (Xtor-Spfc) CD and delay models. They are described next in Section 3.1. Section 3.2 outlines the entire Xtor-Spfc characterization flow, while Section 3.3 briefly discusses the custom lithography-aware simulator used in our experiments.

### 3.1. Transistor-Specific CD and Delay Models

Since we use PCA in our CD model, CD can be analytically expressed as:

$$\begin{aligned} L_{jk} &= \mu_{L_{jk}} + a_{jk}X_1 + b_{jk}X_2 \\ a_{jk} &= \sigma_{L_{jk}}v_{jk,1}\sqrt{\lambda_1} \\ b_{jk} &= \sigma_{L_{jk}}v_{jk,2}\sqrt{\lambda_2} \end{aligned} \tag{3}$$

In (3), $L_{jk}$ is the CD distribution of a particular transistor, *j*, contained in the $k^{th}$ standard cell of the library. Specifically, $\mu_{L_{jk}}$ is the mean CD of the device (determined during Litho-Aware simulation), $a_{jk}$ and $b_{jk}$ are the first two PCA coefficients, and $X_1$ and $X_2$ are the principal components, which are standard, normal RV's. The coefficients, $a_{jk}$ and $b_{jk}$, are calculated as described in (3); $\sigma_{L_{jk}}$ is the standard deviation of the device CD, $v_{jk,1}$ and $v_{jk,2}$ are the $jk^{th}$ elements of the first and second eigenvectors, respectively, while $\lambda_1$ and $\lambda_2$ are the first and second eigenvalues. For a more detailed theoretical description of PCA we refer the reader to [12]. This model is referred to as the Xtor-Spfc CD model for the remainder of the paper.
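A minimal sketch of how the coefficients in (3) could be computed from CD samples is given below. It assumes that PCA is performed on the correlation matrix of the samples (consistent with the $\sigma_{L_{jk}}$ scaling in (3)); the function name `xtor_cd_model` and the four-transistor data set are hypothetical.

```python
import numpy as np

def xtor_cd_model(cd_samples):
    """Return (mu, a, b) of Eq. (3) from an (n_samples, n_transistors) array."""
    mu = cd_samples.mean(axis=0)
    sigma = cd_samples.std(axis=0, ddof=1)
    corr = np.corrcoef(cd_samples, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    lam1, lam2 = eigvals[-1], eigvals[-2]     # two largest eigenvalues
    v1, v2 = eigvecs[:, -1], eigvecs[:, -2]   # matching eigenvectors
    a = sigma * v1 * np.sqrt(lam1)
    b = sigma * v2 * np.sqrt(lam2)
    return mu, a, b

# Hypothetical data: four transistors driven by two shared sources.
rng = np.random.default_rng(1)
shared = rng.standard_normal((20000, 2))
true_sens = np.array([[1.6, 0.3], [1.2, -0.4], [1.9, 0.1], [1.4, 0.5]])
cd_samples = np.array([90.0, 88.5, 91.0, 89.2]) + shared @ true_sens.T

mu, a, b = xtor_cd_model(cd_samples)

# With X1 and X2 uncorrelated and of unit variance, the model implies
# sigma^2 = a^2 + b^2 and corr(j, k) = (a_j a_k + b_j b_k) / (sigma_j sigma_k).
sigma_model = np.sqrt(a ** 2 + b ** 2)
corr_model = (np.outer(a, a) + np.outer(b, b)) / np.outer(sigma_model, sigma_model)
print(sigma_model)
print(corr_model.round(3))
```

The two printed quantities show that each transistor keeps its own standard deviation while the pairwise correlations are preserved through the shared $X_1$ and $X_2$.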

The Xtor-Spfc CD model is used directly in (2) to generate our Xtor-Spfc delay model. To determine which $L_{jk}$ is actually used in the delay model, we merely choose the transistor associated with the specific pin-to-pin transition in question. For instance, if we're characterizing the rising delay transition of a minimum-sized inverter, then the $L_{jk}$ that we use in the model is the PMOS CD distribution (assuming single input switching). If the device happens to have multiple fingers, then we choose any one of the devices (since all devices are highly correlated).

Figure 6. Proposed Transistor-Specific Delay Model
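Substituting (3) into (2) gives a delay that is quadratic in a Gaussian quantity, so its first two moments have a closed form. The sketch below works this out under the assumption that $X_1$ and $X_2$ are independent standard normal RV's, and checks the result against sampling; all coefficient values are hypothetical.

```python
import numpy as np

def delay_moments(alpha, beta, lam, mu, a, b):
    """Mean and standard deviation of Delay = alpha + beta*L + lam*L^2,
    with L = mu + a*X1 + b*X2 and X1, X2 independent standard normals."""
    s2 = a * a + b * b                       # variance of L
    slope = beta + 2.0 * lam * mu            # d(Delay)/dL at L = mu
    mean = alpha + beta * mu + lam * (mu * mu + s2)
    var = slope * slope * s2 + 2.0 * lam * lam * s2 * s2
    return mean, np.sqrt(var)

# Hypothetical delay coefficients and CD model for one pin-to-pin transition.
alpha, beta, lam = 5.0, 0.2, 0.004
mu, a, b = 90.0, 1.5, 0.4

mean, std = delay_moments(alpha, beta, lam, mu, a, b)

# Sampling check of the closed-form moments.
rng = np.random.default_rng(2)
x1, x2 = rng.standard_normal((2, 200_000))
lg = mu + a * x1 + b * x2
delay = alpha + beta * lg + lam * lg ** 2

print(f"analytic: mean={mean:.4f}, std={std:.4f}")
print(f"sampled:  mean={delay.mean():.4f}, std={delay.std():.4f}")
```

The variance expression uses $\mathrm{Var}(\Delta L^2) = 2\sigma^4$ and $\mathrm{Cov}(\Delta L, \Delta L^2) = 0$ for a zero-mean Gaussian $\Delta L$.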

### 3.2. Transistor-Specific Characterization

The proposed Transistor-Specific model characterization flow is presented in Figure 6. It uses the Litho-Aware simulator, depicted in Figure 7 and described in Section 3.3, to determine the CD distributions for all of the transistors contained in every standard cell in our library. Then it runs PCA on the entire set of CD distributions (each CD distribution represents a distinct RV) and calculates (3), our Xtor-Spfc CD equation, based on the first two principal components. Not only are these CD equations used directly within SSTA in determining the delay distributions, but they are also used to generate gate length samples used in the HSPICE delay sensitivity characterization (the $L_{jk}$'s are used as the $L_g$'s in the pseudo-code in Figure 3). Because the CD distributions, the $L_{jk}$'s, are independent of the output loading and input slew, we only need to run the Xtor-Spfc model generation once per standard cell. When all of the CD distributions have been simulated for every cell in the library, a limited set of samples is chosen to obtain an accurate quadratic fit for delay. As a result, the runtime of the proposed Xtor-Spfc model is of the same order as existing approaches.

It is important to note that in practice, exposure and defocus in a lithographic system gradually vary from one die location to the next. As a result, both exposure and defocus variations tend to affect closely spaced devices in a similar manner, making them more likely to have comparable CD's than those placed far apart. Therefore, it is important to capture spatial dependencies between the CD variation of two devices in addition to characterizing the proximity dependence of layout. Process engineers currently utilize test structures to determine the correlations that exist in a given process. Similarly, our model could use a test-structure-based method of extracting correlation. The test structures themselves would consist of a few representative standard cells chosen from our design library. These library cells would be replicated across the die and then fabricated at a manufacturing facility. Much like existing procedures, our RV's $X_1$ and $X_2$ would be extracted from the manufactured data at each location in a die, across all dies, allowing both the intra- and inter-die correlation to be calculated.

### 3.3. Litho-Aware Simulation

Our Transistor-Specific characterization is built around a number of industry IC design tools. A flow chart for the simulator is shown in Figure 7. The Litho-Aware simulator receives a graphic data system (GDS) layout file as the main input, which contains the drawn layout of the intended design. In our library characterization, all standard cell polysilicon has industrial OPC's, but the tool is also capable of adding corrections prior to running the printed image simulation. Next, it conditionally places neighboring geometries adjacent to all edges of the circuit under simulation so that context dependencies can be analyzed. Then, using Mentor Graphics' Calibre, a printed image simulation is performed on either the original GDS or the modified, context-inclusive GDS. The simulated printed image is then written to a new GDS file, which is input to an extraction tool. We use Calibre again, as well as an industrial extraction tool, to extract the SPICE netlist and obtain actual gate length values. After running our tool, there are two outputs at the user's disposal: the printed image GDS and the extracted netlists.

## **4. RESULTS**

During our library characterization, we first analyzed the gate length and delay distributions, and then explored the accuracy of three delay models: the Single-CD Library (SCDL), Cell-Specific (Cell-Spfc), and Transistor-Specific (Xtor-Spfc) models. Both the SCDL and Xtor-Spfc models were discussed previously in Sections 2.2 and 3.1, respectively. The Cell-Spfc model is a variant of the SCDL model and is described in Section 4.1.2. The accuracy of each of the models was found by comparing its standard deviation for delay to our "Golden" result. The Golden result for each standard cell is a discrete distribution that consists of 10,000 delay samples. Each delay sample corresponds to a printed image simulation that has been extracted and characterized in HSPICE at a particular exposure/defocus setting. Each exposure/defocus pair is sampled from the joint-normal, bivariate distribution of exposure and defocus. As stated earlier, this work utilized an industrial 90nm process and an industrial lithography recipe (with industrial OPC). Since 90nm is a stable process, and variation is expected to increase as we move from 65nm to 45nm and beyond, we performed our library characterization, model generation, and analysis twice. In the first iteration, exposure and defocus were varied according to typical 90nm process values, but in the second iteration we increased variability so as to mimic the effects of moving from a 90nm lithographic process to 65nm. The scaling factors used to increase variability were obtained from [13]. For the remainder of this work, we refer to the typical 90nm variation as "90nm" or small variation and the scaled 90nm variation as "pseudo-65nm" or large variation. We chose this experimental procedure because the 65nm data needed for this work (standard cells, device models, and process data) was unavailable when this research was conducted. Future work therefore includes repeating our experiments at next-generation process nodes when the data becomes available.
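The construction of a Golden distribution can be sketched as follows. The mean vector, covariance matrix, and the `golden_delay` function are assumptions made purely for illustration: the actual process statistics are not given here, and in this work each delay sample comes from printed-image simulation, extraction, and HSPICE characterization rather than from a closed-form expression.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed process statistics in normalized units (hypothetical values).
mean = np.array([0.0, 0.0])          # nominal exposure and defocus offsets
cov = np.array([[1.0, 0.2],
                [0.2, 1.0]])         # joint-normal covariance

# 10,000 exposure/defocus pairs, one per Golden delay sample.
pairs = rng.multivariate_normal(mean, cov, size=10_000)

def golden_delay(exposure, defocus):
    """Placeholder for printed-image simulation, extraction, and circuit simulation."""
    cd = 90.0 - 1.5 * exposure - 0.4 * defocus ** 2
    return 5.0 + 0.2 * cd + 0.004 * cd ** 2

golden = golden_delay(pairs[:, 0], pairs[:, 1])
golden_sigma = golden.std(ddof=1)
print(f"Golden sigma/mu = {golden_sigma / golden.mean():.3%}")
```

A model's accuracy is then judged by how closely its delay standard deviation matches `golden_sigma`.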

The remainder of this section is divided as follows: Section 4.1 begins by describing our experimental setup. Then, Section 4.2 discusses the general trends observed in the CD and delay distributions, and includes a brief discussion of observed within-cell context dependencies. Lastly, Section 4.3 includes our model comparisons for both variability cases. Note that in either case we did not include neighborhood characterization between cells because industry sources informed us that polysilicon geometries would be more or less regular from the 45nm process node onward, reducing neighborhood effects [13]. Thus, we leave neighborhood analysis as future work.

### 4.1. Experimental Setup

Our experimental results compare three different gate delay models: the SCDL, Cell-Spfc, and Xtor-Spfc models. Refer to Section 3 for the details pertaining to our proposed Transistor-Specific models.

#### 4.1.1. Single-CD Library Model

For this work, we required a representative model that would demonstrate the amount of error incurred by ignoring within-cell and cell-to-cell lithography effects. This model is based on the current SSTA approach discussed in Section 2.2 and is referred to as the Single-CD Library model, or SCDL, for the remainder of the paper. Essentially, our custom Litho-Aware simulator (described in Section 3.3) samples a joint-normal, bivariate distribution of exposure and defocus and determines all of the transistor CD distributions for every standard cell in the library. Next, all of the samples from the transistor CD distributions are collected into one RV. This RV, L, represents the single CD distribution mentioned in Section 2.2, and we use the moments of L to derive  $L_g$ .

$$L_{g} = \mu_L + \sigma_L X_1 \tag{4}$$

Here, $\mu_L$ and $\sigma_L$ are the mean and standard deviation, respectively, of the single gate length distribution, $L$, and $X_1$ is a standard, normal RV (with zero mean and unit variance). Finally, the delay distribution for each cell is calculated by substituting $L_g$ into (2).

#### 4.1.2. Cell-Specific Model

In addition to the Transistor-Specific model proposed in Section 3, we also explored a variant of the SCDL model, which we refer to as the "Cell-Specific" (Cell-Spfc) model. This model uses the same basic procedure described in Section 4.1.1, except for one key difference: instead of collecting CD distributions from the entire library into one RV, CD distributions from each cell are collected into a local gate length distribution. For example, assume for the moment that the cell we are characterizing is a minimum-sized, 2-input NAND gate with a total of four transistors:  $NMOS_1$ ,  $NMOS_2$ ,  $PMOS_1$ , and  $PMOS_2$ . After Litho-Aware simulation, all of the CD distribution samples for these four transistors are collected into one RV,  $L_{NAND2}$ , and we then calculate  $L_{g,NAND2}$  as seen in (5).

$$L_{g,NAND2} = \mu_{L_{NAND2}} + \sigma_{L_{NAND2}} X_1 \tag{5}$$

Therefore, in the Cell-Spfc model, each standard cell within the library will have a different  $L_{g,CELL}$ , but similar to the SCDL model, all transistors within the same cell will have identical  $L_{g,CELL}$ 's. These distinct  $L_{g,CELL}$ 's are then substituted into (2) on a cell-by-cell basis.
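The difference between the two pooled models can be illustrated with a small synthetic library. The cell names, mean CDs, and sensitivities below are hypothetical, a single shared source is used for brevity, and `pct_sigma_error` is an assumed definition of percentage error in $\sigma$; the sketch only shows how pooling samples hides per-transistor differences.

```python
import numpy as np

rng = np.random.default_rng(4)
exposure = rng.standard_normal(10_000)     # single shared source, for brevity

# Hypothetical library: (mean CD, sensitivity to exposure) for each transistor.
library = {
    "INV":   [(90.0, 1.0), (89.0, 1.1)],
    "NAND2": [(91.0, 1.6), (90.5, 1.5), (88.0, 2.0), (88.5, 2.1)],
}
samples = {
    cell: np.column_stack([m + s * exposure for m, s in xtors])
    for cell, xtors in library.items()
}

# SCDL, as in (4): pool every transistor sample in the library into one RV.
pooled = np.concatenate([cd.ravel() for cd in samples.values()])
scdl_mu, scdl_sigma = pooled.mean(), pooled.std()

# Cell-Spfc, as in (5): pool the samples of each cell separately.
cell_sigma = {cell: cd.std() for cell, cd in samples.items()}

def pct_sigma_error(model_sigma, true_sigma):
    """Assumed error metric: absolute relative error in sigma, in percent."""
    return 100.0 * abs(model_sigma - true_sigma) / true_sigma

scdl_errors, cell_errors = [], []
for cell, cd in samples.items():
    for j in range(cd.shape[1]):
        true_sigma = cd[:, j].std()        # per-transistor CD sigma
        scdl_errors.append(pct_sigma_error(scdl_sigma, true_sigma))
        cell_errors.append(pct_sigma_error(cell_sigma[cell], true_sigma))

print(f"avg sigma error: SCDL {np.mean(scdl_errors):.1f}%, Cell-Spfc {np.mean(cell_errors):.1f}%")
```

Note that pooling also folds the spread of the per-transistor means into the pooled standard deviation, which is one reason a pooled $\sigma$ can differ noticeably from that of any individual device.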

### 4.2. CD and Delay Distributions

Using our characterization tool, we analyzed 22 different standard cells under varying amounts of exposure and defocus. We discovered that with the pseudo-65nm process variation setup, our library had an average gate length distribution $3\sigma/\mu$ of ~18% and an average delay distribution $3\sigma/\mu$ of ~15%. Additionally, we verified the effect that layout topology had on the CD and delay distributions. Our experiments showed that both the CD and delay distributions were different for transistors within the same cell, as well as for transistors from two different cell types. For example, Figure 8 contains the probability density function (PDF) for a 4-finger, 2-input NOR gate (composed of 16 transistors total). Included in the plot are 3 of the 16 CD distributions: two NMOS and one PMOS. All three transistors are normalized to the PMOS device. From this figure, it is apparent that these distributions differ in mean and standard deviation by a few percent, thereby confirming that ignoring within-cell variation is inaccurate. The amount of inaccuracy is quantified in the following section.

Figure 8. PDF for Various Transistors in a 4-finger, 2-input NOR gate

### 4.3. Model Comparison

As mentioned previously, the three models discussed in Section 4.1 are compared in this section and each model fits delay as a quadratic function of CD, as in (2). We found that when comparing the three delay models to our Golden result, each model had about the same average error in mean (~1%), but the error in standard deviation ($\sigma$) differed considerably. The resulting error in $\sigma$ for each model is displayed in Table 2. Both variation cases (Pseudo-65nm and 90nm) are included in Table 2; however, unless otherwise mentioned, the remaining results discussed in this paper pertain to the Pseudo-65nm data.

From Table 2, it is apparent that both of our delay models, the Cell-Spfc and Xtor-Spfc, are more accurate than the current SSTA delay model, SCDL. The SCDL delay model has an average error in  $\sigma$  of 11.8%, and has a worst case error of 39%. Our proposed delay model, the Xtor-Spfc model, reduces average  $\sigma$  error by 2.9X and has a worst case error of ~16% (a 2.4X improvement).
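The headline numbers quoted above can be reproduced from the Pseudo-65nm entries of Table 2, assuming (this is not stated explicitly) that the reported averages are the mean of the rise and fall errors and that the worst-case figures take the larger of the two.

```python
# Pseudo-65nm entries from Table 2 (% error in sigma), given as (rise, fall).
avg_err = {"SCDL": (10.9, 12.7), "Xtor-Spfc": (3.4, 4.7)}
wc_err = {"SCDL": (38.0, 39.4), "Xtor-Spfc": (16.1, 8.7)}

# Assumption: the quoted averages are the mean of the rise and fall errors.
scdl_avg = sum(avg_err["SCDL"]) / 2
xtor_avg = sum(avg_err["Xtor-Spfc"]) / 2
avg_reduction = scdl_avg / xtor_avg

# Assumption: the quoted worst case is the larger of the rise and fall errors.
wc_reduction = max(wc_err["SCDL"]) / max(wc_err["Xtor-Spfc"])

print(f"average sigma error: {scdl_avg:.2f}% -> {xtor_avg:.2f}% ({avg_reduction:.2f}X)")
print(f"worst-case sigma error reduction: {wc_reduction:.2f}X")
```

Under these assumptions the ratios round to the 2.9X average and 2.4X worst-case improvements stated in the text.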

To visualize the accuracy improvement achieved by the Cell-Spfc and Xtor-Spfc models, Figures 9 and 10 plot the standard deviation of delay for the three models against the Golden standard deviation. In these plots, each point represents a model's standard deviation for one input-to-output propagation delay distribution (there are ~50 different pin-to-pin transitions for the 22 standard cells in our library). Ideally, the models would fall directly on the black line (y = x), where Model $\sigma$ = Golden $\sigma$. It is clear that the SCDL model is consistently furthest from the line, followed by the Cell-Spfc model, while the Xtor-Spfc model is the most accurate. This confirms what we observed in Table 2.

The two example CDF graphs in Figure 11 and Figure 12 show similar results: the Xtor-Spfc and Cell-Spfc models follow the Golden result more closely than the SCDL model. However, here the shortcomings of the Cell-Spfc model become apparent. When we are dealing with a simple standard cell, such as the minimum-sized inverter in Figure 11, the Cell-Spfc model is almost as accurate as the Xtor-Spfc model. But when the models are used on more complex cells, such as the AND-OR-Invert gate in Figure 12 or standard cells with fingered transistors, the Cell-Spfc model has nearly as much error as SCDL, since it collects many within-cell CD distributions into one RV, similar to the SCDL model.

Table 2. Absolute Error in Standard Deviation (from Golden Distribution)

|                 | Pseudo-65nm<br>(Avg. σ/μ = 4.9%) |       | 90nm<br>(Avg. σ/μ = 2.9%) |       |
|-----------------|----------------------------------|-------|---------------------------|-------|
|                 | % Error in $\sigma$              |       | % Error in $\sigma$       |       |
|                 | Rise                             | Fall  | Rise                      | Fall  |
| SCDL - Avg      | 10.9%                            | 12.7% | 14.3%                     | 15.0% |
| Cell-Spfc - Avg | 8.7%                             | 11.4% | 9.3%                      | 9.3%  |
| Xtor-Spfc - Avg | 3.4%                             | 4.7%  | 2.2%                      | 1.4%  |
| SCDL - WC       | 38.0%                            | 39.4% | 41.7%                     | 38.3% |
| Cell-Spfc - WC  | 38.2%                            | 39.3% | 36.0%                     | 30.8% |
| Xtor-Spfc - WC  | 16.1%                            | 8.7%  | 15.4%                     | 8.8%  |
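The effect of collecting several within-cell CD distributions into one RV can be seen in a toy calculation. The sketch below is not the Cell-Spfc or Xtor-Spfc model itself; it assumes a first-order (linear) delay sensitivity per transistor, CD deviations that are all driven by one shared standard normal variable, and a lumped RV whose $\sigma$ is the RMS of the per-transistor CD sigmas. All sensitivities (ps/nm) and CD sigmas (nm) are hypothetical.

```python
import math

def delay_sigma_specific(sens, cd_sigma):
    """Delay sigma when each transistor keeps its own CD sigma.

    Assumes every CD deviation is cd_sigma[i] * Z for one shared standard
    normal Z, so the first-order delay sigma is the sum of sens[i] * cd_sigma[i].
    """
    return sum(s * c for s, c in zip(sens, cd_sigma))

def delay_sigma_lumped(sens, cd_sigma):
    """Delay sigma when a single pooled (RMS) CD sigma is applied to every transistor."""
    pooled = math.sqrt(sum(c * c for c in cd_sigma) / len(cd_sigma))
    return pooled * sum(sens)

def pct_error(approx, exact):
    """Absolute percentage error of approx relative to exact."""
    return 100.0 * abs(approx - exact) / exact

# Hypothetical simple cell: one CD distribution, so lumping changes nothing.
simple = ([0.45], [2.0])
# Hypothetical complex cell: the most delay-sensitive device has the tightest CD.
complex_cell = ([0.30, 0.10, 0.05], [1.0, 2.5, 3.0])

for name, (sens, cd_sigma) in (("simple", simple), ("complex", complex_cell)):
    exact = delay_sigma_specific(sens, cd_sigma)
    lumped = delay_sigma_lumped(sens, cd_sigma)
    print(f"{name}: lumped-RV error in sigma = {pct_error(lumped, exact):.1f}%")
```

Under these assumptions the lumped RV is exact when the cell has a single CD distribution and drifts once the delay-critical transistor has a CD spread that differs from the pooled value, which is the qualitative behavior described above.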

#### 5. CONCLUSION

In this work we proposed a transistor-specific CD model and its corresponding delay model. We then used a custom Litho-Aware simulation tool to compare our models to existing SSTA models, and determined the absolute error of our Xtor-Spfc CD and delay models. We found that the modern SSTA delay modeling approach is error-prone and can sometimes lead to twice as much error as total variation. All in all, our proposed SSTA delay model achieves average error reductions in standard deviation of ~3X when compared to current models and can be easily incorporated in existing SSTA frameworks.

#### Acknowledgement

The authors would like to thank Freescale Semiconductor, Mentor Graphics, Semiconductor Research Corporation, and the University of Michigan for their support in this work.

#### References

- [1] H. Chang and S.S. Sapatnekar, "Statistical Timing Analysis under Spatial Correlations," in IEEE Trans. on CAD, Vol. 24, Issue 9, pp. 1467-1482, Sept. 2005.
- [2] C. Visweswariah, et al., "First-Order Incremental Block-Based Statistical Timing Analysis," in IEEE Trans. on CAD, Vol. 25, Issue 10, pp. 2170-2180, Oct. 2006.
- [3] A. Agarwal, et al., "Statistical Delay Computation Considering Spatial Correlations," in Proc. of ASP-DAC, pp. 271-276, Jan. 2003.
- [4] H. Chang, et al., "Parameterized Block-based Statistical Timing Analysis with Non-Gaussian Parameters, Nonlinear Delay Functions," in Proc. of DAC, June 2005.
- [5] L. Zhang, et al., "Correlation-Preserved Non-Gaussian Statistical Timing Analysis with Quadratic Timing Model," in Proc. of DAC, pp. 83-88, June 2005.
- [6] K. Cao, J. Hu, S. Dobre, "Standard Cell Characterization Considering Lithography Induced Variations," in Proc. of DAC, pp. 801-804, July 2006.
- [7] P. Gupta, et al., "Modeling of Non-Uniform Device Geometries for Post-Lithography Circuit Analysis," in Proc. of SPIE, Vol. 6156, 61560U, March 2006.
- [8] M. Choi and L. Milor, "Impact on Circuit Performance of Deterministic Within-Die Variation in Nanoscale Semiconductor Manufacturing," in IEEE Trans. on CAD, Vol. 25, Issue 7, pp. 1350-1367, July 2006.

Figure 10. Rise Delay Standard Deviation Comparison - Normalized (Pseudo-65nm Variation). Plot title: "Rise Delay Standard Deviation: Model vs. Golden."

- [9] A. Pawloski, et al., "Line edge roughness and intrinsic bias for two methacrylate polymer resist systems," in J. Microlith., Microfab., Microsyst., Vol. 5, Issue 2, 023001, May 2006.
- [10] M. D. Stewart, et al., "Diffusion-induced line-edge roughness," in Proc. of SPIE, Vol. 5039, pp. 415-422, June 2003.
- [11] T. Yamaguchi and H. Namatsu, "Generation mechanism of surface roughness in resists: free volume effect on surface roughness," in Proc. of SPIE, Vol. 4690, pp. 921-928, July 2002.
- [12] I.T. Jolliffe, Principal Component Analysis, Springer Series in Statistics, New York, 1986.
- [13] Personal Communication with Andres Torres.



Figure 11. Minimum-sized Inverter Fall Delay Transition CDF (90nm Variation)



