3.4 Physical Design, Extraction and Timing Analysis


Date: Tuesday, March 26, 2019
Time: 14:30 - 16:00
Location / Room: Room 4

Chair:
Patrick Groeneveld, Cadence Design Systems, US

Co-Chair:
Mark Po-Hung Lin, National Chung Cheng University, TW

The first paper uses multivariate linear regression to increase the efficiency of corner-based timing analysis. The second paper proposes an approach for zero-skew clock tree construction yielding superior wirelength performance. The following two papers present macro placement algorithms: one adopting a dataflow-driven approach, the other using a routability-driven convolutional neural network predictor. The last paper addresses issues of reusability and reproducibility in parallelized random-walk-based capacitance extraction.

Time Label Presentation Title
Authors
14:30 3.4.1 "UNOBSERVED CORNER" PREDICTION: REDUCING TIMING ANALYSIS EFFORT FOR FASTER DESIGN CONVERGENCE IN ADVANCED-NODE DESIGN
Speaker:
Uday Mallappa, University of California San Diego, US
Authors:
Andrew Kahng¹, Uday Mallappa², Lawrence Saul³ and Shangyuan Tong²
¹UCSD, US; ²University of California San Diego, US; ³UC San Diego, US
Abstract
With diminishing margins for leading-edge products in advanced technology nodes, design closure and accuracy of timing analysis have emerged as serious concerns. A significant portion of design turnaround time is spent on timing analysis at combinations of process, voltage and temperature (PVT) corners. At the same time, accurate, signoff-quality timing analysis is desired during place-and-route and optimization steps, to avoid loops in the flow as well as overdesign that wastes area and power. We observe that timing results for a given path at different corners will have strong correlations, if only as a consequence of physics of devices and interconnects. We investigate a data-driven approach, based on multivariate linear regression, to predict the timing analysis at unobserved corners from analysis results at observed corners. We use a simple backward stepwise selection strategy to choose which corners to observe and which to predict. In order to accelerate convergence of the design process, the model must yield predicted values (from analysis at a limited number of observed corners) that are sufficiently accurate to substitute for unobserved ones. Our empirical results indicate that this is likely the case. With a 1M-instance example in foundry 16nm enablement, we obtain a model based on 10 observed corners that predicts timing results at the remaining 48 unobserved corners with less than 0.5% relative root mean squared error, and 99% of the model's relative prediction errors are less than 0.6%.
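As a rough illustration of the regression idea in this abstract, the Python sketch below fits a multivariate linear model that predicts path delays at unobserved corners from delays at a few observed corners. Everything in it is an assumption made for illustration: the synthetic delay data, the hand-picked observed corners (the abstract describes a backward stepwise selection, which is not shown here), and the helper names `fit_corner_model`, `predict_corners` and `relative_rmse`. It is not the authors' implementation.

```python
import numpy as np


def fit_corner_model(observed, unobserved):
    """Least-squares fit so that unobserved ~= [observed, 1] @ coef."""
    X = np.hstack([observed, np.ones((observed.shape[0], 1))])  # add intercept column
    coef, *_ = np.linalg.lstsq(X, unobserved, rcond=None)
    return coef


def predict_corners(coef, observed):
    """Apply a fitted model to delays measured at the observed corners."""
    X = np.hstack([observed, np.ones((observed.shape[0], 1))])
    return X @ coef


def relative_rmse(pred, actual):
    """Root mean squared value of the relative prediction error."""
    return float(np.sqrt(np.mean(((pred - actual) / actual) ** 2)))


# Synthetic data (hypothetical): each path delay is a gate part plus a wire
# part, and each corner scales the two parts differently.
rng = np.random.default_rng(0)
n_paths, n_corners = 2000, 8
gate = rng.uniform(0.2, 1.0, n_paths)
wire = rng.uniform(0.1, 0.5, n_paths)
gate_scale = np.linspace(0.7, 1.6, n_corners)
wire_scale = np.linspace(1.5, 0.8, n_corners)
delays = np.outer(gate, gate_scale) + np.outer(wire, wire_scale)
delays *= 1 + 0.002 * rng.standard_normal(delays.shape)  # small per-entry noise

# Observed corners are picked by hand here (an assumption of this sketch).
obs_idx = [0, 3, 7]
unobs_idx = [c for c in range(n_corners) if c not in obs_idx]
obs, unobs = delays[:, obs_idx], delays[:, unobs_idx]

# Fit on the first 1500 paths, evaluate on the remaining 500.
coef = fit_corner_model(obs[:1500], unobs[:1500])
err = relative_rmse(predict_corners(coef, obs[1500:]), unobs[1500:])
print(f"relative RMSE on held-out paths: {err:.4%}")
```

The sketch solves one least-squares problem for all unobserved corners at once; each column of `coef` is the model for one predicted corner.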
15:00 3.4.2 DIM SUM: LIGHT CLOCK TREE BY SMALL DIAMETER SUM
Speaker:
Gengjie Chen, The Chinese University of Hong Kong, HK
Authors:
Gengjie Chen and Evangeline Young, The Chinese University of Hong Kong, HK
Abstract
By revisiting the classical deferred-merge embedding (DME) algorithm, we find an intrinsic relationship between the zero-skew tree (ZST) problem and the hierarchical clustering (HC) problem. More specifically, the wire length of a ZST is proved to be a linear function of the sum of diameters of its corresponding HC. With this new insight, an effective O(n log n)-time O(1)-approximation algorithm and an optimal dynamic programming algorithm for ZST are designed. Using the ZST construction as a black box together with a linear-time optimal tree decomposition algorithm, an improved algorithm for constructing the bounded-skew tree (BST) is derived. In the experiments, our approach shows superior wire length compared with previous methods for both ZST and BST.
15:15 3.4.3 ROUTABILITY-DRIVEN MACRO PLACEMENT WITH EMBEDDED CNN-BASED PREDICTION MODEL
Speaker:
Yu-Hung Huang, National Taiwan University of Science and Technology, TW
Authors:
Yu-Hung Huang¹, Zhiyao Xie², Guan-Qi Fang¹, Tao-Chun Yu¹, Haoxing Ren³, Shao-Yun Fang¹, Yiran Chen² and Jiang Hu⁴
¹National Taiwan University of Science and Technology, TW; ²Duke University, US; ³NVIDIA Corporation, US; ⁴Texas A&M University, US
Abstract
With the dramatic shrinking of feature size and the advance of semiconductor technology nodes, numerous and complicated design rules need to be followed, and a chip design can only be taped out after passing design rule check (DRC). The high design complexity seriously deteriorates design routability, which can be measured by the number of DRC violations after the detailed routing stage. In addition, a modern large-scale design typically consists of many huge macros due to the wide use of intellectual properties (IPs). Empirically, the placement of these macros greatly determines routability, while there exists no effective cost metric to directly evaluate a macro placement because of the extremely high complexity and unpredictability of cell placement and routing. In this paper, we propose the first work on routability-driven macro placement with deep learning. A convolutional neural network (CNN)-based routability prediction model is proposed and embedded into a macro placer such that a good macro placement with minimized DRC violations can be derived through a simulated annealing (SA) optimization process. Experimental results show the accuracy of the predictor and the effectiveness of the macro placer.
15:30 3.4.4 RTL-AWARE DATAFLOW-DRIVEN MACRO PLACEMENT
Speaker:
Alexandre Vidal Obiols, Polytechnic University of Catalonia, ES
Authors:
Alex Vidal-Obiols¹, Jordi Cortadella¹, Jordi Petit¹, Marc Galceran-Oms² and Ferran Martorell²
¹UPC, ES; ²eSilicon EMEA, Barcelona, ES
Abstract
When RTL designers define the hierarchy of a system, they exploit their knowledge about the conceptual abstractions devised during the design and the functional interactions between the logical components. This valuable information is often lost during physical synthesis. This paper proposes a novel multi-level approach for the macro placement problem of complex designs dominated by macro blocks, typically memories. By taking advantage of the hierarchy tree, the netlist is divided into blocks containing macros and standard cells, and their dataflow affinity is inferred considering the latency and flow width of their interaction. The layout is represented using slicing structures and generated with a top-down algorithm capable of handling blocks with both hard and soft components, aimed at wirelength minimization. These techniques have been applied to a set of large industrial circuits and compared against both a commercial floorplanner and handcrafted floorplans by expert back-end engineers. The proposed approach outperforms the commercial tool and produces solutions of similar quality to the best handcrafted floorplans. Therefore, the generated floorplans provide an excellent starting point for the physical design iterations and contribute to reducing turn-around time significantly.
15:45 3.4.5 REALIZING REPRODUCIBLE AND REUSABLE PARALLEL FLOATING RANDOM WALK SOLVERS FOR PRACTICAL USAGE
Speaker:
Mingye Song, Tsinghua University, CN
Authors:
Mingye Song¹, Zhezhao Xu¹, Wenjian Yu¹ and Lei Yin²
¹Tsinghua University, CN; ²ANSYS Inc., US
Abstract
Capacitance extraction or simulation has become a challenging problem in the computer-aided design of integrated circuits (ICs), flat panel displays, etc. Due to its scalability and reliability, the parallel floating random walk (FRW) based capacitance solver is widely used. In practice, parallel FRW algorithms suffer from a reproducibility issue and may spend a lot of time in scenarios that require high accuracy. To relieve these issues, techniques are developed in this paper to enhance the reproducibility and reusability of parallel FRW based simulation. With them we ensure that the same result is reproduced when rerunning the parallel FRW solver with the same settings. In addition, a "jump start" feature is implemented to reduce the total runtime of simulating the same structure with multiple accuracy criteria. Experiments on shared-memory and distributed-memory platforms have validated the effectiveness of the presented techniques. Compared with a synchronization based approach to ensuring reproducibility, the proposed technique with static workload allocation brings up to 4.8X more parallel speedup while sacrificing nothing.
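One common way to make a parallel Monte Carlo run reproducible, loosely in the spirit of the static workload allocation mentioned in this abstract, is to fix the batch layout in advance and give each batch its own seeded random stream. The Python sketch below shows that idea on a toy walk-on-spheres problem in the unit square, not on a real FRW capacitance solver. The problem setup, the seed formula, and the names `BATCH_SIZE`, `walk_hits_top`, `run_batch` and `count_hits` are all assumptions of this sketch, and the abstract does not say how the paper's "jump start" feature works; the reuse step at the end is only one plausible reading.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 200  # walks per batch; fixed, so the layout never depends on worker count


def walk_hits_top(rng, x, y, eps=1e-3):
    """One walk-on-spheres walk in the unit square; 1 if it ends at the top edge, else 0."""
    while True:
        r = min(x, 1.0 - x, y, 1.0 - y)  # distance to the nearest edge
        if r < eps:
            return 1 if (1.0 - y) == r else 0
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += r * math.cos(theta)  # jump to a random point on the largest
        y += r * math.sin(theta)  # circle that fits inside the square


def run_batch(task):
    """Run one batch with a private random stream derived from (seed, batch id)."""
    base_seed, batch_id, x0, y0 = task
    rng = random.Random(base_seed * 1_000_003 + batch_id)
    return sum(walk_hits_top(rng, x0, y0) for _ in range(BATCH_SIZE))


def count_hits(batch_ids, workers, base_seed=2019, start=(0.5, 0.5)):
    """Total top-edge hits over the given batches, using a pool of worker threads."""
    tasks = [(base_seed, b, start[0], start[1]) for b in batch_ids]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(run_batch, tasks))  # map returns results in task order


# The same batches give the same count regardless of the number of workers.
hits_1 = count_hits(range(20), workers=1)
hits_4 = count_hits(range(20), workers=4)
estimate = hits_4 / (20 * BATCH_SIZE)

# Reuse: a more accurate run only needs the batches not computed yet.
hits_40 = hits_4 + count_hits(range(20, 40), workers=4)
refined = hits_40 / (40 * BATCH_SIZE)
print(hits_1 == hits_4, estimate, refined)
```

Because each batch depends only on its own seed, no synchronization between workers is needed to reproduce a run. By symmetry the exact value at the centre of the square is 0.25, which gives a simple sanity check on both estimates.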
16:01 IP1-19, 136 ACCURATE WIRELENGTH PREDICTION FOR PLACEMENT-AWARE SYNTHESIS THROUGH MACHINE LEARNING
Speaker:
Daijoon Hyun, KAIST, KR
Authors:
Daijoon Hyun, Yuepeng Fan and Youngsoo Shin, KAIST, KR
Abstract
Placement-aware synthesis, which combines logic synthesis with virtual placement and routing (P&R) to better take account of wiring, has been popular for timing closure. The wirelength after virtual placement is correlated to actual wirelength, but the correlation is not strong enough for some chosen paths. An algorithm to predict the actual wirelength from placement-aware synthesis is presented. It extracts a number of parameters from a given virtual path. A handful of synthetic parameters are compiled through linear discriminant analysis (LDA), and they are submitted to a few machine learning models. The final prediction of actual wirelength is given by the weighted sum of the predictions from these machine learning models, in which each weight is determined by the population of neighbors in parameter space. Experiments indicate that the predicted wirelength is 93% accurate compared to actual wirelength; this can be compared to conventional virtual placement, in which wirelength is predicted with only 79% accuracy.
16:02 IP1-20, 602 A MIXED-HEIGHT STANDARD CELL PLACEMENT FLOW FOR DIGITAL CIRCUIT BLOCKS
Speaker:
Ting-Chi Wang, National Tsing Hua University, TW
Authors:
Yi-Cheng Zhao¹, Yu-Chieh Lin¹, Ting-Chi Wang¹, Ting-Hsiung Wang², Yun-Ru Wu², Hsin-Chang Lin² and Shu-Yi Kao²
¹National Tsing Hua University, TW; ²Realtek Semiconductor Corp., TW
Abstract
Standard cell libraries are usually designed with different cell heights (e.g., 9-track and 12-track cell libraries in a 28nm node), while each library contains standard cells of the same height. A standard cell of larger height provides better performance but has larger area and consumes more power than one with smaller height. As a result, a smart strategy for designing a digital circuit block should try to mix cells of different heights to achieve better design quality. In this paper, we present a mixed-height standard cell placement flow for digital circuit blocks. To the best of our knowledge, commercial tools currently do not support this type of flow in a fully automated manner. In our placement flow, we leverage a commercial placement tool and integrate it with several new point tools. Promising experimental results are reported to demonstrate the efficacy of our placement flow.
16:00 End of session
16:00 Coffee Break in Exhibition Area



Coffee Breaks in the Exhibition Area

On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area.

Lunch Breaks (Lunch Area)

On all conference days (Tuesday to Thursday), a seated lunch (lunch buffet) will be offered in the "Lunch Area" to fully registered conference delegates only. There will be badge control at the entrance to the lunch break area.

Tuesday, March 26, 2019

  • Coffee Break 10:30 - 11:30
  • Lunch Break 13:00 - 14:30
  • Awards Presentation and Keynote Lecture in "TBD" 13:50 - 14:20
  • Coffee Break 16:00 - 17:00

Wednesday, March 27, 2019

  • Coffee Break 10:00 - 11:00
  • Lunch Break 12:30 - 14:30
  • Awards Presentation and Keynote Lecture in "TBD" 13:30 - 14:20
  • Coffee Break 16:00 - 17:00

Thursday, March 28, 2019

  • Coffee Break 10:00 - 11:00
  • Lunch Break 12:30 - 14:00
  • Keynote Lecture in "TBD" 13:20 - 13:50
  • Coffee Break 15:30 - 16:00