4.6 Fault modeling, test generation and diagnosis


Date: Tuesday 28 March 2017
Time: 17:00 - 18:30
Location / Room: 5A

Chair:
Stephan Eggersgluss, University of Bremen, DE

Co-Chair:
Martin Keim, Mentor, DE

This session includes a presentation about new SAT-based ATPG techniques for robust initialization of transistor stuck-open faults. Further, a diagnosis method for arbiter physical unclonable functions to identify systematic manufacturing issues is presented. The last paper analyzes failure modes of Flash memories and proposes suitable fault models.

Time, Label, Presentation Title
Authors
17:00 4.6.1 (Best Paper Award Candidate)
FAST AND WAVEFORM-ACCURATE HAZARD-AWARE SAT-BASED TSOF ATPG
Speaker:
Jan Burchard, University of Freiburg, DE
Authors:
Jan Burchard¹, Dominik Erb¹, Adit D. Singh², Sudhakar M. Reddy³ and Bernd Becker¹
¹University of Freiburg, DE; ²Auburn University, US; ³University of Iowa, US
Abstract
Opens are known to be one of the predominant defects in nanoscale technologies. With the increasing number of complex cells in today's VLSI designs, intra-gate opens in particular are becoming a major problem. Generating tests for these faults is hard, as the timing of the circuit needs to be considered accurately to prevent hazards from invalidating the generated tests. Current test generation methods, including new cell-aware tests that explicitly target open defects, ignore the possibility of hazard-caused test invalidation. Such tests can fail to detect a significant fraction of the targeted opens. In this work we present a waveform-accurate, hazard-aware test generation approach to target intra-gate opens. Our methodology is based on a SAT encoding and allows the generation of tests guaranteed to be robust against hazards. Experimental results for large benchmarks mapped to the state-of-the-art NanGate 45nm cell library, including complex cells, show the test generation efficiency of the proposed method. Large circuits were handled efficiently, even without the use of fault simulation. Our experiments show that, on average, about 10.92% of conventional hazard-unaware tests fail to detect the targeted opens because of test invalidation; these opens are reliably detected by our new test generation methodology. Importantly, our approach can also be applied to improve the effectiveness of commercial cell-aware tests.
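
As a rough illustration of the hazard problem described in this abstract (not the authors' SAT-based method), the sketch below models a 2-input NAND gate whose pMOS transistor on input `a` is stuck open. The gate model, the function names and the waveforms are assumptions made for this example. A two-pattern test first initializes the output to 0 and then applies a pattern under which only the broken transistor could pull the output up. A glitch on input `b` during the transition charges the output through the intact pMOS and hides the fault.

```python
def nand2(a, b):
    """Fault-free 2-input NAND."""
    return 0 if (a == 1 and b == 1) else 1


def nand2_open_pmos_a(a, b, prev_out):
    """NAND2 whose pMOS transistor driven by input a is stuck open (assumed model)."""
    if a == 1 and b == 1:
        return 0         # nMOS stack conducts: output is pulled down
    if b == 0:
        return 1         # intact pMOS on b conducts: output is pulled up
    return prev_out      # a=0, b=1: only the broken pMOS could pull up, output floats


def simulate(waveform):
    """Apply a sequence of (a, b) input values; return final (good, faulty) outputs."""
    good, faulty = None, None
    for a, b in waveform:
        good = nand2(a, b)
        faulty = nand2_open_pmos_a(a, b, faulty)
    return good, faulty


def detects(waveform):
    """The fault is detected if the final outputs differ."""
    good, faulty = simulate(waveform)
    return good != faulty


# Hazard-free two-pattern test: initialize with (1,1), then apply (0,1).
clean = [(1, 1), (0, 1)]
# Same test, but b glitches to 0 while the inputs switch.
glitchy = [(1, 1), (0, 0), (0, 1)]

print(detects(clean))    # True: faulty output keeps the initialized 0
print(detects(glitchy))  # False: the glitch charged the floating output to 1
```

A hazard-aware generator in the sense of the abstract would have to rule out waveforms like `glitchy`; this toy model only shows why such waveforms invalidate the test.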

17:30 4.6.2 FAULT DIAGNOSIS OF ARBITER PHYSICAL UNCLONABLE FUNCTION
Speaker:
Yu Hu, Institute of Computing Technology, Chinese Academy of Sciences, CN
Authors:
Jing Ye¹, Qingli Guo², Yu Hu¹ and Xiaowei Li¹
¹State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, CN; ²State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences, CN
Abstract
Physical Unclonable Functions (PUFs) have broad application prospects in the field of hardware security. If faults occur in a PUF during manufacturing, the security of the whole chip is threatened. Fault diagnosis plays an important role in the yield learning process. However, since different manufactured PUFs with the same design have different Challenge-Response Pairs (CRPs), which cannot be predicted, the traditional fault diagnosis method based on comparing the fault-free responses of a design with the failing responses of chips is no longer suitable for diagnosing PUFs. Therefore, this paper proposes a fault diagnosis method for the classic arbiter PUF. Stuck-at faults and delay faults are considered. Based on the expected uniformity of the arbiter PUF, a diagnostic challenge generation method and a corresponding CRP analysis method are proposed to distinguish faults within the arbiter PUF. Experimental results show that the diagnostic accuracy reaches 100.0% with good diagnostic resolution.
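
To make the notion of "expected uniformity" concrete, the following sketch simulates an arbiter PUF with the widely used additive delay model and measures the fraction of 1-responses over random challenges. It is not the diagnosis method of the paper: the Gaussian stage weights, the function names and the fault (a stuck-at value at the arbiter output) are assumptions chosen only for illustration.

```python
import random


def make_arbiter_puf(n_stages, rng):
    """One delay-difference weight per stage plus a constant arbiter bias term."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_stages + 1)]


def response(weights, challenge, stuck_at=None):
    """Response bit under the additive delay model; stuck_at forces the output."""
    if stuck_at is not None:
        return stuck_at                      # hypothetical stuck-at fault at the arbiter
    n = len(challenge)
    delta = weights[n]                       # constant bias term
    parity = 1
    for i in range(n - 1, -1, -1):
        parity *= 1 - 2 * challenge[i]       # +1 / -1 feature for stage i
        delta += weights[i] * parity
    return 1 if delta > 0 else 0


def uniformity(weights, challenges, stuck_at=None):
    """Fraction of challenges answered with 1; roughly 0.5 for a healthy PUF."""
    ones = sum(response(weights, c, stuck_at) for c in challenges)
    return ones / len(challenges)


rng = random.Random(0)
puf = make_arbiter_puf(64, rng)
challenges = [[rng.randint(0, 1) for _ in range(64)] for _ in range(2000)]

print(uniformity(puf, challenges))               # expected to be near 0.5
print(uniformity(puf, challenges, stuck_at=1))   # 1.0: uniformity is destroyed
```

A strong deviation from the expected value is what makes a fault observable even though the fault-free responses of an individual chip cannot be predicted.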

18:00 4.6.3 FPGA-BASED FAILURE MODE TESTING AND ANALYSIS FOR MLC NAND FLASH MEMORY
Speaker:
Fei Wu, Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, CN
Authors:
Meng Zhang¹, Fei Wu¹, Qian Xia¹, He Huang¹, Jian Zhou² and Changsheng Xie¹
¹Huazhong University of Science and Technology, CN; ²University of Central Florida, US
Abstract
As flash memory storage density improves, data reliability and flash lifetime decrease. Error correction codes (ECC) and error management schemes can boost both reliability and lifetime. However, developing effective fault tolerance algorithms and management solutions requires a deeper understanding of the failure modes of flash memory. To enable such understanding, we design an experimental platform and scheme to investigate flash failure modes. This paper examines various failure modes occurring in 2x-nm MLC NAND flash technologies: page allocation scheme-based program interference (PASBPI) errors (i.e., different page allocation schemes program data into flash pages in different ways, which can lead to different program interference errors), write errors of the least significant bit (LSB) and the most significant bit (MSB), and data pattern-based read interference errors (i.e., different data values programmed into flash pages can cause different read interference errors). We analyze these observed failure modes and explain why they exist. We hope that understanding these failure modes helps in proposing effective fault tolerance and error management algorithms.

18:30 IP2-10, 342 (Best Paper Award Candidate)
RETRODMR: TROUBLESHOOTING NON-DETERMINISTIC FAULTS WITH RETROSPECTIVE DMR
Speaker:
Ting Wang, The Chinese University of Hong Kong, HK
Authors:
Ting Wang¹, Yannan Liu¹, Qiang Xu¹, Zhaobo Zhang², Zhiyuan Wang² and Xinli Gu²
¹The Chinese University of Hong Kong, HK; ²Huawei Technologies, Inc., US
Abstract
The most notorious faults for diagnosis in post-silicon validation are those that manifest themselves non-deterministically under system-level functional tests, where errors appear randomly from time to time even when the same workloads are applied. In this work, we propose RetroDMR, a novel diagnostic framework that resorts to dual-modular redundancy (DMR) for troubleshooting non-deterministic faults. Specifically, we log the essential events (e.g., the sequence of thread migrations) in the faulty run to record the mapping between threads and their corresponding execution units. Then, in the following diagnosis runs, we apply the redundant multithreading (RMT) technique to reduce error detection latency, while following the thread migration sequence of the original run whenever possible. By doing so, RetroDMR significantly improves the reproduction rate and diagnosis resolution for non-deterministic faults, as demonstrated in our experimental results.

18:31 IP2-11, 710 CRITICAL PATH-ORIENTED THERMAL AWARE X-FILLING FOR HIGH UN-MODELED DEFECT COVERAGE
Speaker:
Fotios Vartziotis, Computer Engineering, T.E.I. of Epirus, GR
Authors:
Fotios Vartziotis¹ and Chrysovalantis Kavousianos²
¹TEI of Epirus, University of Ioannina, GR; ²Department of Computer Science and Engineering, University of Ioannina, GR
Abstract
The thermal activity during testing can be considerably reduced by applying power-oriented filling of the unspecified bits of test vectors. However, traditional power-oriented X-fill methods do not correlate the thermal activity with delay failures, and they consume all the unspecified bits to reduce the power dissipation at every region of the core. Therefore, they adversely affect the un-modeled defect coverage of the generated test vectors. The proposed method identifies the unspecified bits that are more critical for delay failures, and it fills them in such a way as to create a thermal safe neighborhood around the most critical regions of the core. For the rest of the unspecified bits a probabilistic model based on output deviations is adopted to increase the un-modeled defect coverage of the test vectors. Experimental results show that the thermal activity and the inter-connection delays of critical regions of the core are comparable to those of the power-oriented X-fill methods, while the un-modeled defect coverage is as high as that of the random-fill method.
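
The trade-off between power-oriented and random filling of unspecified bits can be sketched in a few lines. The example below uses adjacent fill, a common low-power X-fill heuristic, as a stand-in for "power-oriented filling"; it is not the critical-path-oriented method proposed in the paper. The test cube, the function names and the use of adjacent-bit transitions as a proxy for switching activity are assumptions made for this example.

```python
import random


def adjacent_fill(cube):
    """Fill each X with the nearest specified bit to its left (leading Xs copy the first one)."""
    specified = [ch for ch in cube if ch in "01"]
    last = specified[0] if specified else "0"
    filled = []
    for ch in cube:
        if ch in "01":
            last = ch
        filled.append(last)
    return "".join(filled)


def random_fill(cube, rng):
    """Fill each X with a random bit, keeping the specified bits unchanged."""
    return "".join(ch if ch in "01" else rng.choice("01") for ch in cube)


def transitions(vector):
    """Number of adjacent bit changes, a rough proxy for shift switching activity."""
    return sum(1 for x, y in zip(vector, vector[1:]) if x != y)


cube = "1XX0X1"                       # hypothetical test cube, X = unspecified bit
low_power = adjacent_fill(cube)
randomized = random_fill(cube, random.Random(1))

print(low_power, transitions(low_power))    # 111001 2
print(randomized, transitions(randomized))  # never fewer transitions than adjacent fill here
```

Adjacent fill minimizes transitions but always produces the same vector, whereas random fill spends the unspecified bits on variety, which is the property the abstract associates with higher un-modeled defect coverage.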

18:32 IP2-12, 814 A COMPREHENSIVE METHODOLOGY FOR STRESS PROCEDURES EVALUATION AND COMPARISON FOR BURN-IN OF AUTOMOTIVE SOC
Speaker:
Paolo Bernardi, Politecnico di Torino, IT
Authors:
Paolo Bernardi¹, Davide Appello², Giampaolo Giacopelli², Alessandro Motta², Alberto Pagani², Giorgio Pollaccia³, Christian Rabbi², Marco Restifo¹, Priit Ruberg⁴, Ernesto Sanchez¹, Claudio Maria Villa² and Federico Venini¹
¹Politecnico di Torino, IT; ²STMicroelectronics, IT; ³STMicroelectronics, IT; ⁴Tallinn University of Technology, EE
Abstract
Environmental and electrical stress phases are commonly applied to automotive devices during manufacturing test. The combination of thermal and electrical stress is used to expose early-life latent failures that naturally occur in a population of devices, by accelerating aging processes through Burn-In test phases. This paper provides a methodology to evaluate and compare the stress procedures to be run during Burn-In; the proposed method takes into account several factors such as circuit activity, chip surface temperature and current consumption required by the stress procedure, and also considers Burn-In flow and tester limitations. A specific metric called Stress Coverage is suggested, which sums up all the stress contributions. Experimental results are gathered on an automotive device, comparing scan-based and functional stress run on massively parallel test equipment; reported figures and tables quantify the differences between the two approaches in terms of stress.

18:30 End of session
Exhibition Reception in Exhibition Area
The Exhibition Reception will take place on Tuesday in the exhibition area, where free drinks for all conference delegates and exhibition visitors will be offered. All exhibitors are welcome to also provide drinks and snacks for the attendees.