DATE'98 Abstracts
Moderators: Y. Zorian, LogicVision, USA,
P. Plaza, Telefónica I+D, Spain
-
Collapsing the Transistor Chain to an Effective Single Equivalent Transistor [p 2]
- A. Chatzigeorgiou and S. Nikolaidis
The most common practice to model the transistor chain, as it appears in
CMOS gates, is to collapse it to a single equivalent transistor. This method
is analyzed and improvements are presented in this paper. Inherent shortcomings
are removed and an effective transistor width is calculated taking into account
the operating conditions of the structure, resulting in very good agreement
with SPICE simulations. The actual time point at which the chain starts conducting,
which significantly influences the accuracy of the model, is also extracted.
Finally, an algorithm to collapse every possible input pattern to a single
input is presented.
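As background, the conventional first-order collapse that the abstract refers to can be sketched as follows. Each transistor in the series chain is treated as a conductance proportional to its width, so the equivalent width is the reciprocal of the sum of the reciprocal widths. This is only the textbook approximation, assuming equal channel lengths; it is not the improved effective-width calculation proposed in the paper, and the function name is hypothetical.

```python
def series_equivalent_width(widths):
    """First-order equivalent width of series-connected transistors.

    Assumes equal channel lengths and treats each device as a conductance
    proportional to its width (a textbook simplification, not the paper's model).
    """
    if not widths or any(w <= 0 for w in widths):
        raise ValueError("widths must be a non-empty list of positive values")
    return 1.0 / sum(1.0 / w for w in widths)


# Three identical devices in series behave roughly like one device
# with a third of the width.
w_eq = series_equivalent_width([3.0, 3.0, 3.0])
```

The approximation ignores the operating conditions of the individual devices, which is the kind of shortcoming the abstract says the paper addresses.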
-
Design of Fault-Secure Parity-Prediction Booth Multipliers [p 7]
- M. Nicolaidis and R.O. Duarte
The basic drawback of parity prediction
arithmetic operators is that they may not be fault secure for
single faults. In a recent work we have proposed a theory
for achieving fault secure design for parity prediction
multipliers and dividers. That work did not consider the
case of Booth multipliers using operand recoding. This case
is analyzed here. Parity prediction logic and a fault secure
implementation for this scheme are derived.
Keywords: Self-checking circuits, Booth multipliers
-
PASTEL: A Parameterized Memory Characterization System [p 15]
- K. Ogawa, M. Kohno, and F. Kitamura
PASTEL is a parameterized memory characterization system which extracts the
characteristics of ASIC on-chip-memories such as delay, timing and power
consumption which are important in LSI logic design. PASTEL is a
fully-automated process from exact wire-RC extraction through circuit
reduction, input vector generation, waveform measurement, data-sheet and
library creation. The circuit reduction scheme can reduce the circuit
simulation time by two orders of magnitude while maintaining delay error within
100 ps of exact simulation.
Moderators: K. Buchenrieder, Siemens AG, Germany,
A. Jerraya, TIMA, Grenoble, France
-
Hardware Resource Allocation for Hardware/Software Partitioning in the LYCOS
System [p 22]
- J. Grode, P.V. Knudsen, and J. Madsen
This paper presents a novel hardware resource allocation
technique for hardware/software partitioning. It allocates
hardware resources to the hardware data-path using
information such as data-dependencies between operations
in the application, and profiling information.
The algorithm is useful as a designer's or design tool's aid
to generate good hardware allocations for use in hardware/software
partitioning. The algorithm has been implemented
in a tool under the LYCOS system [9]. The results
show that the allocations produced by the algorithm come
close to the best allocations obtained by exhaustive search.
-
Hardware Software Partitioning with Integrated Hardware Design
Space Exploration [p 28]
- V. Srinivasan, S. Radhakrishnan, and R. Vemuri
This paper presents an integrated approach to hardware software partitioning
and hardware design space exploration. We propose a genetic algorithm which
performs hardware software partitioning on a task graph while simultaneously
contemplating various design alternatives for tasks mapped to hardware. We
primarily deal with data dominated designs typically found in digital
signal processing and image processing applications. A detailed description
of various genetic operators is presented. We provide results to illustrate
the effectiveness of our integrated methodology.
-
Generation of Interconnect Topologies for Communication Synthesis [p 36]
- M. Gasteier, M. Münch, and M. Glesner
One of the key problems in hardware/software co-design is communication
synthesis which determines the amount and type of interconnect between
the hardware components of a digital system. To do so, communication
synthesis derives a communication topology to determine which components
are to be connected to a common communication channel in the final hardware
implementation. In this paper, we present a novel approach to cluster
processes to share a communication channel. An iterative graph-based
clustering algorithm is driven by a heterogeneous cost function which
takes into account bit widths, the probability of access collisions on the
channels, cost for arbitration logic as well as the availability of
interface resources on the hardware components to trade-off cost against
performance in a most optimum fashion. The key aspects of the approach
are demonstrated on a small example.
Moderators: A. Vachoux, Ecole Polytechnique Federale de Lausanne, Switzerland,
T. Kazmierski, University of Southampton, UK
-
The Design of an Asynchronous VHDL Synthesizer [p 44]
- S.-Y. Tan, S.B. Furber, and W.-F. Yen
This paper presents a straightforward approach for
synthesizing a standard VHDL description of an asynchronous
circuit from a behavioural VHDL description. The asynchronous
circuit style is based on `micropipelines', a style currently
used to develop asynchronous microprocessors at Manchester
University. The rules of partition and conversion which are used
to implement the synthesizer are also described. The
synthesizer greatly reduces the design time of a complex
micropipeline circuit.
-
Repartitioning and Technology Mapping of Electronic Hybrid Systems [p 52]
- C. Grimm and K. Waldschmidt
The systematic top-down design of mixed-signal
systems requires an abstract specification of the intended
functions. However, hybrid systems are systems whose parts are
specified using different time models. Specifications of hybrid
systems are not purely functional as they also contain structural
information. The structural information is introduced by partitioning
the specification into blocks with a homogeneous time model. This
often leads to inefficient implementations. In order to overcome
this problem, a homogeneous representation for behavior of hybrid
systems -- KIR -- is introduced. This representation makes it possible
to represent behavior in all time models in a common
way so that the separation in different modeling styles
is no longer necessary. Rules for re-writing the KIR-graph are
given which permit the description of the
same behaviour in another time model.
-
VHDL-AMS: The Missing Link in System Design -- Experiments with Unified
Modelling in Automotive Engineering [p 59]
- E. Moser and N. Mittwollen
After the IEEE ballot accepted the first draft language
reference manual for VHDL-AMS (IEEE PAR 1076.1) in
October 1997, we now can spend time and effort on applying
the new arising methodology to real world problems
outside the electronic domain. In automotive engineering
we have system design problems dealing with hydraulic
or mechanical components and their controlling units, for
which we expect a major advantage by introducing unified
modelling to all domains.
With the Brite/EuRam project TOOLSYS (a joint effort
of automotive industry and tool makers to apply
VHDL-AMS as unified modelling language on mixed-domain
applications) we prove the suitability as unified
modelling and interchange language for real-world systems
and components.
First experiments with hydraulic components reveal numerical
problems on analog circuit simulators. None of
the available strategies for these particularly hard problems
are included by the electronic simulator makers. With
VHDL-AMS, multi-domain modelling seems possible; now
we need multi-domain simulation environments.
Moderators: H.-J. Wunderlich, University of Stuttgart, Germany,
M. Nicolaidis, TIMA, Grenoble, France
-
Scheduling and Module Assignment for Reducing BIST Resources [p 66]
- I. Parulkar, S.K. Gupta, and M.A. Breuer
Built-in self-test (BIST) techniques modify functional
hardware to give a data path the capability
to test itself. The modification of data path registers
into registers (BIST resources) that can generate
pseudo-random test patterns and/or compress test responses,
incurs an area overhead penalty. We show how scheduling and
module assignment in high-level synthesis affect BIST resource
requirements of a data path. A scheduling and module assignment
procedure is presented that produces schedules which, when used to
synthesize data paths, result in a significant reduction
in BIST area overhead and hence total area.
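The two roles of a BIST register mentioned above, pattern generation and response compression, can be illustrated with a small linear feedback shift register (LFSR). The sketch below is a generic 4-bit example and does not come from the paper; the feedback taps, the MISR-style compression and the toy incrementer used as circuit under test are all assumptions chosen for illustration.

```python
def lfsr_step(state):
    """Advance a 4-bit Fibonacci LFSR by one clock (feedback from bits 3 and 2)."""
    feedback = ((state >> 3) ^ (state >> 2)) & 1
    return ((state << 1) | feedback) & 0xF


def generate_patterns(seed, count):
    """Return `count` pseudo-random 4-bit test patterns starting from `seed`."""
    patterns, state = [], seed
    for _ in range(count):
        patterns.append(state)
        state = lfsr_step(state)
    return patterns


def compress_responses(responses, seed=0):
    """Fold a stream of 4-bit responses into one signature (MISR-style)."""
    signature = seed
    for response in responses:
        signature = lfsr_step(signature) ^ (response & 0xF)
    return signature


patterns = generate_patterns(seed=0b0001, count=15)

# Toy circuit under test: a 4-bit incrementer (hypothetical stand-in).
good_responses = [(p + 1) & 0xF for p in patterns]
good = compress_responses(good_responses)

# The same response stream with one corrupted word yields another signature.
faulty_responses = list(good_responses)
faulty_responses[7] ^= 0b0100
bad = compress_responses(faulty_responses)
```

With these taps the generator cycles through all 15 non-zero states before repeating, and a single corrupted response word always changes the final signature because the state update is invertible.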
-
An Efficient Algorithm to Integrate Scheduling and Allocation in High-Level
Test Synthesis [p 74]
- T. Yang and Z. Peng
This paper presents a high-level test synthesis algorithm
for operation scheduling and data path allocation. Contrary
to other works in which scheduling and
allocation are performed independently, our approach
integrates these two tasks by performing them simultaneously
so that the effects of scheduling and allocation
on testability are exploited more effectively. The approach
is based on an algorithm which applies a sequence of
semantics-preserving transformations to a design to generate an
efficient RT level implementation from a VHDL behavioral
specification. Experimental results show the advantages of the
proposed algorithm.
-
RAM-Based FPGA's: A Test Approach for the Configurable Logic [p 82]
- M. Renovell, J.M. Portal, J. Figueras, and Y. Zorian
This paper proposes a methodology for testing
the configurable logic of RAM-based FPGAs taking into
account the configurability of such flexible devices. The
methodology is illustrated using the XILINX 4000 family.
On this example of FPGA, we obtain only 8 basic Test
Configurations to fully test the whole matrix of CLBs. In
the proposed Test Configurations, all the CLBs have exactly
the same configuration forming a set of one-dimensional
iterative arrays. The iterative arrays present a C-testability
property in such a way that the number of Test
Configurations (8) is fixed and independent of the FPGA size.
-
Novel Technique for Testing FPGAs [p 89]
- C. Metra, G. Mojoli, S. Pastore, D. Salvi, and G. Sechi
This paper presents a novel technique for testing Field
Programmable Gate Arrays (FPGAs), suitable to be used
in case of frequent FPGA reuse and rapid dynamic modifiability
of the implemented function.
Moderators: Y. Torroja, Polytechnical University of Madrid, Spain,
R. Sarmiento, University of Las Palmas de Gran Canaria, Spain
-
ATM Traffic Shaper: ATS [p 96]
- J.C. Diaz, P. Plaza, and J. Crespo
The design and implementation of an ATM Traffic
Shaper (ATS) are described here. This IC was realised in a
0.35 µm CMOS technology. The main function of the ATS is
the collection of low bit rate traffic streams to fill a higher bit
rate pipe in order to reduce the cost of ATM based
services, nowadays mainly influenced by transmission
cost. The circuit fits in several ATM system configurations
but mainly will be used at the User-Network Interfaces or
Network-Network interfaces. The IC was designed with a
top-down methodology using Verilog as the HDL.
The chip is pad limited and is encapsulated in a 208-pin
PQFP package. The circuit complexity is 38 Kgates and
its working frequency is 32 MHz. A circuit prototype was
built with FPGAs in order to validate the RTL
description.
-
XFVHDL: A Tool for the Synthesis of Fuzzy Logic Controllers [p 102]
- E. Lago, C.J. Jiménez, D.R. López,
S. Sánchez-Solano, and A. Barriga
A tool for the synthesis of fuzzy controllers is presented
in this paper. This tool takes as input the behavioral
specification of a controller and generates its VHDL
description according to a target architecture. The VHDL
code can be synthesized by means of two implementation
methodologies, ASIC and FPGA. The main advantages of
using this approach are rapid prototyping, and the use of
well-known commercial design environments like Synopsys,
Mentor Graphics, or Cadence.
-
High Speed Neural Network Chip for Trigger Purposes in High Energy Physics
[p 108]
- W. Eppler, T. Fischer, H. Gemmeke, and A. Menchikov
A novel neural chip SAND (Simple Applicable Neural
Device) is described. It is highly usable for hardware
triggers in particle physics. The chip is optimized for a
high input data rate (50 MHz, 16 bit data) at a very low
cost basis. The performance of a single SAND chip is 200
MOPS due to four parallel 16 bit multipliers and 40 bit
adders working in one clock cycle. The chip is able to
implement feedforward neural networks with a maximum of
512 input neurons and three hidden layers. Kohonen
feature maps and radial basis function networks may
also be calculated. Four chips will be implemented on a PCI-board
for simulation and on a VME board for trigger and
on- and off-line analysis.
Moderators: S.A. Huss, Darmstadt University of Technology, Germany,
H.-P. Amann, University of Neuchatel, Switzerland
-
CASPER: Concurrent Hardware-Software Co-Synthesis of Hard Real-Time
Aperiodic and Periodic Specifications of Embedded System Architectures [p 118]
- B.P. Dave and N.K. Jha
Hardware-software co-synthesis of an embedded system
requires mapping of its specifications into hardware and
software modules such that its real-time and other constraints
are met. Embedded system specifications are generally
represented by acyclic task graphs. Many embedded system
applications are characterized by aperiodic as well as periodic
task graphs. Aperiodic task graphs can arrive for execution at
any time and their resource requirements vary depending on
how their constituent tasks and edges are allocated. Traditional
approaches based on a fixed architecture coupled with slack
stealing and/or on-line determination of how to serve aperiodic
task graphs are not suitable for embedded systems with hard
real-time constraints, since they cannot guarantee that such
constraints would always be met. In this paper, we address the
problem of concurrent co-synthesis of aperiodic and periodic
specifications of embedded systems. We estimate the resource
requirements of aperiodic task graphs and allocate execution
slots on processing elements and communication links for
executing them. Our approach guarantees that the deadlines of
both aperiodic and periodic task graphs are always met. We
have observed that simultaneous consideration of aperiodic task
graphs while performing co-synthesis of periodic task graphs is
vital for achieving superior results compared to the traditional
slack stealing and dynamic scheduling approaches. To the best
of our knowledge, this is the first co-synthesis algorithm which
provides simultaneous support of periodic and aperiodic task
graphs with hard real-time constraints. Application of the
proposed algorithm to several examples from real-life telecom
transport systems shows that up to 28% and 34% system cost
savings are possible over co-synthesis algorithms which employ
slack stealing and rate-monotonic scheduling, respectively.
-
Stream Communication Between Real-Time Tasks in a High-Performance
Multiprocessor [p 125]
- J.A.J. Leijten, J.L. van Meerbergen, A.H. Timmer, and J.A.G. Jess
The demands in terms of processing performance,
communication bandwidth and real-time throughput of
many multimedia applications are much higher than
today's processing architectures can deliver. The PROPHID
heterogeneous multiprocessor architecture template aims
to bridge this gap. The template contains a general purpose
processor connected to a central bus, as well as several
high-performance application domain specific processors.
A high-throughput communication network is used to meet
the high bandwidth requirements between these processors.
In this network multiple time-division-multiplexed data
streams are transferred over several parallel physical
channels. This paper presents a method for guaranteeing
the throughput for hard-real-time streams in such a
network. At compile time sufficient bandwidth is assigned
to these streams. The assignment can be determined in
polynomial time. Remaining bandwidth is assigned to
soft-real-time streams at run time. We thus achieve efficient
stream communication with guaranteed performance.
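The split between compile-time and run-time assignment can be pictured with a toy slot table. The sketch below is not the PROPHID method: it assumes that every channel carries a frame of equally sized time slots, that a stream may use slots on any channel, and that each hard-real-time stream states its need as a fraction of one channel's bandwidth. All names are hypothetical.

```python
import math


def assign_slots(hard_streams, channels, slots_per_frame):
    """Reserve TDM slots for hard-real-time streams; return (table, free_slots).

    hard_streams maps a stream name to the fraction of one channel's
    bandwidth it needs. Slots are identified as (channel, slot_index).
    """
    free = [(c, s) for c in range(channels) for s in range(slots_per_frame)]
    table = {}
    for name, fraction in sorted(hard_streams.items()):
        needed = math.ceil(fraction * slots_per_frame)
        if needed > len(free):
            raise ValueError(f"not enough bandwidth for stream {name!r}")
        table[name], free = free[:needed], free[needed:]
    return table, free


# Two channels with eight slots each; the leftover slots would be handed
# to soft-real-time streams at run time.
table, leftover = assign_slots({"video": 0.5, "audio": 0.125},
                               channels=2, slots_per_frame=8)
```

The reservation runs in time linear in the number of slots, which is consistent with, but much simpler than, the polynomial-time assignment the abstract describes.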
-
Scheduling of Conditional Process Graphs for the Synthesis of
Embedded Systems [p 132]
- P. Eles, K. Kuchcinski, Z. Peng, A. Doboli, and P. Pop
We present an approach to process scheduling based on an abstract graph
representation which captures both dataflow and the flow of control.
Target architectures consist of several processors, ASICs and shared
buses. We have developed a heuristic which generates a schedule table so
that the worst delay is minimized. Several experiments demonstrate the
efficiency of the approach.
Moderators: E. Villar, University of Cantabria, Spain,
D. Sciuto, Politecnico di Milano, Italy
-
Model Abstraction for Formal Verification [p 140]
- Y.-W. Hsieh and S.P. Levitan
As the complexity of circuit designs grows, designers look
toward formal verification to achieve better test coverage
for validating complex designs. However, this approach is
inherently computationally intensive,
and hence, only small designs can be verified using this
method. To achieve better performance, model abstraction is
necessary. Model abstraction reduces the number of states
necessary to perform formal verification while maintaining the
functionality of the original model with respect to the
specifications to be verified. As a result, model abstraction
enables large designs to be formally verified. In this paper,
we describe three methods for model abstraction based on semantics
extraction from user models to improve the performance
of formal verification tools.
-
VHDL Modelling and Analysis of Fault Secure Systems [p 148]
- J. Coppens, D. Al-Khalili, and C. Rozon
This paper presents an analysis process targeted for the
verification of fault secure systems during their design
phase. This process deals with a realistic set of microdefects
at the device level which are mapped into mutant
and saboteur based VHDL fault models in the form of logical
and/or performance degradation faults. Automatic
defect injection and simulation are performed through a
VHDL test bench. Extensive post processing analysis is
performed to determine defect coverage, figure of merit for
fault secureness, and MTTF.
-
Register Transfer Level VHDL Models without Clocks [p 153]
- M. Mutz
Several hardware compilers on the market convert
from so-called RT level VHDL subsets to logic level
descriptions. Such models still need clock signals and
the notion of physical time in order to be executable.
In a stage of a top-down design starting from the algorithmic
level, register transfers are considered, where
the timing is not controlled by clock signals and where
physical time is not yet relevant. We propose an executable
VHDL subset for such register transfer models.
-
Parallel VHDL Simulation [p 159]
- E. Naroska
In this paper we evaluate parallel VHDL simulation
based on conservative parallel discrete event simulation
(conservative PDES) algorithms. We focus on a conservative
simulation algorithm based on critical and external
distances. This algorithm exploits the interconnection
structure within the simulation model to increase parallelism.
Further, a general method is introduced to automatically
transform a VHDL model into a PDES model.
Additionally, we suggest a method to further optimize parallel
simulation performance. Finally, our first simulation
results on an IBM parallel computer are presented. While
these results are not sufficient for a general evaluation they
show that a good speedup can be obtained.
Moderators: E. Aas, Norwegian University of Science and Technology, Norway,
Z. Peng, Linköping University, Sweden
-
Testing DSP Cores Based on Self-Test Programs [p 166]
- W. Zhao and C. Papachristou
This paper presents a new method for the testing of the datapath of
DSP cores based on self-test programs. During the test, random patterns
are loaded into the core, exercise different components of the core, and
then are loaded out of the core for observation under the control of the
self-test programs. We propose a systematic approach to generate the
self-test program based on two metrics. One is the structured coverage and
the other is the testability metric. Experimental results show the self-test
program obtained by this approach can reach very high fault coverage in
programmable core testing.
-
Self-Adjusting Output Data Compression: An Efficient BIST Technique for RAMs
[p 173]
- V.N. Yarmolik, S. Hellebrand, and H.-J. Wunderlich
After write operations, BIST schemes for RAMs relying
on signature analysis must compress the entire memory
contents to update the reference signature. This paper
introduces a new scheme for output data compression
which avoids this overhead while retaining the benefits of
signature analysis. The proposed technique is based on a new memory
characteristic derived as the modulo-2 sum of all addresses pointing
to non-zero cells. This characteristic can be adjusted concurrently
with write operations by simple EXOR-operations on the initial
characteristic and on the addresses affected by the change.
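The characteristic described above lends itself to a short illustration. The sketch below assumes a bit-oriented memory (one bit per address) and uses hypothetical function names; it shows that updating the characteristic with one EXOR per write gives the same value as recomputing it over the whole memory.

```python
from functools import reduce


def full_characteristic(memory):
    """XOR of the addresses of all non-zero cells (recomputed from scratch)."""
    return reduce(lambda acc, addr: acc ^ addr,
                  (addr for addr, bit in enumerate(memory) if bit), 0)


def write_bit(memory, characteristic, addr, value):
    """Write a bit and return the characteristic adjusted in constant time."""
    if memory[addr] != value:      # cell toggles between zero and non-zero
        characteristic ^= addr
    memory[addr] = value
    return characteristic


memory = [0] * 16
c = full_characteristic(memory)    # empty memory: characteristic is 0
for addr, value in [(3, 1), (5, 1), (3, 0), (9, 1), (5, 1)]:
    c = write_bit(memory, c, addr, value)
# Cells 5 and 9 are now non-zero, so c equals 5 ^ 9.
```

Because EXOR is its own inverse, clearing a cell removes its address from the characteristic in the same way that setting it added the address.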
-
Built-In Self-Test with an Alternating Output [p 180]
- T. Bogue, M. Gössel, H. Jürgensen, and Y. Zorian
In this paper, a new compaction technique based on signature
analysis is presented. Rather than comparing the final
signature with the expected one after the test is completed,
the binary output of the MISA is converted into an alternating
binary signal by two simple cover circuits. An error
is indicated whenever the alternation of the output signal is
disturbed. This technique results in a higher fault coverage,
improved fault diagnosis capability, a greater test autonomy
in core-based designs, and early fault notification.
Moderators: I. Bolsens, IMEC, Belgium,
A. Nunez, University of Las Palmas de Gran Canaria, Spain
-
From Algorithms to Hardware Architectures: A Comparison of Regular
and Irregular Structured IDCT Algorithms [p 186]
- C. Schneider, M. Kayss, T. Hollstein, and J. Deicke
The inverse discrete cosine transformation (IDCT) is used
in a variety of decoders (e.g. MPEG). On one hand, highly
optimized algorithms that are characterized by an irregular
structure and a minimum number of operations are
known from software implementations. On the other hand,
regular structured architectures are often used in hardware
realizations. In this paper a comparison of regular
and irregular structured IDCT algorithms for efficient
hardware realization is presented. The irregular structured
algorithms are discussed with main emphasis on assessment
criteria for algorithm selection and high-level synthesis
for hardware cost estimation.
-
Smart Pixel Implementation of a 2-D Parallel Nucleic Wavelet
Transform for Mobile Multimedia Communications [p 191]
- A.M. Rassau, K. Eshraghian, H. Cheung, S.W. Lachowicz, T.C.B. Yu,
W.A. Crossland, and T.D. Wilkinson
A novel Smart Pixel Opto-VLSI architecture to
implement a complete 2-D wavelet transform of real-time
captured images is presented. The Smart Pixel
architecture enables the realisation of a highly parallel,
compact, low power device capable of real-time capture,
compression, decompression and display of images
suitable for Mobile Multimedia Communication
applications.
-
VLSI Architecture for Lossless Compression of Medical Images Using
the Discrete Wavelet Transform [p 196]
- I. Urriza, J.I. Artigas, J.I. García, L.A. Barragán,
and D. Navarro
This paper presents a VLSI Architecture to implement the
forward and inverse 2-D Discrete Wavelet Transform
(FDWT/IDWT), to compress medical images for storage and
retrieval. Lossless compression is usually required in the
medical image field. The word length required for lossless
compression makes the area cost of the architectures that
appear in the literature too expensive. Thus, there is a clear
need for designing an architecture to implement the lossless
compression of medical images using DWT.
The datapath word-length has been selected to ensure the
lossless accuracy criteria, leading to a high speed implementation
with small chip area. The result is a pipelined architecture that
supports single chip implementation in VLSI technology. The
architecture has been simulated in VHDL and has a hardware
utilization efficiency greater than 99%. It can compute the
FDWT/IDWT at a rate of 3.5 512×512 12-bit images/s
corresponding to a clock speed of 33MHz.
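Taking the quoted figures at face value, and assuming the image size is 512 by 512 pixels, the clock budget per pixel can be worked out as below. How those cycles are spent is not stated in the abstract, so this is only a plausibility check.

```python
images_per_second = 3.5
pixels_per_image = 512 * 512      # assumed reading of the image size
clock_hz = 33e6

pixels_per_second = images_per_second * pixels_per_image
cycles_per_pixel = clock_hz / pixels_per_second
```

This gives 917,504 pixels per second, or roughly 36 clock cycles per pixel at 33 MHz.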
Moderators: R. Ernst, Technical University of Braunschweig, Germany,
P. van der Wolf, Philips Research Laboratories, The Netherlands
-
A Model for System-Level Timed Analysis and Profiling [p 204]
- A. Allara, W. Fornaciari, F. Salice, and D. Sciuto
Fast evaluation of functional and timing properties is becoming a key
factor to enable cost-effective exploration of mixed hw/sw design
alternatives for embedded applications. The goal of this paper is to
present a modeling strategy to specify functionality and timing properties
of uncommitted mixed hw/sw systems. In addition, the paper proposes a
simulation algorithm able to perform fast high-level simulation of the
system by taking into account the initial hw vs sw allocation of system
modules. The related CAD simulation environment allows the designer to
access profiling information which can be useful to remodel the system
to meet the functional/timing goals as well as to drive the following
hw vs sw partitioning activity. Experimental data obtained by reengineering
an industrial design are also included in the paper.
-
Efficient Compilation of Process-Based Concurrent Programs without
Run-Time Scheduling [p 211]
- B. Lin
Currently, run-time operating systems are widely used
to implement concurrent embedded applications. This run-time
approach to multi-tasking and inter-process communication
can introduce significant overhead to execution
times and memory requirements -- prohibitive in many
cases for embedded applications where processor and
memory resources are scarce. In this paper, we present a
static compilation approach that generates ordinary C programs
at compile-time that can be readily retargeted to different
processors, without including or generating a run-time
scheduler. Our method is based on a novel Petri net
theoretic approach.
-
A Macroscopic Time and Cost Estimation Model Allowing Task Parallelism
and Hardware Sharing for the Codesign Partitioning Process [p 218]
- J.A. Maestro, D. Mozos, and H. Mecha
This paper describes a method to estimate the
implementation cost of the hardware part in a mixed
hardware/software system, as well as the related
performance. These estimations try to avoid the use of
many implementation details in order to keep the
complexity order of the process under control. The
concepts of hardware sharing and parallelism are
exploited to make a picture of the whole hardware cost
associated to a given partition.
-
A Scalable Methodology for Cost Estimation in a Transformational
High-Level Design Space Exploration Environment [p 226]
- J. Gerlach and W. Rosenstiel
The objective of the methodology presented in this paper is to
perform design space exploration on a high level of
abstraction by applying high-level transformations. To
realize a design loop which is close and settled on upper
design levels, a high-level estimation step is integrated. In
this paper, several estimation methodologies fixed on different
states of the high-level synthesis process are examined
with respect to their aptitude on controlling the
transformational design space exploration process. Estimation
heuristics for several design characteristics are
derived and experimentally validated.
Moderators: S. Maginot, LEDA, France, W. Ecker, Siemens AG, Germany
-
Object-Oriented Modelling of Parallel Hardware Systems [p 234]
- G. Schumacher and W. Nebel
Object-oriented techniques like inheritance promise
great benefits for the specification and design of parallel
hardware systems. The difficulties which arise from the use
of inheritance in parallel hardware systems are analysed
in this article. Similar difficulties are well known in concurrent
object-oriented programming as the inheritance
anomaly but have not yet been investigated in object-oriented
hardware design. A solution for dealing successfully with
the anomaly is presented for a type based object-oriented
extension to VHDL. Its basic idea is to separate the synchronisation
code (protocol specification) and the actual
behaviour of a method. Method guards which allow a
method to execute if a guard expression evaluates to true
are proposed to model synchronisation constraints. It is
shown how to implement a suitable re-schedule mechanism
for methods as part of the synchronisation code to
handle the case that a guard expression is evaluated to
false.
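The idea of separating synchronisation from behaviour can be sketched outside VHDL. The Python model below is only an illustration with hypothetical names, not the syntax or semantics of the proposed VHDL extension: each method has a guard, a call whose guard is false is deferred, and deferred calls are re-scheduled whenever another method has changed the object's state.

```python
from collections import deque


class GuardedFifo:
    """FIFO whose methods run only when their guard expression is true."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.pending = deque()   # calls whose guard was false
        self.results = []        # values delivered by completed get() calls

    # Behaviour (what a method does) is kept apart from
    # synchronisation (when the method may run).
    def _do_put(self, value):
        self.items.append(value)

    def _do_get(self):
        self.results.append(self.items.popleft())

    def _guard(self, name):
        if name == "put":
            return len(self.items) < self.capacity
        return len(self.items) > 0

    def call(self, name, *args):
        """Request a method; defer it if its guard is currently false."""
        self.pending.append((name, args))
        self._reschedule()

    def _reschedule(self):
        """Retry deferred calls until none of them can make progress."""
        progressed = True
        while progressed:
            progressed = False
            for _ in range(len(self.pending)):
                name, args = self.pending.popleft()
                if self._guard(name):
                    getattr(self, "_do_" + name)(*args)
                    progressed = True
                else:
                    self.pending.append((name, args))


fifo = GuardedFifo(capacity=1)
fifo.call("get")        # guard false: the FIFO is empty, so the call waits
fifo.call("put", 7)     # runs at once and unblocks the waiting get
```

Because the guards live in one place, a derived class could change the synchronisation constraints by overriding `_guard` alone, without rewriting the method bodies.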
-
A Flexible Message Passing Mechanism for Objective VHDL [p 242]
- W. Putzke-Röming, M. Radetzki, and W. Nebel
When defining an object-oriented extension to VHDL, the
necessary message passing is one of the most complex issues
and has a large impact on the whole language. This
paper identifies the requirements for message passing suited
to model hardware and classifies different approaches.
To allow abstract communication and reuse of protocols on
system level, a new, flexible message passing mechanism
proposed for Objective VHDL will be introduced.
-
Enhanced Reuse and Teamwork Capabilities for an Object-Oriented Extension of
VHDL [p 250]
- M. Mrva
This paper presents a proposal for enabling VHDL to
better support reuse and collaboration. The basic idea is passing
on the adequate information to partners working in an
object-oriented hardware design environment. Appropriate
subgoals for achieving this are:
- an optimal mix of necessary abstraction and
sufficient precision,
- a formal description consisting of implementation
constraints and knowledge requirements, and
- the non-formal concept of mutual consideration.
Several loans are made from
- the software domain: Java interfaces, type models,
and the request for habitability,
- the VHDL Annotation Language.
This is not an experience report, for the idea of
adopting the mentioned software concepts to hardware
design is new. It is rather a guided tour to some 'panorama
views'. Although they may not seem related to each other
at first glance, they turn out to altogether support a
common goal: understanding and communicating VHDL-based
designs better.
-
Formal Specification in VHDL for Hardware Verification [p 257]
- R. Reetz, K. Schneider, and T. Kropf
In this paper, we enrich VHDL with new specification constructs intended
for hardware verification. Using our extensions, total correctness properties
may now be stated whereas only partial correctness can be expressed using
the standard VHDL assert statement. All relevant properties can now be
specified in such a way that the designer does not need to use formalisms
like temporal logics. As the specifications are independent from a certain
formalism, there is no restriction to a certain hardware verification approach.
Moderators: T. Vierhaus, Technical University of Cottbus, Germany,
R. Segers, Philips Semiconductors, The Netherlands
-
A Low-Redundancy Approach to Semi-Concurrent Error Detection in Data Paths [p 266]
- A. Antola, V. Piuri, and M. Sami
A high-level synthesis approach is proposed for design of
semi-concurrently self-checking devices; attention is
focussed on data path design. After identifying the
reference architecture against which cost and
performances should be evaluated, a simultaneous
scheduling-and-allocation algorithm is presented,
allowing resource sharing between nominal and checking
data paths. The algorithm grants that the required
checking periodicity is satisfied while minimizing
additional costs in terms of functional units. Risk of error
aliasing due to resource sharing is analysed.
-
Measuring the Effectiveness of Various Design Validation Approaches
for PowerPC™ Microprocessor Arrays [p 273]
- L.-C. Wang, M.S. Abadir, and J. Zeng
Design validation for embedded arrays remains a
challenging problem in today's microprocessor design environment.
At Somerset, validation of array designs relies on both formal
verification and vector simulation. Although several methods for
array design validation have
been proposed and had great success [6], [9], [10], [13],
little evidence has been reported for the effectiveness of
these methods with respect to the detection of design errors. In
this paper, we propose a new way of measuring
the effectiveness of different validation approaches based
on automatic design error injection and simulation. This
technique provides a systematic way for the evaluation of
the quality of various validation approaches. Experimental
results using different validation approaches on recent
PowerPC microprocessor arrays will be reported.
-
Functional Scan Chain Testing [p 278]
- D. Chang, M.T.-C. Lee, K.-T. Cheng, and M. Marek-Sadowska
Functional scan chains are scan chains that have scan
paths through a circuit's functional logic and flip-flops.
Establishing functional scan paths by test point insertion
(TPI) has been shown to be an effective technique to reduce
the scan overhead. However once the scan chain is allowed
to go through functional logic, the traditional alternating
test sequence is no longer enough to ensure the correctness
of the scan chain. We identify the faults that affect the
functional scan chain, and show a methodology to find
tests for these faults. In our results, undetected faults
amount to only 0.006% of the total number of
faults, or 0.022% of the faults affecting the scan chain.
Co-ordinators: Carlo Guardiani, SGS-Thomson, Italy
Wolfgang Nebel, Oldenburg University and OFFIS, Germany
Moderator: Alberto Sangiovanni-Vincentelli, University of California at Berkeley, USA
Speakers: Grant Martin, Cadence, USA
Mike Muller, ARM, UK
Bart De Loore, Philips Semiconductors, The Netherlands
Panelists: Doug Fairbairn, VSI Alliance, USA
Pietro Erratico, SGS-Thomson, Italy
Faysal Soheil, Synopsys, USA
-
Design Methodologies for System Level IP [p 286]
- G. Martin
System-chip design which starts at the RTL-level today
has hit a plateau of productivity and re-use which can be
characterised as a 'Silicon Ceiling'. Breaking through this
plateau and moving to higher and more effective re-use of
IP blocks and system-chip architectures demands a move to
a new methodology: one in which the best aspects of
today's RTL based methods are retained, but complemented
by new levels of abstraction and the commensurate tools to
allow designers to exploit the productivity inherent in these
higher levels of abstraction. In addition, the need to
quickly develop design derivatives, and to differentiate
products based on standards, requires an increasing use of
software IP. This paper will describe today's situation, the
requirements to move beyond it, and sketch the outlines of
near-term possible and practical solutions.
-
IP-Based System-on-a-Chip Design [p 290]
- B. De Loore
In the era of IP reuse, what is going to make the
difference between different system-on-a-chip
providers? To answer this question, it suffices to depict the
competencies required to be a successful silicon
provider. We distinguish technical, organizational and
To be successful, a system-on-a-chip provider will have
to be excellent in:
Selecting the right product
Implementing the product with the right mix of
design time, cost, dissipation
Delivery performance and customer support
Moderators: J. Heaton, ICL, UK, R. Seepold, FZI Karlsruhe, Germany
-
A Systematic Analysis of Reuse Strategies for Design of
Electronic Circuits [p 292]
- M. Koegst, P. Conradi, D. Garte, and M. Wahl
In this paper a number of reuse approaches for circuit
design are analysed. Based on this analysis an algebraic
core model for discussion of a general reuse strategy is
proposed. Using this model, the aim is to classify different
reuse approaches for circuit design, to compare the
applied terms and definitions, and to formulate classes of
typical reuse tasks. In a practical application with focus
on retrieval and parameterisation techniques, this model
is on the way to being applied to DSP design issues.
-
VHDL Teamwork, Organization Units and Workspace Management [p 297]
- S. Olcoz, L. Ayuda, I. Izaguirre, and O. Penalba
A new set of tools for Teamwork, Organization Units,
Workspace and Build management of VHDL-based
reusable components, organized in libraries, accessible
through a heterogeneous and distributed environment is
presented. These tools support the collaborative and
distributed development of systems-on-a-chip reusing
VHDL components available through the intranets and the
Internet. They must be used as a complementary support to
the design tools (simulation, synthesis, etc.) already
available in the market, in order to enhance productivity, facilitate
maintenance, improve reliability, efficiency and
interoperability, and finally capitalize on the investment in
IP library components.
-
An Object-Oriented Model for Specification, Prototyping, Implementation
and Reuse [p 303]
- J. Böttger, K. Agsteiner, D. Monjau, and S. Schulze
This paper presents a hierarchical, object-oriented model
as a basis for reuse of components in the design process of
digital systems.
The model forms a uniform knowledge base which consists
of formal descriptions about functional, qualitative,
and quantitative properties of systems and components. It
supports the synthesis of systems from the described components.
Starting at a system specification, different models
and descriptions are generated for simulation, prototyping,
analysis and high level synthesis.
Moderators: E. Barke, University of Hannover, Germany,
I. Rugen-Herzig, Temic Telefunken Microelectronic GmbH, Germany
-
A Flat, Timing-Driven Design System for a High-Performance CMOS
Processor Chipset [p 312]
- J. Koehl, U. Baur, T. Ludwig, B. Kick, and T. Pflueger
We describe the methodology used for the design of the
CMOS processor chipset used in the IBM S/390 Parallel
Enterprise Server - Generation 3. The majority of the logic is
implemented by standard cell elements placed and routed
flat, using timing-driven techniques. The result is a globally
optimized solution without artificial floorplan boundaries.
We will show that the density in terms of transistors per mm²
is comparable to the most advanced custom designs and that
the impact of interconnect delay on the cycle time is very
small. Compared to custom design, this approach offers
excellent turn-around-time and considerably reduces overall
effort.
-
Algorithms for Detailed Placement of Standard Cells [p 321]
- J. Vygen
The state-of-the-art methods for the placement of
large-scale standard cell designs work in a top-down
fashion. After some iterations, where more and more
detailed placement information is obtained, a final
procedure for finding a legal placement is needed. This
paper presents a new method for this final task, based
on efficient algorithms from combinatorial optimization.
-
Timing Analysis and Optimization of a High-Performance CMOS
Processor Chipset [p 325]
- U. Fassnacht and J. Schietke
We describe the timing analysis and optimization
methodology used for the chipset inside the IBM S/390
Parallel Enterprise Server - Generation 3. After an
introduction to the concepts of static timing analysis, we
describe the timing-modeling for the gates and
interconnects, explain the optimization schemes and
present obtained results.
-
A Sequential Detailed Router for Huge Grid Graphs [p 332]
- A. Hetzel
Sequential routing algorithms using maze-running
are very suitable for general Over-the-Cell-Routing
but often suffer from the high memory or runtime requirements
of the underlying path search routine. A
new algorithm for this subproblem is presented that
computes shortest paths in a rectangular grid with respect
to Euclidean distance. It achieves performance
and memory requirements similar to fast line-search algorithms
while still being optimal. An additional application for the
computation of minimal rip-up sets will be presented. Computational
results are shown for a detailed router based on these algorithms
that is used for the design of high performance CMOS processors
at IBM.
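
For readers unfamiliar with maze-running, the following is a minimal sketch of the classic breadth-first (Lee-style) path search on a rectangular grid. It is not the algorithm of this paper; the function name `lee_route` and the grid encoding ('#' for a blocked cell) are assumptions made for illustration. The `prev` table, which holds one entry per visited cell, shows where the memory cost mentioned in the abstract comes from.

```python
from collections import deque

def lee_route(grid, src, dst):
    """Breadth-first maze search on a grid given as a list of strings.

    '#' marks a blocked cell. Returns the list of cells on a shortest
    unit-step path from src to dst, or None if dst is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    prev = {src: None}          # one entry per visited cell
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:
            # Walk the predecessor links back to the source.
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != '#' and (nr, nc) not in prev:
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None
```

A sequential router would call such a search once per connection and mark the cells of each returned path as blocked before routing the next one.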
Co-ordinator: Ivo Bolsens, IMEC, Belgium
Moderator: Nadir Bagherzadeh, University of California at Irvine, USA
Speakers: W. Shields Neely, National Semiconductor, USA
Jan Rabaey, University of California at Berkeley, USA
Ian Page, University of Oxford, UK
-
Reconfigurable Logic for Systems on a Chip [p 340]
- W. Shields Neely
The electronic systems of the future will be
implemented in terms of multi-million gate 'systems
on a chip'. These systems will require an enormous
investment in design and manufacturing; yet the pace
of technological change (e.g., new algorithm
development, new processor and memory designs) and
ever changing requirements puts them in danger of
obsolescence soon after they are created -- applications
always want to take advantage of new technical
advances and must meet changed requirements. What
is needed are single chip systems that are designed to
be adaptable to a family of applications. The
emerging technology of configurable logic offers the
promise of large-scale silicon systems that are
adaptive after manufacture, with little or no sacrifice
in execution efficiency compared to hard-wired
systems.
-
An Energy-Conscious Exploration Methodology for Reconfigurable DSPs [p 341]
- J. Rabaey and M. Wan
As the 'system-on-a-chip' concept is rapidly becoming a
reality, time-to-market and product complexity push the
reuse of complex macromodules. Circuits combining a variety
of the macromodules (micro-processors, DSPs, programmable
logic and embedded memories) are being reported by
a number of companies [2]. Most of these systems target the
embedded market where speed, area, and power requirements
are paramount, and a balance between hardware and
software implementation is needed. Reconfigurable computing
devices have recently emerged as one of the major alternative
implementation approaches, addressing most of the
requirements outlined above.
-
Design of Future Systems [p 343]
- I. Page
This paper describes a vision in which future systems
consisting of novel hardware and software components
are designed and implemented by a single type
of professional engineer. That professional has more
in common with today's programmer than a hardware
designer, although both of these existing bodies of professionals
have a strong contribution to make to understanding, defining and
bringing about this transformation in product creation.
Moderators: Peter Schwarz, Fraunhofer EAS Dresden, Germany,
H. Fleurkens, Philips Research Laboratories, The Netherlands
-
AFTA: A Formal Delay Model for Functional Timing Analysis [p 350]
- V. Chandramouli, J.P. Whittemore, and K.A. Sakallah
Despite its importance, we find that a rigorous theoretical
foundation for performing timing analysis has been lacking
so far. As a result, we have initiated a research project that
aims to provide such a foundation for functional timing
analysis. As part of this work we have developed an
abstract automaton-based delay model that accounts for
the various analog factors affecting delay, such as signal
slopes, near-simultaneous switching, etc., while at the same
time accounting for circuit functionality. This paper presents
this delay model.
-
Power-Simulation of Cell Based ASICs: Accuracy- and Performance Trade-Offs
[p 356]
- D. Rabe, G. Jochens, L. Kruse, and W. Nebel
Within this paper the gate-level power-simulation tool
GliPS (Glitch Power Simulator) is presented, which gives
excellent accuracy (in the range of transistor-level simulators)
at high performance. The high accuracy is achieved
by putting emphasis on delay- and power-modelling. The
impact of these modelling factors on accuracy and performance
is demonstrated by comparing GliPS to other
tools on circuit-level and a simple toggle count based
power simulator TPS on gate level.
-
Advanced Optimistic Approaches in Logic Simulation [p 362]
- S. Schmerler, Y. Tanurhan, and K.D. Müller-Glaser
This paper presents the optimistic synchronization mechanism Predictive
Time Warp (PTW), based on the Time Warp implementation of the Virtual
Time paradigm, for use in the simulation of electronic systems and high
level system simulation. In comparison to most existing
approaches extending and improving classical Time
Warp, the aim of this development was to reduce the rollback frequency
of optimistic logical processes without imposing waiting periods. Part
of PTW is the introduction of forecast events, which predict a certain
period in the future and thus reduce the rollback probability. The benefit
of the PTW synchronization approach is shown using the example of a
distributed logic simulation.
Moderators: F. Kurdahi, University of California, Irvine, USA,
A. Jerraya, TIMA, Grenoble, France
-
PSCP: A Scalable Parallel ASIP Architecture for Reactive Systems [p 370]
- A. Pyttel, A. Sedlmeier, and C. Veith
We describe a Codesign approach based on a parallel
and scalable ASIP architecture, which is suitable for the
implementation of reactive systems. The specification language
of our approach is extended statecharts. Our ASIP
architecture is scalable with respect to the number of processing
elements as well as parameters such as bus widths
and register file sizes. Instruction sets are generated from
a library of components covering a spectrum of space/time
trade-off alternatives. Our approach features a heuristic
static timing analysis step for statecharts. An industrial example
requiring the real-time control of several stepper
motors illustrates the benefits of our approach.
-
A Constraint Driven Approach to Loop Pipelining and Register Binding [p 377]
- B. Mesman, M. Strik, A.H. Timmer, J.L. van Meerbergen, and
J.A.G. Jess
Code generation methods for DSP applications are
hampered by the combination of tight timing constraints
imposed by the performance requirements of DSP
algorithms, and resource constraints imposed by a
hardware architecture. In this paper, we present a
method for register binding and instruction scheduling
based on the exploitation and analysis of resource and
timing constraints. The analysis identifies sequencing
constraints between operations additional to the precedence
constraints. Without the explicit modeling of these
sequencing constraints, a scheduler is often not capable
of finding a solution that satisfies the timing, resource
and register constraints. The presented approach results
in an efficient method of obtaining high quality
instruction schedules with low register requirements.
-
Multiple Behavior Module Synthesis Based on Selective Groupings [p 384]
- J.-H. Yi, H. Choi, I.-C. Park, S.H. Hwang, and C.-M. Kyung
In this paper, we present an approach to synthesize multiple
behavior modules. Given n DFGs to be implemented, the previous
methods scheduled each of them sequentially, and implemented
them as a single module. Though the method is appropriate
for sharing the functional units, it ignored the following
two aspects: 1) different interconnection patterns among DFGs
can increase the interconnection area and delay of the critical
path, 2) the sequential scheduling of DFGs has a difficulty in
considering the effects on the other DFGs not scheduled yet. We
show an efficient way to solve the problems using a selective
grouping method and the extensions of the traditional scheduling
methods. Experiments show that the proposed method reduces interconnection
area and meets the timing constraints better than
the previous methods.
-
Optimal Temporal Partitioning and Synthesis for Reconfigurable Architectures
[p 389]
- M. Kaul and R. Vemuri
We develop a 0-1 non-linear programming (NLP) model for combined temporal
partitioning and high-level synthesis from behavioral specifications
destined to be implemented on reconfigurable processors. We present
tight linearizations of the NLP model. We present effective variable
selection heuristics for a branch and bound solution of the derived
linear programming model. We show how tight linearizations combined
with good variable selection techniques during branch and bound yield
optimal results in relatively short execution times.
Moderators: M.D.F. Wong, University of Texas at Austin, USA,
F.M. Johannes, Technical University of Munich, Germany
-
An Effective General Connectivity Concept for Clustering [p 398]
- J. Song, Z. Shen, and W. Zhuang
This paper shows how algorithmic techniques and parallel processing can speed
up general connectivity computation. A new algorithm, called Concurrent
Group Search Algorithm (CGSA), is proposed that divides N(N-1)/2 vertex
pairs into N-1 groups. Within each group general connectivities of all pairs
can be calculated concurrently. Our experimental results show that this
technique can achieve speedup of 12 times for one circuit. In addition,
group computations are parallelized on a 16-node IBM SP2 with a speedup of
14 times over its serial counterpart observed. Combining the two approaches
could result in a total speedup of up to 170 times, reducing CPU time from
over 200 hours to 1.2 hours for one circuit. Our new model is better than
those without clustering because it characterizes the connection graph more
accurately, is faster to compute and produces better results. The best
performance improvements are 43% for one circuit and 49% for another.
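
The abstract does not state how CGSA forms its groups. As an illustration only, the sketch below shows one simple partition with the stated counts: the N(N-1)/2 unordered vertex pairs are split into N-1 groups by their lower-numbered vertex. The function name `group_pairs_by_first_vertex` is hypothetical, and the paper's actual grouping rule may differ.

```python
def group_pairs_by_first_vertex(n):
    """Split all unordered pairs (i, j), i < j, of vertices 0..n-1
    into n-1 groups, one group per lower-numbered vertex i.

    This is an assumed grouping for illustration, not necessarily
    the one used by CGSA.
    """
    return [[(i, j) for j in range(i + 1, n)] for i in range(n - 1)]
```

The quoted combined figure is consistent with multiplying the two reported speedups: 12 x 14 = 168, close to the stated 170, and 200 hours / 170 is roughly 1.2 hours.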
-
Improved Approximation Bounds for the Group Steiner Problem [p 406]
- C.S. Helvig, G. Robins, and A. Zelikovsky
Given a weighted graph and a family of k disjoint
groups of nodes, the Group Steiner Problem asks for
a minimum-cost routing tree that contains at least
one node from each group. We give polynomial-time
O(k^ε)-approximation algorithms for arbitrarily small
values of ε > 0, improving on the previously known
O(k^(1/2))-approximation. Our techniques also solve the
graph Steiner arborescence problem with an O(k) approximation
bound. These results are directly applicable to a practical
problem in VLSI layout, namely the routing of nets with
multi-port terminals. Our Java implementation is available
on the Web.
-
An Interactive Router for Analog IC Design [p 414]
- T. Adler and J. Scheible
We present an interactive two layer router integrated
in an analog IC design environment used in an SDL
(schematic driven layout) design flow. Special features are
its customizability, the treatment of arbitrary polygons and
an advanced handling of source/target polygons in order
to avoid net internal design rule violations during
connection phase.
A global routing algorithm is used to split the route
into separate parts each routable in a single layer. After
via placement a specialized maze router performs the
advanced single layer routes in 90 or 45 degree mode. The
resulting route can be modified by interactive via movement
and rerouting of obsolete partial routes.
Organizers: Wolfgang Rosenstiel, University of Tübingen, Germany
Gerry Musgrave, Brunel University, UK
Moderator: Gerry Musgrave, Brunel University, UK
Panelists: Dominique Borrione, TIMA-UJF, France
Antun Domic, Synopsys, USA
Ramayya Kumar, Verysys, Germany
Alan Page, Abstract Design Automation, UK
Michael Payer, Siemens, Germany
-
Formal Verification: A New Standard CAD Tool for the Industrial Design Flow
[p 422]
- W. Rosenstiel
Formal verification has been the province of academic research for many years.
More recently tools have become available from vendors to tackle some aspects
of the design verification problems. There have been considerable learning
scenarios in order to understand how this technique can fit in the real
industrial design flow. The Panel, consisting of academics, vendors and users,
will endeavour to clarify what these tools can do, what their potential will be
and the experiences to date in helping validate today's complex designs.
Moderators: J. Forrest, UMIST, Manchester, UK
M. Pfaff, Johannes Kepler University Linz, Austria
-
A System-Level Co-Verification Environment for ATM Hardware Design [p 424]
- G. Post, A. Müller, and T. Grötker
Common approaches to hardware implementation of
networking components start at the VHDL level and are
based on the creation of regression test benches to perform
simulative validation of functionality. The time needed to
develop test benches has proven to be a significant bottleneck
with respect to time-to-market requirements. In this
paper, we describe the coupling of a telecommunication
network simulator with a VHDL simulator and a hardware
test board. This co-verification approach enables the designer
of hardware for networking components to verify
the functional correctness of a device under test against
the corresponding algorithmic description and to perform
functional chip verification by reusing test benches from a
higher level of abstraction.
-
FRIDGE: A Fixed-Point Design and Simulation Environment [p 429]
- H. Keding, M. Willems, M. Coors, and H. Meyr
Digital systems, especially those for mobile applications
are sensitive to power consumption, chip size and
costs. Therefore they are realized using fixed-point architectures,
either dedicated HW or programmable DSPs. On
the other hand, system design starts from a floating-point
description. These requirements have been the motivation
for FRIDGE, a design environment for the specification,
evaluation and implementation of fixed-point systems.
FRIDGE offers a seamless design flow from a floating-point
description to a fixed-point implementation. Within
this paper we focus on two core capabilities of FRIDGE:
(1) the concept of an interactive, automated transformation
of floating-point programs written in ANSI C into
fixed-point specifications, based on an interpolative approach.
The design time reductions that can be achieved
make FRIDGE a key component for an efficient HW/SW-CoDesign.
(2) a fast fixed-point simulation that performs comprehensive
compile-time analyses, reducing simulation time
by one order of magnitude compared to existing approaches.
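
To make the floating-point to fixed-point step concrete, here is a minimal sketch of what quantizing a single value to a signed fixed-point format involves. It does not reproduce FRIDGE's interpolative transformation; the function name `to_fixed`, the word-length convention (`int_bits` includes the sign bit) and the choice of saturation on overflow are all assumptions made for illustration.

```python
def to_fixed(x, int_bits, frac_bits):
    """Quantize x to a signed two's-complement fixed-point value.

    int_bits counts the integer bits including the sign bit, and
    frac_bits counts the fractional bits (assumed convention).
    Out-of-range values saturate. Rounding uses Python's round(),
    which rounds exact ties to the nearest even integer.
    """
    scale = 1 << frac_bits
    total_bits = int_bits + frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    raw = round(x * scale)                  # integer code word
    raw = max(lowest, min(highest, raw))    # saturate on overflow
    return raw / scale                      # value the code word represents
```

With 1 integer bit and 7 fractional bits, for instance, the representable range is -1.0 to 127/128 in steps of 1/128, so choosing the word lengths per variable is the core design decision such a tool has to support.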
-
Verification by Simulation Comparison Using Interface Synthesis [p 436]
- C. Hansen, A. Kunzmann, and W. Rosenstiel
One of the main tasks within the high-level synthesis (HLS)
process is the verification problem to prove automatically
the correctness of the synthesis results. Currently, the results
are usually checked by simulation. In consequence,
both the behavioral specification and the HLS results have
to be simulated by the same set of test vectors. Due to the
HLS and the inherent changes in the cycle-by-cycle behaviour,
the synthesis results require an adaptation of the initial
test vector set. This reduces the advantage gained by
using the automated HLS process. In order to decrease
these simulation efforts, in this paper a new method will be
presented that enables the usage of the same simulation
vectors at both abstraction levels and the execution of an
automated simulation comparison.
Moderators: P. Marwedel, University of Dortmund, Germany,
A. Timmer, Philips Research Laboratories, The Netherlands
-
Layout-Driven High Level Synthesis for FPGA Based Architectures [p 446]
- M. Xu and F.J. Kurdahi
In this paper, we address the problem of layout-driven
scheduling-binding as these steps have a direct relevance
on the final performance of the design. The importance
of effective and efficient accounting of layout effects is well-established
in High-Level Synthesis (HLS), since it allows
more efficient exploration of the design space and the generation
of solutions with predictable metrics. This feature is highly
desirable in order to avoid unnecessary iterations through the
design process. By producing not only an RTL netlist but also
an approximate physical topology of implementation at the chip
level, we ensure that the solution will perform at the predicted
metric once implemented, thus avoiding unnecessary delays in the
design process.
-
Cross-Level Hierarchical High-Level Synthesis [p 451]
- O. Bringmann and W. Rosenstiel
This paper presents a new approach to cross-level hierarchical
high-level synthesis. A methodology is presented
that supports the efficient synthesis of hierarchically specified
systems while preserving the hierarchical structure. After
synthesis of each subsystem the determined component
schedule and the synthesized RT-structure are added to its
algorithmic specification. This provides an automatic selection
of optimized complex components. Furthermore, the
component schedule enables the sharing of unused subcomponents
across different hierarchical levels of the design.
-
An Algorithm to Determine Mutually Exclusive Operations in Behavioral
Descriptions [p 457]
- J. Li and R.K. Gupta
Scheduling and binding are two major tasks in architectural
synthesis from behavioral descriptions. The information about
the mutually exclusive pairs of operations is very useful in
reducing both the total delay of the schedule and the resource
usage in the final circuit implementation. In this paper, we
present an algorithm to identify the largest set of mutually
exclusive operation pairs in behavioral descriptions. Our algorithm
uses dataflow analysis on a tabular model of system functionality,
and is shown to work better than the existing methods for identifying
mutually exclusive operations.
Moderators: R. Peset Llopis, Philips Research Laboratories, The Netherlands,
B. Schürmann, University of Kaiserslautern, Germany
-
A Performance-Driven MCM Router with Special Consideration of
Crosstalk Reduction [p 466]
- D. Wang and E.S. Kuh
This paper presents a new performance-driven
MCM router, named MRC, with special consideration
of crosstalk reduction. Router MRC completes an initial
routing with an adequate performance trade-off including
wire length, vias, number of layers, timing and
crosstalk. Then a crosstalk reduction algorithm is used
to make the routing solution crosstalk-free without big
influence on other routing performances. Thus, efficiently
handling timing and crosstalk problems becomes the unique
feature of MRC. Router MRC has been implemented and tested
on MCM benchmarks and the experimental results are very
promising.
-
Interconnect Tuning Strategies for High-Performance ICs [p 471]
- A.B. Kahng, S. Muddu, E. Sarto, and R. Sharma
Interconnect tuning is an increasingly critical degree of freedom in the
physical design of high-performance VLSI systems. By interconnect
tuning, we refer to the selection of line thicknesses, widths and spacings
in multi-layer interconnect to simultaneously optimize signal distribution,
signal performance, signal integrity, and interconnect manufacturability
and reliability. This is a key activity in most leading-edge
design projects, but has received little attention in the literature. Our
work provides the first technology-specific studies of interconnect tuning
in the literature. We center on global wiring layers and interconnect
tuning issues related to bus routing, repeater insertion, and choice
of shielding/spacing rules for signal integrity and performance. We address
four basic questions. (1) How should width and spacing be allocated
to maximize performance for a given line pitch? (2) For a given
line pitch, what criteria affect the optimal interval at which repeaters
should be inserted into global interconnects? (3) Under what circumstances
are shield wires the optimum technique for improving interconnect
performance? (4) In global interconnect with repeaters, what other
interconnect tuning is possible? Our study of question (4) demonstrates
a new approach of offsetting repeater placements that can reduce worst-case
cross-chip delays by over 30% in current technologies.
-
A Polynomial Time Optimal Algorithm for Simultaneous Buffer
and Wire Sizing [p 479]
- C.C.N. Chu and D.F. Wong
An interconnect joining a source and a sink is divided
into fixed-length uniform-width wire segments, and some
adjacent segments have buffers in between. The problem
we considered is to simultaneously size the buffers and
the segments so that the Elmore delay from the source to
the sink is minimized. Previously, no polynomial time algorithm
for the problem has been reported in the literature.
In this paper, we present a polynomial time algorithm
SBWS for the simultaneous buffer and wire sizing problem.
SBWS is an iterative algorithm with guaranteed
convergence to the optimal solution. It runs in quadratic
time and uses constant memory for computation. Also,
experimental results show that SBWS is extremely efficient
in practice. For example, for an interconnect of
10000 segments and buffers, the CPU time is only 0.127
seconds.
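
The objective being minimized can be illustrated with a small sketch that evaluates the Elmore delay of an unbuffered chain of uniform-width wire segments. This is not SBWS itself: there are no buffers and no optimization, and the wire model (resistance inversely proportional to width, capacitance proportional to width, each segment's capacitance split evenly between its two ends) as well as all parameter names are assumptions made for illustration.

```python
def elmore_delay(widths, seg_len, r_unit, c_unit, r_driver, c_load):
    """Elmore delay from a driver through a chain of wire segments.

    Assumed model: segment i has resistance r_unit * seg_len / w_i and
    capacitance c_unit * seg_len * w_i. widths[0] is the segment at the
    source, widths[-1] the segment at the sink.
    """
    res = [r_unit * seg_len / w for w in widths]
    cap = [c_unit * seg_len * w for w in widths]
    downstream = c_load        # capacitance beyond the current segment
    delay = 0.0
    # Walk from the sink back to the source, accumulating
    # resistance times the capacitance it has to charge.
    for r, c in zip(reversed(res), reversed(cap)):
        delay += r * (c / 2 + downstream)
        downstream += c
    delay += r_driver * downstream
    return delay
```

Under this model, placing the wider segment near the source gives a smaller delay than the reverse order, which is why sizing each segment individually pays off.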
Co-ordinators: Wolfgang Rosenstiel, University of Tübingen, Germany
Joachim Kunkel, Synopsys, USA
Moderator: Joachim Kunkel, Synopsys, USA
Panelists: Misha Burich, Cadence/Alta, USA
Raul Camposano, Synopsys, USA
Mark Genoe, Alcatel, Belgium
Lev Markov, Mentor Graphics, USA
Steve Schulz, Texas Instruments, USA
-
Next Generation System Level Design Tools [p 488]
- W. Rosenstiel
This panel discusses the requirements for the next generation system design tools and presents the
latest developments from the industrial leaders. Attendees are representatives from system houses as
well as from the electronic system design automation companies.
The panel is chaired by Joachim Kunkel, Director Engineering for System Level Design Tools at
Synopsys.
The electronic system design companies are represented by Misha Burich, VP Engineering from the
Alta Group of Cadence, Raul Camposano, Senior VP and General Manager for the Design Tools
Group of Synopsys and Lev Markov, Chief Scientist for system level co-design of Mentor Graphics.
Marc Genoe, Chairman of the System Level Design and Verification Working Group of the Virtual
Socket Interface Alliance will discuss the standardization process with respect to system level design.
Steve Schulz, Texas Instruments, the initiator of the System Level Design Language initiative, will
present the status of this recent development. In addition, system house representatives will discuss
future requirements for system level design tools.
Moderators: M. Sachdev, Philips Research Laboratories, The Netherlands,
B. Straube, FhG IIS/EAS Dresden, Germany
-
Estimation of the Defective IDDQ Caused by Shorts in
Deep-Submicron CMOS ICs [p 490]
- R. Rodríguez-Montanés and J. Figueras
The defective IDDQ in deep-submicron full complementary
MOS circuits with shorts is estimated. High
performance and also low power scenarios are considered.
The technology scaling, including geometry
reductions of the transistor dimensions, power supply
voltage reduction, carrier mobility degradation and velocity
saturation, is modeled. By means of the characterization of
the saturation current of a simple MOSFET, a lower bound of
the defective IDDQ consumption versus Leff is found. A lower bound on the
quiescent current consumption is evaluated for intragate shorts and for
intergate shorts affecting at least one logic node.
The methodology is used to estimate the IDDQ distribution, for a
given input vector, of defective circuits. This IDDQ estimation
allows the determination of the threshold value to be used for the
faulty/fault-free circuit classification.
-
A Fully Digital Controlled Off-Chip IDDQ Measurement Unit [p 495]
- B. Straka, H. Manhaeve, J. Vanneuville, and M. Svajda
This paper describes a new Digital controlled Off-Chip IDDQ
Measurement Unit (DOCIMU), which provides reliable precision and relatively
fast measurements, even with a high capacitive load, while the Device Under
Test (DUT) is unaffected. The maximal resolution is 50nA and the accurate
measurement range is 1mA. Unlike other IDDQ monitors, the
DOCIMU copes with external interference, as it needs no analogue pin to
set the IDDQ limit and the noise at the VDD is eliminated
via a special S/H feature. The DOCIMU is also a testable IDDQ
monitor, which is another unique feature.
-
March Tests for Word-Oriented Memories [p 501]
- A.J. van de Goor and I.B.S. Tlili
Most memory test algorithms are optimized tests for a
particular memory technology and a particular set of
fault models, under the assumption that the memory is
bit-oriented; i.e., read and write operations affect only
a single bit in the memory. Traditionally, word-oriented
memories have been tested by repeated application of a
test for bit-oriented memories whereby a different data
background (which depends on the used intra-word fault
model) is used during each iteration. This results in time
inefficiencies and limited fault coverage.
A new approach for testing word-oriented memories
is presented, distinguishing between inter-word and
intra-word faults and allowing for a systematic way of
converting tests for bit-oriented memories to tests for
word-oriented memories. The conversion consists of
concatenating the bit-oriented test for inter-word faults
with a test for intra-word faults. This approach results in
more efficient tests with complete coverage of the targeted
faults. Because most memories have an external data path
which is wider than one bit, word-oriented memory tests
are very important.
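The conversion described above can be illustrated with a minimal sketch. It assumes MATS+ as the bit-oriented march test (applied word-wide with all-0 and all-1 patterns for the inter-word part) and a simple write/read-back of data backgrounds for the intra-word part. The `Memory` model, the single stuck-at fault injection, and the choice of backgrounds are hypothetical illustrations and are not taken from the paper.

```python
from math import ceil, log2

def data_backgrounds(width):
    """Illustrative backgrounds: all-0 plus one pattern per index bit."""
    bgs = [0]
    for k in range(ceil(log2(width))):
        # bit i of the background equals bit k of the bit index i
        bgs.append(sum(((i >> k) & 1) << i for i in range(width)))
    return bgs

class Memory:
    """Toy word-oriented memory with an optional stuck-at bit (assumption)."""
    def __init__(self, words, width, stuck_at=None):
        self.width = width
        self.mask = (1 << width) - 1
        self.cells = [0] * words
        self.stuck_at = stuck_at  # hypothetical fault: (address, bit, value)

    def _apply_fault(self, addr, value):
        if self.stuck_at and self.stuck_at[0] == addr:
            _, bit, v = self.stuck_at
            value = (value | (1 << bit)) if v else (value & ~(1 << bit))
        return value

    def write(self, addr, value):
        self.cells[addr] = self._apply_fault(addr, value & self.mask)

    def read(self, addr):
        return self._apply_fault(addr, self.cells[addr])

def mats_plus(mem, zero, one):
    """MATS+ {any(w0); up(r0,w1); down(r1,w0)} using word-wide patterns."""
    n = len(mem.cells)
    for a in range(n):                 # any order: write 0
        mem.write(a, zero)
    for a in range(n):                 # ascending: read 0, write 1
        if mem.read(a) != zero:
            return False
        mem.write(a, one)
    for a in reversed(range(n)):       # descending: read 1, write 0
        if mem.read(a) != one:
            return False
        mem.write(a, zero)
    return True

def word_oriented_test(mem):
    """Concatenate an inter-word test with a simplified intra-word test."""
    if not mats_plus(mem, 0, mem.mask):          # inter-word part
        return False
    for bg in data_backgrounds(mem.width):       # intra-word part
        for pattern in (bg, bg ^ mem.mask):      # background and complement
            for a in range(len(mem.cells)):
                mem.write(a, pattern)
                if mem.read(a) != pattern:
                    return False
    return True
```

For example, `word_oriented_test(Memory(8, 4))` passes, while a memory created with `stuck_at=(3, 2, 1)` fails in the inter-word part.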
Moderators: J. Bausells, CNM, Barcelona, Spain,
M. Glesner, Technical University of Darmstadt, Germany
-
A Modeling Approach to Include Mechanical Microsystem Components
into the System Simulation [p 510]
- R. Neul, U. Becker, G. Lorenz, P. Schwarz, J. Haase, and
S. Wünsche
For MEMS devices modern technologies are used to
integrate very complex components and subsystems
closely together. Due to mixed-domain problems as well
as the occurring interactions between the closely coupled
system components the design is a sophisticated process.
The interactions between the MEMS components have to
be analysed by system simulation already in an early
design stage. In this paper a modeling approach is introduced
that enables the incorporation of mechanical
microsystem components into the system simulation using
network and system simulators like SABER. The
approach is based on multi-terminal models of basic
mechanical elements and their composition to more complex
microsystems. First results for a micromechanical
resonator are presented.
-
Fast Field Solvers for Thermal and Electrostatic Analysis [p 518]
- V. Székely and M. Rencz
Two different field solver tools have been developed
in order to facilitate fast thermal and electrostatic
simulation of microsystem elements. The µS-THERMANAL
program is capable of fast steady-state
and dynamic simulation of suspended multilayered
microsystem structures. The 2D-SUNRED program is the
first version of a general field solver based on an original
method, the successive node reduction. SUNRED offers a
very fast and accurate substitute for FEM programs for the
solution of the Poisson equation. Steady-state and
dynamic simulation examples demonstrate the usability of
the novel tool.
-
Microsystems Testing: An Approach and Open Problems [p 524]
- M. Lubaszewski, E.F. Cota, and B. Courtois
In this work a Computer-Aided Testing (CAT) tool is
proposed that brings a systematic way of dealing with
testing problems in emerging microsystems. Experiments
with case-studies illustrate the techniques and tools
embedded in the CAT environment. Some of the open
problems that shall be addressed in the near future as an
extension to this work are also discussed.
Moderators: F.M. Johannes, Technical University of Munich, Germany,
J. Koehl, IBM Deutschland Entwicklung GmbH, Germany
-
Reduced-Order Modeling of Large Linear Passive Multi-Terminal Circuits
Using Matrix-Padé Approximation [p 530]
- R.W. Freund and P. Feldmann
This paper introduces SyMPVL, an algorithm for the approximation of the
symmetric multi-port transfer function of an RLC circuit. The algorithm
employs a symmetric block-Lanczos algorithm to reduce the original circuit
matrices to a pair of typically much smaller, banded, symmetric matrices.
These matrices determine a matrix-Padé approximation of the
multi-port transfer function, and can serve as a reduced-order model of
the original circuit. They can be "stamped" directly into the Jacobian
matrix of a SPICE-type circuit simulator, or can be used to synthesize an
equivalent smaller circuit. We also prove stability and passivity of the
reduced-order models in the RL, RC, and LC special cases, and report
numerical results for SyMPVL applied to example circuits.
-
An Efficient Algorithm for Fast Parasitic Extraction and Passive
Order Reduction of 3D Interconnect Models [p 538]
- N. Marques, M. Kamon, J. White, and L.M. Silveira
As VLSI circuit speeds have increased, the need for accurate
three-dimensional interconnect models has become
essential to accurate chip and system design. In this paper,
we describe an integral equation approach to modeling the
impedance of interconnect structures accounting for both
the charge accumulation on the surface of conductors and
the current traveling along conductors. Unlike previous
methods, our approach is based on a modified nodal analysis
formulation and can be used directly to generate guaranteed
passive low order interconnect models for efficient
inclusion in a standard circuit simulator.
-
MCM Interconnect Design Using Two-Pole Approximation [p 544]
- J. Shao and R.M.M. Chen
In this paper, an optimization scheme is proposed for
interconnect design with wire width and series resistance
being design variables. Due to the distributed nature of
interconnects, poles of such systems are transcendental and
infinite in number. First, a two-pole approximation is used
to capture the system behavior. Lower-order moments are
employed to obtain two approximate dominant poles. Then,
the two parameters, damping ratio and natural undamped
frequency, are expressed as functions of the two dominant
poles. Since the output response is characterized by the
two parameters, the parameters are used to define the
objective function and constraints, which form a constrained
multivariable nonlinear optimization problem. After that,
the optimization problem is solved using the gradient projection
method. One advantage of our approach is the ability to
explicitly control the maximum overshoot of the observation
points. Two numerical examples are given.
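The step from two dominant poles to damping ratio and natural undamped frequency can be sketched as follows. This assumes the standard second-order form H(s) = ωn² / (s² + 2ζωn·s + ωn²) with both poles in the left half-plane, and uses the textbook step-response overshoot formula for the underdamped case. The function names are hypothetical, and the paper's moment-matching and optimization steps are not reproduced.

```python
import math

def second_order_params(p1, p2):
    """Return (zeta, wn) for poles p1, p2 (real, or a complex-conjugate pair)."""
    wn = math.sqrt((p1 * p2).real)           # wn^2 = p1 * p2
    zeta = -(p1 + p2).real / (2.0 * wn)      # 2 * zeta * wn = -(p1 + p2)
    return zeta, wn

def max_overshoot(zeta):
    """Fractional step-response overshoot; zero when not underdamped."""
    if zeta >= 1.0:
        return 0.0
    return math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta ** 2))

# Complex-conjugate pair -1 +/- j: zeta = 1/sqrt(2), wn = sqrt(2)
zeta, wn = second_order_params(complex(-1, 1), complex(-1, -1))
overshoot = max_overshoot(zeta)
```

Constraining `zeta` from below is one way to bound the overshoot at an observation point, since the overshoot decreases monotonically as the damping ratio grows.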
Moderators: M. Servit, Czech Technical University, Czech Republic,
R. Peset Llopis, Philips Research Laboratories, The Netherlands
-
Design-Manufacturing Interface: Part I -- Vision [p 550]
- W. Maly, H.T. Heineken, J. Khare, and P.K. Nag
This paper proposes a vision for a new research
domain emerging on the interface between design and
manufacturing of VLSI circuits. The key objective of
this domain is the minimization of the mismatch
between design and manufacturing which is rapidly
growing with the increase in complexity of VLSI
designs and IC technologies. This broad objective is partitioned
into a number of specific tasks. One of
the most important tasks is the extraction of VLSI design
attributes that may be relevant from a manufacturing
efficiency standpoint. The second task is yield analysis
performed to detect process and design attributes
responsible for inadequate yield. This paper postulates
both an overall change in the design-manufacturing
interface and a methodology to address the growing
design-manufacturing mismatch. Attributes of a
number of tools needed for this purpose are discussed as
well.
-
Design-Manufacturing Interface: Part II -- Applications [p 557]
- W. Maly, H.T. Heineken, J. Khare, P.K. Nag, P. Simon, and C. Ouyang
This paper illustrates, via examples, problems at the
design-manufacturing interface that exist in the IC industry
today, and the ability of the YAN/PODEMA framework [1]
to solve these problems. The need for further development
of the framework is also emphasized.
-
Performance-Manufacturability Tradeoffs in IC Design [p 563]
- H.T. Heineken and W. Maly
Traditional VLSI design objectives are to minimize
time-to-first-silicon while maximizing performance. Such
objectives lead to designs which are not optimum from a manufacturability
perspective. The objective of this paper is to illustrate the above
claim by performing performance/manufacturability tradeoff analysis. The
basis for such an analysis, in which the relationship between a product's
clock frequency and wafer productivity is modeled, is described in detail.
New applied yield models are discussed as well.
Moderators: C. Landrault, LIRMM, France,
D. Medina, Italtel, Italy
-
Fast Sequential Circuit Test Generation Using High-Level and
Gate-Level Techniques [p 570]
- E.M. Rudnick, R. Vietti, A. Ellis, F. Corno,
P. Prinetto, and M. Sonza Reorda
A new approach for sequential circuit test generation
is proposed that combines software testing based
techniques at the high level with test enhancement techniques
at the gate level. Several sequences are derived to
ensure 100% coverage of all statements in a high-level
VHDL description, or to maximize coverage of paths.
The sequences are then enhanced at the gate level to
maximize coverage of single stuck-at faults. High fault
coverages have been achieved very quickly on several
benchmark circuits using this approach.
-
State Relaxation Based Subsequence Removal for Fast Static Compaction
in Sequential Circuits [p 577]
- M.S. Hsiao and S.T. Chakradhar
We extend the subsequence removal technique to provide
significantly higher static compaction for sequential
circuits. We show that state relaxation techniques
can be used to identify more or larger cycles in a test
set. State relaxation creates more opportunities for
subsequence removal and hence, results in better compaction.
Relaxation of a state is possible since not all
memory elements in a finite state machine have to be
specified for a state transition. The proposed technique
has several advantages: (1) test sets that could
not be compacted by existing subsequence removal
techniques can now be compacted, (2) the size of cycles
in a test set can be significantly increased by state
relaxation and removal of the larger sized cycles leads
to better compaction, (3) only two fault simulation
passes are required as compared to trial and re-trial
methods that require multiple fault simulation passes,
and (4) significantly higher compaction is achieved in
short execution times as compared to known subsequence removal
methods. Experiments on ISCAS89 sequential benchmark circuits
and several synthesized circuits show that the proposed
technique consistently results in significantly higher
compaction in short execution times.
-
Procedures for Static Compaction of Test Sequences for Synchronous
Sequential Circuits Based on Vector Restoration [p 583]
- R. Guo, I. Pomeranz, and S.M. Reddy
We propose several compaction procedures for synchronous
sequential circuits based on test vector restoration.
Under a vector restoration procedure, all or most of
the test vectors are first omitted from the test sequence.
Test vectors are then restored one at a time or in subsequences
only as necessary to restore the fault coverage of
the original sequence. Techniques to speed-up the restoration
process are investigated. These include limiting the
test vectors initially omitted from the test sequence, consideration
of several faults in parallel during restoration,
and the use of a parallel fault simulator.
Moderators: J. van Meerbergen, Philips Research Laboratories,
The Netherlands, H. Hermanani, Lebanese American University, Lebanon
-
Architectural Simulation in the Context of Behavioral Synthesis [p 590]
- A. Jemai, P. Kission, and A.A. Jerraya
This paper deals with integrating an interactive
simulator within a behavioral synthesis tool, thereby
allowing concurrent synthesis and simulation. The
resulting environment provides a cycle based simulation
of a behavioral module under synthesis. The simulator
and the behavioral synthesis are based on a single
model that links the behavioral description to
the architecture produced by synthesis. The basic
simulation-synthesis model is extended in order to allow
for concurrent architectural simulation of several
modules under synthesis.
This paper also discusses an implementation of this
concept resulting in a simulator, called AMIS. This tool
assists the designer in understanding the results of
behavioral synthesis and in architecture exploration. It
may also be used to debug the behavioral specification.
-
Scheduling of Outputs in Grammar-Based Hardware Synthesis of Data
Communication Protocols [p 596]
- J. Öberg, A. Kumar, and A. Hemani
We present a grammar based specification method for hardware
synthesis of data communication protocols in which
the specification is independent of the port size. Instead, it
is used during the synthesis process as a constraint. When
the width of the output assignments exceeds the chosen output
port width, the assignments are split and scheduled
over the available states. We present a solution to this problem
and results of applying it to some relevant problems.
-
Concurrent Error Recovery with Near-Zero Latency in Synthesized ASICs [p 604]
- S.N. Hamilton and A. Orailoglu
The importance of fault tolerant design has been
steadily increasing as reliance on error free electronics
continues to rise in critical military, medical, and
automated transportation applications. While rollback
and checkpointing techniques facilitate area efficient
fault tolerant designs, they are inapplicable to a large
class of time-critical applications. We have developed
a novel synthesis methodology that avoids rollback, and
provides both zero reduction in throughput and near-zero
error latency. In addition, our design techniques
reduce power requirements associated with traditional
approaches to fault tolerance.
Moderators: T. Filkorn, Siemens AG, Germany,
H. Eveking, Darmstadt University of Technology, Germany
-
Dynamic Minimization of Word-Level Decision Diagrams [p 612]
- S. Höreth and R. Drechsler
Word-Level Decision Diagrams (WLDDs), like
*BMDs and K*BMDs, have recently been introduced
as a data structure for verification. The size of
WLDDs largely depends on the chosen variable ordering,
i.e. the ordering in which variables are encountered,
and on the decompositions carried out in each
node. In this paper we present a framework for dynamic
minimization of WLDDs. We discuss the difficulties with
previous techniques if applied to WLDDs and present a new
approach that efficiently adapts both variable ordering and
decomposition type choice. Experimental results demonstrate
that this method out-performs "classical" reordering with
respect to run-time and representation size during dynamic
minimization of word-level functions.
-
Sequential Equivalence Checking without State Space Traversal [p 618]
- C.A.J. van Eijk
Because general algorithms for sequential equivalence
checking require a state space traversal of the product
machine, they are computationally expensive. In this paper,
we present a new method for sequential equivalence
checking which utilizes functionally equivalent signals to
prove the equivalence of both circuits, thereby avoiding the
state space traversal. The effectiveness of the proposed
method is confirmed by experimental results on retimed
and optimized ISCAS'89 benchmarks.
-
On the Reuse of Symbolic Simulation Results for Incremental Equivalence
Verification of Switch-Level Circuits [p 624]
- L. Ribas-Xirgo and J. Carrabina-Bordoll
Incremental methods are successfully applied to deal
with successive verifications of slightly modified switch-level networks.
That is, only those parts affected by the changes are symbolically
traversed for verification. In this paper, we present an incremental
technique for symbolic simulators which is inspired by both existing
incremental techniques for non-symbolic simulators and token-passing
mechanisms in Petri nets.
Organizer & Moderator: Erik Jan Marinissen, Philips Research Labs, The
Netherlands; co-organized in cooperation with IEEE's Design & Test of
Computers
Speakers: Karel van Doorselaer, Alcatel Telecom, Belgium
Sridhar Narayanan, Sun Microsystems, USA
Gert Jan van Rootselaar, Philips Research Labs, The Netherlands
-
Silicon Debug of Systems-on-Chips [p 632]
-
Modern semiconductor process technologies, advanced design tools, and the
reinvented reuse paradigm enable the design of very complex ICs. Some call
these ICs 'system-on-chip', referring to the fact that their functionality
could until recently only be implemented by one or several PCBs filled with
ICs. While it was always difficult to locate design errors, guaranteeing that
a deep sub-micron 'system-on-chip' is design error free is a real challenge.
Floating specifications, growing geographically-spread design teams,
time-to-market pressure, and the increasing distance of IC designers to actual
silicon all make it likely that 'buggy' hardware will become as common as
'buggy' software. Of course our industry does whatever is possible within
given time and money budgets to prevent design errors before first silicon.
To this end, techniques such as simulation, emulation, and formal verification are used.
However, all these techniques only deal with models of the IC, which do not
take into account all effects that might occur on real silicon, and high
computational costs often prevent exhaustive error coverage. In order to find
design errors before the customer does, debug of actual silicon samples is
inevitable. This Hot Topic session provides an overview of the
state-of-the-art in physical and electrical silicon debug. The speakers
address techniques currently in use, their applications and their limitations,
and the research challenges for the future.
Moderators: G. Gielen, Katholieke Universiteit Leuven, Belgium,
C. Descleves, Dolphin Integration, France
-
Hierarchical Characterization of Analog Integrated CMOS Circuits [p 636]
- J. Eckmüller, M. Gröpl, and H. Gräb
This paper presents a new method for hierarchical
characterization of analog integrated circuits. For each
circuit class, a fundamental set of performances is
defined and extracted topology-independently. A circuit
being characterized is decomposed into general
subcircuits. Sizing rules of these topology-independent
subcircuits are included into the characterization by
functional constraints. In this way, bad circuit sizing is
detected and located.
-
EASY -- A System for Computer-Aided Examination of Analog Circuits [p 644]
- G. Dröge, M. Thole, and E.-H. Horneber
The EASY analog design system includes a qualitative
analysis tool for examination of the principal aptitude of
a chosen circuit structure, as well as a symbolic analysis
component. It allows the deduction of compact but sufficiently
accurate design equations. These tools support the
first steps of the design process and give insight into the behavior
of the analog circuit.
-
A Formal Approach to Verification of Linear Analog Circuits with
Parameter Tolerances [p 649]
- L. Hedrich and E. Barke
This contribution presents an approach to formal
verification of linear analog circuits with parameter
tolerances. The method proves that an actual circuit
fulfills a specification in a given frequency interval for all
parameter variations. It is based on a curvature driven
bound computation for value sets using interval
arithmetic. Some examples demonstrate the feasibility of
our approach.
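As a rough illustration of bounding a value set with interval arithmetic, the sketch below encloses the squared gain of a first-order RC low-pass at a single frequency when R and C vary within tolerance intervals. The `Interval` class, the RC example, and the numeric values are assumptions chosen for illustration. The sketch uses plain interval arithmetic without outward rounding and does not implement the curvature driven bound computation of the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        other = _as_interval(other)
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        other = _as_interval(other)
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def reciprocal(self):
        if self.lo <= 0.0 <= self.hi:
            raise ZeroDivisionError("interval contains zero")
        return Interval(1.0 / self.hi, 1.0 / self.lo)

def _as_interval(x):
    """Promote a plain number to a degenerate interval."""
    return x if isinstance(x, Interval) else Interval(x, x)

def gain_squared(omega, R, C):
    """Enclosure of |H(j*omega)|^2 = 1 / (1 + (omega*R*C)^2) for an RC low-pass."""
    x = R * C * omega
    return (x * x + 1.0).reciprocal()

# Hypothetical tolerances: 1 kOhm +/- 10 %, 1 uF +/- 10 %, at omega = 1000 rad/s
R = Interval(900.0, 1100.0)
C = Interval(0.9e-6, 1.1e-6)
g = gain_squared(1000.0, R, C)

# The circuit meets a (hypothetical) specification |H|^2 >= 0.4 for every
# parameter combination if the lower bound of the enclosure does.
meets_spec = g.lo >= 0.4
```

Because the enclosure contains every reachable value, a specification that holds for the interval bounds holds for all parameter combinations at that frequency. The converse does not follow, since interval enclosures can be pessimistic.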
Moderators: A. ten Berg, Philips Research Laboratories, The Netherlands,
M. Berkelaar, Eindhoven University of Technology, The Netherlands
-
Synthesis of Wiring Signature-Invariant Equivalence Class Circuit
Mutants and Applications to Benchmarking [p 656]
- D. Ghosh, N. Kapur, J. Harlow III, and F. Brglez
This paper formalizes the synthesis process of wiring signature-invariant
(WSI) combinational circuit mutants. The signature σo is defined by a
reference circuit ηo, which itself is modeled as a canonical form
of a directed bi-partite graph. A wiring perturbation γ induces a perturbed
reference circuit ηγ. A number of mutant circuits ηγi can
be resynthesized from the perturbed circuit ηγ. The mutants
of interest are the ones that belong to the wiring-signature-invariant
equivalence class Nσo, i.e. the mutants ηγi ∈ Nσo.
Circuit mutants ηγi ∈ Nσo have a number of useful properties. For
any wiring perturbation γ, the size of the wiring-signature-invariant
equivalence class is huge. Notably, circuits in this class are
not random, although for unbiased testing and benchmarking
purposes, mutant selections from this class are typically random.
For each reference circuit, we synthesized eight equivalence
subclasses of circuit mutants, based on 0 to 100% perturbation.
Each subclass contains 100 randomly chosen mutant
circuits, each listed in a different random order. The 14,400
benchmarking experiments with 3200 mutants in 4 equivalence
classes, covering 13 typical EDA algorithms, demonstrate that
an unbiased random selection of such circuits can lead to statistically
meaningful differentiation and improvements of existing and new algorithms.
Keywords: signature-invariance, equivalence class, circuit
mutants, benchmarking.
-
Technology Mapping for Minimizing Gate and Routing Area [p 664]
- A. Lu, G. Stenz, and F.M. Johannes
This paper presents a technology mapping approach for
the standard cell technology, which takes into account both
gate area and routing area so as to minimize the total
chip area after layout. The routing area is estimated using
two parameters available at the mapping stage; one is
the fanout count of a gate, and the other is the "overlap
of fanin level intervals". To estimate the routing area in
terms of accurate fanout counts, an algorithm is proposed
which solves the problem of dynamic fanout changes in the
mapping process. This also enables us to calculate the gate
area more accurately. Experimental results show that this
approach provides an average reduction of 15% in the final
chip area after placement and routing.
-
Exploiting Symbolic Techniques for Partial Scan Flip Flop Selection [p 670]
- F. Corno, P. Prinetto, M. Sonza Reorda, and M. Violante
Partial Scan techniques have been widely accepted
as an effective solution to improve sequential ATPG
performance while keeping acceptable area and performance
overheads. Several techniques for flip-flop
selection based on structural analysis have been presented
in the literature. In this paper, we first propose a
new testability measure based on the analysis of the
circuit State Transition Graph through symbolic techniques.
We then describe a scan flip flop selection algorithm
exploiting this measure. We resort to the identification
of several circuit macros to address large sequential
circuits. When compared to other techniques,
our approach shows good results, especially when it is
used to optimize a set of flip-flops previously selected by
means of structural analysis.
Moderators: C. Piguet, CSEM, Switzerland,
E. Macii, Politecnico di Torino, Italy
-
Temperature Effect on Delay for Low Voltage Applications [p 680]
- J.M. Daga, E. Ottaviano, and D. Auvergne
This paper presents one of the first analyses of the
temperature dependence of CMOS integrated circuit
delay at low voltage. Based on a low-voltage extension of
Sakurai's α-power current law, a detailed analysis of the
temperature and voltage sensitivity of CMOS structure
delay is given. Coupling effects between temperature and
voltage are clearly demonstrated. Specific derating
factors are defined for the low voltage range (1-3 VT0).
Experimental validations are obtained on specific ring
oscillators integrated on a 0.7 µm process by comparing
the temperature and voltage evolution of the measured
oscillation period to the calculated one. A low
temperature sensitivity operating region has been clearly
identified and appears in excellent agreement with the
expected calculated values.
-
Data Driven Power Optimization of Sequential Circuits [p 686]
- Q. Wang and S.B.K. Vrudhula
In this paper we present an efficient technique to reduce
the power dissipation in a technology mapped CMOS
sequential circuit based on logic and structural transformations.
The power reduction is achieved by adding sequential redundancies
from low switching activity gates
to high switching activity gates (targets) such that the
switching activities at the output of the targets are significantly
reduced. We show that the power reducing transformations result
in a circuit that is a valid replacement
of the original. The notion of validity used here is that
of a delay safe replacement [11, 12]. The potential transformations
are found by direct logic implications applied
to the circuit netlist. Therefore the complexity of the proposed
transformation is polynomial in the size of the circuit, allowing
the processing of large designs.
-
Gated Clock Routing Minimizing the Switched Capacitance [p 692]
- J. Oh and M. Pedram
This paper presents a zero-skew gated clock routing technique
for VLSI circuits. The gated clock tree has masking gates at
the internal nodes of the clock tree, which are selectively
turned on and off by the gate control signals during the active
and idle times of the circuit modules to reduce switched
capacitance of the clock tree. This work extends the work of
[4] so as to account for the switched capacitance and the area
of the gate control signal routing. Various tradeoffs between
power and area for different design options and module
activities are discussed and detailed experimental results are
presented.
-
Exact and Approximate Estimation for Maximum Instantaneous Current
of CMOS Circuits [p 698]
- Y.-M. Jiang and K.-T. Cheng
We present an integer-linear-programming-based approach for
estimating the maximum instantaneous current
through the power supply lines for CMOS circuits. It
produces the exact solutions for the maximum instantaneous
current for small circuits, and tight upper bounds for
large circuits. We formulate the maximum instantaneous
current estimation problem as an integer linear programming
(ILP) problem, and solve the corresponding ILP formulae
to obtain the exact solution. For large circuits we
propose to partition the circuits, and apply our ILP-based
approach for each sub-circuit. The sum of the exact solutions
of all sub-circuits provides an upper bound of the
exact solution for the entire circuit. Our experimental
results show that the upper bounds produced by our
approach combined with the lower bounds produced by a
genetic-algorithm-based approach confine the exact solution
to a small range.
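The partitioning bound stated above can be checked on a toy example. In the sketch below, exhaustive enumeration of input vectors stands in for the ILP solver, and the two sub-circuit current models are hypothetical functions invented for illustration. The point is only that the sum of the per-sub-circuit maxima can never be smaller than the maximum of the summed current.

```python
from itertools import product

def max_current(current_fn, n_inputs):
    """Exact maximum by enumeration (a stand-in for solving the ILP)."""
    return max(current_fn(bits) for bits in product((0, 1), repeat=n_inputs))

# Hypothetical sub-circuit current models over three shared primary inputs
def sub_a(bits):
    return 3.0 * (bits[0] ^ bits[1])

def sub_b(bits):
    return 2.0 * (bits[1] & bits[2]) + 1.0 * (bits[0] & bits[1])

def whole(bits):
    return sub_a(bits) + sub_b(bits)

exact = max_current(whole, 3)                            # whole-circuit maximum
upper = max_current(sub_a, 3) + max_current(sub_b, 3)    # partitioned bound
```

Here the bound is not tight because the two sub-circuits reach their individual maxima for different input vectors, which is why a lower bound from another method is useful for confining the exact value.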
Co-ordinator: Ivo Bolsens, IMEC, Belgium
Moderator: Ivo Bolsens, IMEC, Belgium
Speakers: Norbert Wehn, University of Kaiserslautern, Germany
Soren Hein, Siemens, Germany
Francky Catthoor, IMEC, Belgium
Roelof Salters, Philips Research Labs, The Netherlands
-
Embedded DRAM Architectural Trade-Offs [p 704]
- N. Wehn and S. Hein
In this paper we discuss system-related aspects in embedded
DRAM/logic designs. We focus on large embedded
memories which have to be implemented as DRAMs.
-
Energy-Delay Efficient Data Storage and Transfer Architectures: Circuit
Technology Versus Design Methodology Solutions [p 709]
- F. Catthoor
Both in custom and programmable instruction-set
processors for data-dominated multi-media applications,
many of the architecture components are intended to solve
the data transfer and storage issues. Recent experiments at
several locations have clearly demonstrated that due to this
fact, the main power (and largely also area) cost is situated
in the memory units and the communication hardware. In
this paper, the main reasons for this problem will be reviewed
and a perspective will be provided on the expected
near-future evolution. It will be shown that the circuit and
process technology advances have been very significant in
the past decade. Still, these are not sufficient to fully solve
this power and area bottle-neck which has been created
in the same period. Therefore, several possible design
methodology remedies will also be proposed for this critical
design issue, with emphasis on effective system-level
memory management methodologies. These promise very
large savings in energy-delay and also in area for multi-media
applications, while still meeting the real-time constraints.
Moderators: J. Franca, IST, Lisbon, Portugal,
H. Kerkhoff, University of Twente, The Netherlands
-
Hierarchical Top-Down Design of Analog Sensor Interfaces: From
System-Level Specifications Down to Silicon [p 716]
- J. Vandenbussche, S. Donnay, F. Leyn, G. Gielen, and W. Sansen
The complete application of a hierarchical top-down
design methodology to analog sensor interface front-ends is
presented: from system-level specifications down to
implementation in silicon, including high-level synthesis,
analog block generation and layout generation.
A new approach for implementing accurate and fast
power/area estimators for the different blocks in the
architecture is described. These estimators provide the
essential link between the high-level synthesis and the block
generation in our hierarchical top-down methodology.
The methodology is illustrated by means of the design of a
complex and realistic example. Measurement results are
included.
-
A Systems Theoretic Approach to Behavioural Modeling and Simulation
of Analog Functional Blocks [p 721]
- R. Rosenberger and S.A. Huss
Analog simulation methodologies for the generation
of macromodels of analog functional blocks, as reported
in literature, are of limited use in practical circuit simulation
due to frequent accuracy and efficiency problems. In this paper,
a new approach to model the behaviour of nonlinear functional
blocks is proposed. The approach is based upon the principles of
systems theory. The outlined methodology supports the mapping of
models from component into behavioural level. The nonlinearity of
complex analog modules is reflected efficiently while the electrical
signals are maintained.
-
Switching Response Modeling of the CMOS Inverter for Sub-Micron Devices [p 729]
- L. Bisdounis, S. Nikolaidis, O. Koufopavlou, and C.E. Goutis
In this paper, an accurate analytical model for the
evaluation of the CMOS inverter delay in the sub-micron
regime is presented. A detailed analysis of the inverter
operation is provided, which results in accurate
expressions describing the output waveform. These
analytical expressions are valid for all the inverter
operation regions and input waveform slopes. They take
into account the influences of the short-circuit current
during switching, and the gate-to-drain coupling
capacitance. The presented model shows clearly the
influence of the inverter design characteristics, the load
capacitance, and the slope of the input waveform driving
the inverter on the propagation delay. The results are in
excellent agreement with SPICE simulations.
Moderators: M. Berkelaar, Eindhoven University of Technology,
The Netherlands, L. Stok, IBM T.J. Watson Research Center, USA
-
On Removing Multiple Redundancies in Combinational Circuits [p 738]
- S.-C. Chang, D.I. Cheng, and C.-W. Yeh
Redundancy removal is an important step in combinational logic optimization.
After a redundant wire is removed, other originally redundant wires may become
irredundant, and some originally irredundant wires may become redundant. When
multiple redundancies exist in a circuit, this creates a problem where we need
to decide which redundancy to remove first. In this paper, we present an
analysis and a very efficient heuristic to deal with multiple redundancies.
We associate with each redundant wire a Boolean function that describes how the
wire can remain redundant after removing other wires. When multiple
redundancies exist, this set of Boolean functions characterizes the global
relationship among redundancies.
-
Multi-Output Functional Decomposition with Exploitation of Don't Cares
[p 743]
- C. Scholl
Functional decomposition is an important technique in
logic synthesis, especially for the design of lookup table
based FPGA architectures.
We present a method for functional decomposition with
a novel concept for the exploitation of don't cares thereby
combining two essential goals: the minimization of the
number of decomposition functions in the current decomposition
step and the extraction of common subfunctions
for multi-output Boolean functions.
The exploitation of symmetries of Boolean functions
plays an important role in our algorithm as a means to
minimize the number of decomposition functions not only
for the current decomposition step but also for the (recursive)
decomposition algorithm as a whole. Experimental results prove
the effectiveness of our approach.
-
An Efficient Divide and Conquer Algorithm for Exact Hazard Free
Logic Minimization [p 749]
- J.W.J.M. Rutten, M.R.C.M. Berkelaar, C.A.J. van Eijk, and M.A.J.
Kolsteren
In this paper we introduce the first divide and conquer algorithm
that is capable of exact hazard-free logic minimization
in a constructive way. We compare our algorithm with
the method of Dill/Nowick, which was the only known method
for exact hazard-free minimization. We show that our algorithm
is much faster than the method proposed by Dill/Nowick
by avoiding a significant part of the search space. We
argue that the proposed algorithm is a promising framework
for the development of efficient heuristic algorithms.
-
Restructuring Logic Representations with Easily Detectable Simple
Disjunctive Decompositions [p 755]
- H. Sawada, S. Yamashita, and A. Nagoya
Simple disjunctive decomposition is a special case of
logic function decompositions, where variables are divided
into two disjoint sets and there is only one newly introduced
variable. This paper shows that many simple disjunctive
decompositions can be found easily by detecting symmetric
variables or checking variable cofactors. We also propose
an algorithm that constructs a new logic representation for
a simple disjunctive decomposition by assigning constant
values to variables in the original representation. The algorithm
enables us to apply the decomposition while keeping
good structures of the original representation. We have
performed experiments to restructure fanout free cones of
multi-level logic circuits, and obtained better results than
when not restructuring them.
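As a rough illustration of the cofactor check mentioned in the abstract above,
the following Python sketch tests whether a function admits a simple disjunctive
decomposition f(X1, X2) = g(h(X1), X2) for a given bound set X1. It relies on the
textbook criterion that such a decomposition exists exactly when the cofactors
of f over all assignments to X1 take at most two distinct values. It is not the
authors' algorithm, and the function name and calling convention are
hypothetical.

```python
from itertools import product


def has_simple_disjunctive_decomposition(f, n, bound):
    """Check whether f(x_0, ..., x_{n-1}) can be written as g(h(bound), free).

    f     : callable taking n arguments in {0, 1} and returning 0 or 1
    n     : number of input variables
    bound : indices of the variables that would feed the single new function h
    """
    free = [i for i in range(n) if i not in bound]
    cofactors = set()
    # Enumerate every assignment to the bound variables.
    for b in product((0, 1), repeat=len(bound)):
        column = []
        # Record the cofactor as a truth-table column over the free variables.
        for r in product((0, 1), repeat=len(free)):
            x = [0] * n
            for i, v in zip(bound, b):
                x[i] = v
            for i, v in zip(free, r):
                x[i] = v
            column.append(f(*x))
        cofactors.add(tuple(column))
    # One new variable h can distinguish at most two different cofactors.
    return len(cofactors) <= 2
```

For example, (a XOR b) AND c passes the check with bound set {a, b}, since a
single function h = a XOR b suffices, whereas the three-input majority function
fails it for the same bound set because it has three distinct cofactors.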
Moderators: W. Nebel, University of Oldenburg and OFFIS, Germany,
J. Benkoski, Synopsys, France
-
Power Estimation of Behavioral Descriptions [p 762]
- F. Ferrandi, F. Fummi, E. Macii, M. Poncino, and D. Sciuto
This paper presents a methodology for power estimation of designs
described at the behavioral-level as the interconnection of
functional modules. The input/output behavior of each module
is implicitly stored using BDDs, and the power consumed by
the network is estimated using a novel and accurate entropy-based
approach. As a demonstration example, we have used the
proposed power estimation technique to evaluate and compare
the effects of some architectural transformations applied to a
reference design specification on the power dissipation of the
corresponding implementations.
-
Characterization-Free Behavioral Power Modeling [p 767]
- A. Bogliolo, L. Benini, and G. De Micheli
We propose a new approach to RT-level power modeling
for combinational macros, that does not require simulation-based
characterization. A pattern-dependent power model for a macro
is analytically constructed using only structural information
about its gate-level implementation. The approach has three main
advantages over traditional techniques: i) it provides models
whose accuracy does not depend on input statistics, ii) it offers a
wide range of trade-off between accuracy and complexity, and iii)
it enables the construction of pattern-dependent conservative upper
bounds.
-
Trace-Driven Steady-State Probability Estimation in FSMs with
Application to Power Estimation [p 774]
- D. Marculescu, R. Marculescu, and M. Pedram
This paper illustrates, analytically and
quantitatively, the effect of high-order temporal correlations on
steady-state and transition probabilities in finite state machines
(FSMs). As the main theoretical contribution, we extend the
previous work done on steady-state probability calculation in
FSMs to account for complex spatiotemporal correlations
which are present at the primary inputs when the target
machine models real hardware and receives data from real
applications. More precisely: 1) using the concept of
constrained reachability analysis, the correct set of Chapman-Kolmogorov
equations is constructed; and 2) based on
stochastic complementation and iterative aggregation/
disaggregation techniques, exact and approximate methods for
finding the state occupancy probabilities in the target machine
are presented. From a practical point of view, we show that
assuming temporal independence or even using first-order
temporal models is not sufficient due to the inaccuracies
induced in steady-state and transition probability calculations.
Experimental results show that, if the order of the source is
underestimated, not only the set of reachable states is incorrectly
determined, but also the steady-state probability values can be
more than 100% off from the correct ones. This strongly
impacts the accuracy of the total power estimates that can be
obtained via probabilistic approaches.
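For orientation only, the sketch below computes steady-state probabilities for
a small Markov chain by power iteration, i.e. by repeatedly applying the
Chapman-Kolmogorov update pi <- pi * P. It assumes a first-order (memoryless)
model, which is exactly the simplification the abstract argues is insufficient
for temporally correlated inputs, so it shows only the baseline computation and
not the authors' method. The function name and the example matrix are
hypothetical.

```python
def steady_state(P, tol=1e-12, max_iter=100000):
    """Return the stationary distribution of a row-stochastic matrix P.

    P[i][j] is the probability of moving from state i to state j.
    The chain is assumed to be irreducible and aperiodic so that the
    iteration converges to a unique distribution.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One Chapman-Kolmogorov step: nxt[j] = sum_i pi[i] * P[i][j]
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")
```

For the two-state chain P = [[0.9, 0.1], [0.5, 0.5]] the balance equation
0.1 * pi_0 = 0.5 * pi_1 gives pi = (5/6, 1/6), which the iteration reproduces.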
Moderators: L. Claesen, IMEC, Belgium, C. Delgado Kloos,
ETSI Telecommunicacion, Spain
-
Efficient Verification Using Generalized Partial Order Analysis [p 782]
- S. Vercauteren, D. Verkest, G. de Jong, and B. Lin
This paper presents a new formal method for the efficient
verification of concurrent systems that are modeled using a
safe Petri net representation. Our method generalizes upon
partial-order methods to explore concurrently enabled conflicting
paths simultaneously. We show that our method can
achieve an exponential reduction in algorithmic complexity
without resorting to an implicit enumeration approach.
-
Efficient Encoding Schemes for Symbolic Analysis of Petri Nets [p 790]
- E. Pastor and J. Cortadella
Petri nets are a graph-based formalism appropriate to model
concurrent systems such as asynchronous circuits or network protocols.
Symbolic techniques based on Binary Decision Diagrams
(BDDs) have emerged as one of the strategies to overcome the
state explosion problem in the analysis of systems modeled by Petri
nets. The existing techniques for state encoding use a variable-per-place
strategy that leads to encoding schemes with very low
density. This drawback has been partially mitigated by using
Zero-Suppressed BDDs, that provide a typical reduction of BDD
sizes by a factor of two.
This work presents novel encoding schemes for Petri nets. By
using algebraic techniques to analyze the topology of the net,
sets of places 'structurally related' can be derived and encoded
by using only a logarithmic number of Boolean variables. Such
an approach drastically decreases the number of variables
for state encoding and reduces memory and CPU requirements
significantly.
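The saving from a logarithmic encoding can be sketched as follows. The example
assumes a group of places in which exactly one place holds a token at any time
(for instance a state-machine component of a safe net); whether this matches
the paper's notion of 'structurally related' places is an assumption, and the
function name is hypothetical.

```python
def encode_group(places):
    """Assign a dense binary code to each place of a mutually exclusive group.

    A variable-per-place scheme needs len(places) Boolean variables;
    this scheme needs only ceil(log2(len(places))) of them.
    """
    # (k - 1).bit_length() equals ceil(log2(k)) for k >= 2.
    width = max(1, (len(places) - 1).bit_length())
    codes = {p: format(i, "0{}b".format(width)) for i, p in enumerate(places)}
    return width, codes
```

A group of five such places is encoded with three variables instead of five,
and the gap widens as the groups grow.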
-
Propagation of Last-Transition-Time Constraints in Gate-Level
Timing Analysis [p 796]
- M. Kassab, E. Cerny, S. Aourid, and T. Krodel
Waveform narrowing is an attractive framework for
circuit delay verification as it can handle different delay
models and component delay correlation efficiently. The
method can give false negative results because it relies on
local consistency techniques. We present two methods to
reduce this pessimism: 1) global timing implications and
necessary assignments, and 2) a case analysis procedure
that finds a test vector that violates the timing check or
proves that no violation is possible. Under floating-mode,
global implications eliminate timing check violation without
case analysis in the c1908 benchmark, while for a tighter
requirement case analysis finds a test vector after only 5
backtracks.
-
Combinational Verification Based on High-Level Functional Specifications
[p 803]
- E.I. Goldberg, Y. Kukimoto, and R.K. Brayton
We present a new combinational verification technique
where the functional specification of a circuit under verification
is utilized to simplify the verification task. The main
idea is to assign to each primary input a general function,
called a coordinate function, instead of a single variable
function as in most BDD-based techniques. BDDs of intermediate
nodes are then constructed based on these coordinate
functions in a topological order from primary inputs to
primary outputs. Coordinate functions depend on primary
input variables and extra variables. Therefore combinational
verification is performed not over the set of primary
input variables but over the extended set of variables. Coordinate
functions are chosen in such a way that in the process
of computing intermediate functions the dependency on the
primary input variables is gradually replaced with that on
the extra variables, thereby making boolean functions associated
with primary outputs simple functions only in terms
of the extra variables. We show that such a smart choice of
coordinate functions is possible with the help of the high-level
functional specification of the circuit.
Moderators: A. Richardson, University of Lancaster, UK,
M. Sachdev, Philips Research Laboratories, The Netherlands
-
Switch-Level Fault Coverage Analysis for Switched-Capacitor Systems [p 810]
- S. Mir, A. Rueda, D. Vázquez, and J.L. Huertas
An approach to test optimization in switched-capacitor systems
based on fault simulation at switch-level is presented in this paper.
The advantage of fault simulation at this granularity level is that it
facilitates test integration as early as possible in the design of these
systems. Due to their mixed-signal nature, both catastrophic and
parametric faults must indeed be considered for test optimization.
Adequate switch-level fault models are presented. Test stimuli
and test measures can be selected as a function of fault coverage.
The impact of design parameters such as switch resistance on fault
coverage is studied and design parts of poor testability are located.
-
Optimized Implementations of the Multi-Configuration DFT Technique
for Analog Circuits [p 815]
- M. Renovell, F. Azaïs, and Y. Bertrand
The paper describes an approach to optimize
the application of the multi-configuration DFT technique
for analog circuits. This technique makes it possible to emulate the
circuit in a number of new test configurations targeting the
maximum fault coverage. The brute force application of
the multi-configuration is shown to produce a very
significant improvement of the original poor testability. An
optimized approach is proposed to apply this DFT
technique in a more refined way. The optimization problem
consists in choosing among the various permitted test
configurations, a set that leads to the best testability/cost
trade-off. This set is selected according to ordered
requirements: (i) the fundamental requirement of
maintaining the maximum fault coverage and (ii)
non-fundamental requirements of satisfying some
user-defined cost functions such as test time, silicon
overhead or performance degradation. Results are given
that exhibit very interesting features in terms of either test
procedure simplicity or DFT penalty reduction.
-
Analog Test Design with IDD Measurements for the Detection of
Parametric and Catastrophic Faults [p 822]
- W.M. Lindermeir, T.J. Vogels, and H.E. Graeb
Earlier approaches dealt with the detection of
catastrophic faults based on IDD monitoring. Consideration
of the more subtle parametric faults and
the ADC quantization noise, however, is essential for
high-quality analog testing. The paper presents a new
design method for analog test of parametric and catastrophic
faults by IDD monitoring. ADC quantization noise is systematically
considered throughout the method. Results prove its effectiveness.
Moderators: L. Stok, IBM T.J. Watson Research Center, USA
A. ten Berg, Philips Research Laboratories, The Netherlands
-
A New Paradigm for Dichotomy-Based Constrained Encoding [p 830]
- O. Coudert
One essential step in sequential logic synthesis consists of
finding a state encoding that meets some requirements, such as
optimal implementation, or correctness in the case of asynchronous
FSMs. Dichotomy-based constrained encoding is more general than other
constrained encoding frameworks, but it is also more difficult to
solve. This paper introduces a new formalization of this problem,
which leads to original exact and heuristic algorithms. Experimental
results show that the resulting exact solver outperforms the previous
approaches.
-
A Dynamic Model for the State Assignment Problem [p 835]
- M. Martínez, M.J. Avedillo, J.M. Quintana, and J.L. Huertas
Traditionally, state assignment algorithms follow the
two-step strategy of first constraint generation and secondly
constraint-guided encoding. There are well known
drawbacks in both currently used models for constraint
generation. Approaches following the input model generate
face constraints without taking into account the sharing of
logic among next state lines. Approaches following the
input-output model generate face constraints for an a priori
determined set of dominance/disjunctive relations among
the codes of the states which may not hold in the final encoding.
To overcome these limitations, we propose a dynamic input
model which implements both above cited steps
concurrently. The dynamic constraints are of the face type
but they are generated during the encoding process and so
take advantage of actual relations among partial codes. A
general algorithm based on this model and which can target
two-level as well as multiple-level implementations is
described. Results obtained with the algorithm on the
IWLS'93 machines are shown and they compare favorably
with standard tools.
-
Efficient Minarea Retiming of Large Level-Clocked Circuits [p 840]
- N. Maheshwari and S.S. Sapatnekar
Delay-constrained area optimization is an important
step in synthesis of VLSI circuits. Minimum
area (minarea) retiming is a powerful technique to
solve this problem. The minarea retiming problem
has been formulated as a linear program; in this work
we present techniques for reducing the size of this linear
program and efficient techniques for generating it.
This results in an efficient minarea retiming method
for large level-clocked circuits (with tens of thousands
of gates).
Moderators: M. Pedram, University of Southern California, USA,
M. Poncino, Politecnico di Torino, Italy
-
IMPACT: A High-Level Synthesis System for Low Power Control-Flow
Intensive Circuits [p 848]
- K.S. Khouri, G. Lakshminarayana, and N.K. Jha
In this paper, we present a comprehensive high-level synthesis
system that is geared towards reducing power consumption
in control-flow intensive circuits. An iterative improvement algorithm
is at the heart of the system. The algorithm searches
the design space by handling scheduling, module selection, resource
sharing and multiplexer network restructuring simultaneously.
The scheduler performs concurrent loop optimization and
implicit loop unrolling. It minimizes the expected number of cycles
of the schedule without compromising on the minimum and
maximum schedule lengths. A fast simulation technique based
on trace manipulation aids power estimation in driving synthesis
in the right direction. Experimental results demonstrate power
reduction of up to 85% with minimal overhead in area over area-optimized
designs operating at 5V.
-
Instruction Scheduling for Power Reduction in Processor-Based System Design [p 855]
- H. Tomiyama, T. Ishihara, A. Inoue, and H. Yasuura
This paper proposes an instruction
scheduling technique to reduce power consumed
for off-chip driving. The technique minimizes the
switching activity of a data bus between an on-chip
cache and a main memory when instruction cache
misses occur. The scheduling problem is formulated and
a scheduling algorithm is also presented.
Experimental results demonstrate the effectiveness
and the efficiency of the proposed algorithm.
-
Address Bus Encoding Techniques for System-Level Power Optimization [p 861]
- L. Benini, G. De Micheli, E. Macii, D. Sciuto, and C. Silvano
The power dissipated by system-level buses is the largest contribution
to the global power of complex VLSI circuits. Therefore,
the minimization of the switching activity at the I/O interfaces
can provide significant savings on the overall power budget.
This paper presents innovative encoding techniques suitable for
minimizing the switching activity of system-level address buses.
In particular, the schemes illustrated here target the reduction
of the average number of bus line transitions per clock cycle.
Experimental results, conducted on address streams generated
by a real microprocessor, have demonstrated the effectiveness
of the proposed methods.
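The cost metric named in the abstract above, the average number of bus line
transitions per clock cycle, can be computed as in the sketch below. Gray
coding is used here only as a well-known illustrative encoding for sequential
address streams; the abstract does not say which codes the paper proposes, and
the function names are hypothetical.

```python
def average_transitions(words):
    """Average number of bit flips between consecutive bus words."""
    flips = sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))
    return flips / (len(words) - 1)


def gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


# A purely sequential address stream, as produced by straight-line code.
addresses = list(range(16))
binary_cost = average_transitions(addresses)
gray_cost = average_transitions([gray(a) for a in addresses])
```

On this 16-address stream the plain binary encoding averages 26/15 transitions
per cycle, while the Gray-coded stream toggles exactly one line per cycle.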
Moderators: M. Kovac, University of Zagreb, Croatia,
W. Glauert, University of Erlangen-Nurnberg, Germany
-
A Scalable Architecture for Multi-Threaded JAVA Applications [p 868]
- M. Mrva, K. Buchenrieder, and R. Kress
The paper presents a scalable architecture for multi-threaded
Java applications. Threads enable modeling of
concurrent behavior in a more or less natural way. Thus
threads give a migration path to multi-processor
machines. The proposed architecture consists of multiple
application-specific processing elements, each able to
execute a single thread at one time. The architecture is
evaluated by implementing a portable and scalable Java
machine onto an FPGA board for demonstration.
-
Hardware/Software Co-Design of a Fuzzy RISC Processor [p 875]
- V. Salapura and M. Gschwind
In this paper, we show how hardware/software co-evaluation
can be applied to instruction set definition. As
a case study, we show the definition and evaluation of instruction
set extensions for fuzzy processing. These instructions
are based on the use of subword parallelism to
fully exploit the processor's resources by processing multiple
data streams in parallel. The proposed instructions
are evaluated in software and hardware to gain a balanced
view of the costs and benefits of each instruction. We have
found that a simple instruction optimized to perform fuzzy
rule evaluation offers the most benefit to improve fuzzy processing
performance.
The instruction set extensions are added to a RISC processor
core based on the MIPS instruction set architecture.
The core has been described in VHDL so that hardware
implementations can be generated using logic synthesis.
-
Innovative System-Level Design Environment Based on FORM for
Transport Processing System [p 883]
- K. Higuchi and K. Shirakawa
This paper presents a system-level design environment for data transport
processing systems. In this environment, designers can easily verify
system behavior by formally defining data structures and their related
actions, without considering detailed timing. In addition, the verified
specification can be translated into synthesizable RTL descriptions
by a dedicated RTL generator. Thus, using lower-level EDA tools, actual
hardware can be obtained directly from a system-level specification.
Moderators: J.L. Huertas, Centro Nacional de Microelectronica, Spain,
J. Pikkarainen, Nokia Mobile Phones, Finland
-
Efficient Techniques for Accurate Modeling and Simulation of Substrate
Coupling in Mixed-Signal IC's [p 892]
- J.P. Costa, M. Chou, and L.M. Silveira
Industry trends aimed at integrating higher levels of circuit functionality
have triggered a proliferation of mixed analog-digital
systems. Magnified noise coupling through the common chip substrate
has made the design and verification of such systems an
increasingly difficult task. In this paper we present a fast eigen-decomposition
technique that accelerates operator application in
BEM methods and avoids the dense-matrix storage while taking
all of the substrate boundary effects into account explicitly.
This technique can be used for accurate and efficient modeling of
substrate coupling effects in mixed-signal integrated circuits.
-
Efficient DC Fault Simulation of Nonlinear Analog Circuits [p 899]
- M.W. Tian and C.-J.R. Shi
This paper describes a method to improve the efficiency
of nonlinear DC fault simulation. The method uses the
Newton-Raphson algorithm to simulate each faulty circuit.
The key idea is to order the given list of faults in
such a way that the solution of the previous faulty circuit
can serve as a good initial point for the simulation
of the next faulty circuit. To build a good ordering,
one Newton-Raphson iteration step is performed for all the
faulty circuits once, and the results are used to quantify
how close the faulty circuits and the good circuit are
in their behaviors. With one-step Newton-Raphson iteration implementation
by Householder's formula, the proposed method has virtually
no overhead. Experimental results on a set of 36 MCNC
benchmark circuits show an average speedup of 4.4 and as
high as 15 over traditional stand-alone fault simulation.
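The warm-start idea described above can be sketched on a toy circuit: a voltage
source driving a diode through a resistor, with each fault modeled as a changed
resistor value. This is a simplified illustration under stated assumptions and
not the authors' implementation: the circuit, the component values, the
distance measure used for ordering, and all function names are hypothetical,
and Householder's formula is not used.

```python
import math

IS, VT, VSRC = 1e-14, 0.02585, 5.0  # illustrative diode and source values


def residual(v, r):
    """KCL at the diode node: diode current minus resistor current."""
    return IS * (math.exp(v / VT) - 1.0) - (VSRC - v) / r


def derivative(v, r):
    """Derivative of the residual with respect to the node voltage v."""
    return IS / VT * math.exp(v / VT) + 1.0 / r


def newton(r, v0, tol=1e-12, max_iter=1000):
    """Newton-Raphson solve; returns (solution, iteration count)."""
    v = v0
    for k in range(1, max_iter + 1):
        step = residual(v, r) / derivative(v, r)
        v -= step
        if abs(step) < tol:
            return v, k
    raise RuntimeError("no convergence")


def simulate_faults(good_r, fault_rs):
    """Simulate all faults, reusing each solution as the next initial point."""
    v_good, _ = newton(good_r, 0.0)

    def one_step_distance(r):
        # Size of a single Newton step taken from the good-circuit solution:
        # a cheap estimate of how far the fault moves the operating point.
        return abs(residual(v_good, r) / derivative(v_good, r))

    results, start = {}, v_good
    for r in sorted(fault_rs, key=one_step_distance):
        v, iters = newton(r, start)
        results[r] = (v, iters)
        start = v  # warm start for the next fault in the ordering
    return results
```

Calling simulate_faults(1000.0, [100.0, 1e6, 1.0, 900.0]) returns, for each
faulty resistor value, the solved node voltage and the number of iterations
used, which can be compared against cold starts from 0 V.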
-
An Approach to Realistic Fault Prediction and Layout Design for
Testability in Analog Circuits [p 905]
- J.A. Prieto, A. Rueda, I. Grout, E. Peralías,
J.L. Huertas, and A.M.D. Richardson
This paper presents an approach towards realistic
fault prediction in analog circuits. It exploits the Inductive
Fault Analysis (IFA) methodology to generate explicit
models able to give the probability of occurrence of faults
associated with devices in an analog cell. This information
intends to facilitate the integration of design and test
phases in the development of an IC since it provides a
realistic fault list for simulation before going to the final
layout, and also makes possible layout optimization
towards what we can call layout level design for testability.
Poster Session:
-
Synthesis of Communicating Controllers for Concurrent
Hardware/Software Systems [p 912]
- R. Niemann and P. Marwedel
Two main aspects in hardware/software co-design
are hardware/software partitioning and co-synthesis.
Most co-design approaches work only on one of these
problems. In this paper, an approach coupling hardware/software
partitioning and co-synthesis is presented that works
fully automatically. The techniques have been integrated in the
co-design tool Cool, supporting the complete design flow from system
specification to board-level implementation for multi-processor
and multi-ASIC target architectures for dataflow dominated
applications.
-
A Knowledge-Based System for Hardware-Software Partitioning [p 914]
- M.L. López, C.A. Iglesias, and J.C. López
This paper presents SHAPES, a tool for hardware-software partitioning.
It is based on two main paradigms: the implementation of the partitioning
tool by means of an expert system, and the use of fuzzy
logic to model the parameters involved in the process.
-
A Formal Description of VHDL-AMS Analogue Systems [p 916]
- T. Kazmierski
A formal definition of the general VHDL-AMS
analogue system has been proposed to relate
the way in which the language affects the
specification of a non-linear discontinuous analogue
system. It has been suggested to model the break set
as a separate system in order to facilitate the
interaction between the analogue equation set and
the digital abstract machine. The significance of the
proposed model is that it may be used in semantic
validation of VHDL-AMS description and may also
facilitate mixed-signal equation formulation for an
underlying VHDL-AMS simulator.
-
Scanning Datapaths: A Fast and Effective Partial Scan Selection Technique [p 921]
- M.L. Flottes, R. Pires, B. Rouzeyre, and L. Volpe
Partial scan DFT is a commonly used technique for
improving testability of sequential circuits while
maintaining overhead as low as possible. In this context, the
selection of the partial scan chain [1] is usually performed
at gate-level (e.g. [2],[3]). In this paper, we present a method
for quickly selecting the partial Scan Chain (SC) in
datapath-like circuits. The so-obtained SC is such that the
number of scan FFs is optimized and such that the
achievable fault coverage is the same as with a full scan
approach.
-
Universal Strong Encryption FPGA Core Implementation [p 923]
- D. Runje and M. Kovac
IDEA is a symmetric block cipher with a 128-bit key proposed
to replace DES where a strong encryption is required. Many
applications need speed of a hardware encryption implementation
while trying to preserve flexibility and low cost of a software
implementation. In this paper we present one solution
to this problem.
Our system architecture uses a single core module named Round
to implement the IDEA algorithm. Using the core we were able to
implement and test an example application in only three days.
This "off the shelf" solution for designing cryptographic applications
using the IDEA algorithm significantly reduced the design cycle, thus
greatly reducing time-to-market and cost of such designs. By
increasing the number of round modules, the system designer
can linearly increase the speed of the design. This system design
methodology makes it possible to achieve necessary performance,
or to preserve area (and reduce costs) when needed unlike other
known approaches.
We have implemented a one-round UNICORN architecture in a Xilinx FPGA.
After implementation the chip has been tested using the standard
test vectors and it was capable of performing 2.8Mbps encryption
in both ECB and CBC mode.
-
Data Cache Sizing for Embedded Processor Applications [p 925]
- P.R. Panda, N.D. Dutt, and A. Nicolau
We present a technique for determining the best data
cache size required for a given memory-intensive application.
A careful memory and cache line assignment strategy
based on the analysis of the array access patterns effects
a significant reduction in the required data cache size,
with no negative impact on the performance, thereby freeing
vital on-chip silicon area for other hardware resources.
Experiments on several benchmark kernels performed on
LSI Logic's CW4001 embedded processor simulator confirm
the soundness of our cache sizing and memory assignment
strategy and the accuracy of our analytical predictions.
-
A Programmable Multi-Language Generator for CoDesign [p 927]
- J.P. Calvez, D. Heller, F. Muller, O. Pasquier
This paper presents an innovative technique to efficiently
develop hardware and software code generators. The
specification model is first converted into its equivalent data
structure. Target programs result from a set of
transformation rules applied to the data structure. These
rules are written in a textual form named Script. Moreover,
transformations for a specific code generator are easier to
describe because our solution uses a template of the required
output as another input. The result is a meta-generator
entirely written in Java. The concept and its implementation
have been demonstrated by developing a C/VxWorks code
generator, a behavioral VHDL generator, and a synthesizable
VHDL generator.
-
Register-Constrained Address Computation in DSP Programs [p 929]
- A. Basu, R. Leupers, and P. Marwedel
This paper describes a new code optimization technique for
digital signal processors (DSPs). One important characteristic
of DSP algorithms is iterative access to data array elements
within loops. DSPs support efficient address computations for
such array accesses by means of dedicated address generation
units (AGUs). We present a heuristic technique which, given an AGU
with a fixed number of address registers, minimizes the number
of instructions needed for array address computations in a program loop.
-
Graphical Entry of FSMDs Revisited: Putting Graphical Models
on a Solid Base [p 931]
- T. Müller-Wipperfürth and R. Hagelauer
This paper discusses issues of graphical modelling of
Finite State Machines with Datapath (FSMDs). Tools supporting
the graphical entry of state based systems are usable
by intuition, but need to be based on an exact definition
of semantics of graphical elements. This paper proposes
to define semantics of graphical models based on the
hardware description language VHDL.
-
AGENDA: An Attribute Grammar Driven Environment for the Design
Automation of Digital Systems [p 933]
- G. Economakos, G. Papakonstantinou, and P. Tsanakas
Attribute grammars have been used extensively in every
phase of traditional compiler construction. Recently, it
has been shown that they can also be effectively adopted
to handle scheduling algorithms in high-level synthesis.
Their main advantages are modularity and declarative
notation in the development of design automation
environments. In this paper, past results are further
elaborated and more scheduling techniques are
presented and implemented in a flexible environment for
the design automation of digital systems. This novel
approach can prove valuable for fast evaluation of
new algorithms and techniques in the field.
-
Static Analysis Tools for Soft-Core Reviews and Audits [p 935]
- S. Olcoz, A. Castellví, M. García, and J.-A. Gómez
Three navigation tools are presented to statically and
interactively analyze Soft-Cores described in VHDL, [1].
These tools ease the adoption of mechanisms to perform
review and audit procedures similar to those adopted in
software development, [2]. These navigation tools help to
better understand and reuse VHDL Soft-Cores. The three
navigators are integrated in a VHDL-ICE environment,
[3], to get design data management support.
-
A VHDL SGRAM Model for the Validation Environment of a High Performance
Graphic Processor [p 937]
- M.G. Wahl and H. Völkel
To validate the functionality of a new highly complex
graphics processor described in VHDL the working
environment of the processors has to be modelled. In
some cases appropriate models for the external components are
commercially available, in other cases these models have to be
created. In this paper a general memory model for SGRAMs is
presented which had to be implemented to have a flexible simulation
environment for a high speed graphics processor at hand. Key features
are the generality, the support of SGRAM arrays of various shapes
and functions supporting the simulation process. This functionality
goes far beyond the capabilities of currently commercially available
SGRAM models.
-
A Comparing Study of Technology Mapping for FPGA [p 939]
- H.-G. Martin and W. Rosenstiel
This paper investigates some design flows to obtain final
designs on Xilinx XC4000 FPGAs. The examples generated
by high level synthesis were mapped including placement
and routing. This reveals that the common criteria of area-optimal
or delay-optimal circuits should be extended by
routability and computing time.
-
Fuzzy-Logic Digital-Analogue Interfaces for Accurate Mixed-Signal
Simulation [p 941]
- T.J. Kazmierski
A new approach to mixed-signal circuit
interfacing based on fuzzy logic models is presented.
Due to their continuous rather than discrete
character, fuzzy logic models offer a significant
improvement compared with the classical D-A
interface models. Fuzzy logic D-A interfaces can
represent the boundary between the digital and
analogue worlds accurately without a significant loss
of computational efficiency. The potential of mixed-signal
interfacing based on fuzzy logic is
demonstrated by an example of spike propagation
from the digital to analogue world. A model of
inertial propagation delay and non-linear DC gain
suitable for fuzzy logic gates is also suggested.
-
Optimized Timed Hardware Software Cosimulation without Roll-Back [p 945]
- W. Sung and S. Ha
An optimized hardware software cosimulation method based on
the backplane approach is presented in this paper. To
enhance the performance of cosimulation, efforts are focused
on reducing control packets between simulators as well as concurrent
execution of simulators without roll-back.
-
A Cell and Macrocell Compiler for GaAs VLSI Full-Custom Design [p 947]
- J.A. Montiel-Nelson, V. de Armas, R. Sarmiento, and A. Núnez
A Gallium Arsenide automated layout generation
system (OLYMPO) for SSI, MSI and LSI circuits used
in GaAs VLSI design has been developed. We introduce a
full-custom layout style, called RN-based cell
model, that is suited to generating low self-inductance
circuit layouts of cells and macrocells. The cell compiler
can be used as a cell library builder and it is embedded
in a random logic macrocell and an iterative logic array
generator. Experimental results demonstrate that OLYMPO
generates complex and compact layouts
and the synthesis process can be interactively used at
the system design level.
-
Architectural Rule Checking for High-Level Synthesis [p 949]
- J. Gong, C.-T. Chen, and K. Kücükcakar
Verifying an implementation produced from high-level
synthesis is a challenging problem due to many complex
design tasks involved in the design process. In this paper, we
present an architectural rule checking approach for high-level
design verification. This technique detects and locates
various design errors and verifies both the consistency and
correctness of an implementation. Besides describing different
rule suites, we also report a working environment for the
architectural rule checking. Finally, we highlight the value of
the proposed approach with a real-life design.
-
A Unified Technique for PCB/MCM Design by Combining Electromagnetic
Field Analysis with Circuit Simulator [p 951]
- H. Kimura and N. Iyenaga
This paper proposes a unified design technique which
combines electromagnetic field analysis [FDTD technique]
with a circuit simulator [HSPICE]. The proposed technique can
analyze integrated circuit [IC], multi-chip-module
[MCM], and printed circuit board [PCB] designs with high efficiency
and high accuracy, including the rounding noise
throughout the substrate. Furthermore, this technique can
analyze not only small-signal operation but also large-signal
operation.
-
Core Interconnect Testing Hazards [p 953]
- P. Nordholz, H. Grabinski, D. Treytnar, J. Otterstedt,
D. Niggemeyer, U. Arz, and T.W. Williams
The SIA Roadmap [1] predicts a very aggressive path of
technologies from 0.35 um technology design to 0.10 um
technology design. Increasing frequencies together with
decreasing geometries lead to a number of issues which need
to be examined. Testing is clearly one main issue. Another
area of concern is that of signal integrity of the interconnects.
The interconnects must not only be analyzed with regard to
opens and shorts but also with regard to the signal delays.
Up to now, opens and shorts in bus systems on boards have
been tested using boundary scan, mostly neglecting delay test.
In addition, it has to be considered that the signal delay
(i.e. the time when the signal crosses the switching threshold
of the following gate) on a certain line within a bus system
depends on the set of input signals of all bus lines. Furthermore,
hazards can occur due to coupling between bus lines which can
lead to an incorrect function of the whole circuit.
-
Quality Estimation of Test Vectors and Functional Validation Procedures
Based on Fault and Error Models [p 955]
- T. Riesgo, Y. Torroja, E. de la Torre, and J. Uceda
This paper presents a method to estimate the
quality of a set of test vectors and the validation
procedures from pre-synthesised descriptions in VHDL.
The method is based on the definition of fault models, for
test features evaluation, and error models, for quality
validation estimation.
-
Fault Analysis in Networks with Concurrent Error Detection Properties [p 957]
- C. Bolchini, F. Salice, and D. Sciuto
The design of Self-Checking circuits through output encoding
finds a bottleneck in the realization of the network so
that each fault produces only errors detectable by the
adopted code. An analysis of an expected TSC network is
proposed, based on the application of the weighted observability
approach. The aim is the verification of the SC property
of the encoded circuit (TSC fault simulation) and identification
of critical areas for a consequent manipulation to
achieve a complete fault coverage.
-
IOCIMU -- An Integrated Off-Chip IDDQ Measurement Unit [p 959]
- M. Svajda, B. Straka, and H. Manhaeve
The implementation of an Off-Chip IDDQ monitor to support the test
of complex ASICs is presented in this paper. The monitor can be incorporated
into standard automated test equipment (ATE). It is capable of driving a
2 uF capacitive load and can perform measurements of the IDDQ of
a device under test (DUT) in the 0-1 mA range. According to measurements, the
monitor can operate at test rates up to 30 kHz and offers a resolution
better than 0.1 uA. The on-chip integrated bypass switch is capable of handling
DUT transient currents up to several amps. The IOCIMU prototype was fabricated
in the 2-um Mietec BiCMOS technology and has an active chip area of 20
mm^2.
-
Automatic Topology Optimization for Analog Module Generators [p 961]
- M. Wolf and U. Kleine
In this paper a new topology optimization feature of a
module generator environment [5-6] will be presented.
The optimization is performed by removing redundant
elements of objects already placed and by assessing
different layout topologies of a module. This drastically
reduces the length of the generator source code, because
different topologies need no separate source code, but
result automatically.
-
Asynchronous Scheduling and Allocation [p 963]
- A. Prihozhy
This paper presents an approach to generating asynchronous
schedules of various concurrency levels and describes novel
net-based scheduling and allocation optimization techniques
for asynchronous high-level synthesis. The asynchronous schedules
are optimized through the sets of concurrent variable and statement
pairs. Experimental results and a comparison of the net-based
techniques with the best sequential scheduling and allocation
techniques are presented.
-
Path Verification Using Boolean Satisfiability [p 965]
- M. Ringe, T. Lindenkreuz, and E. Barke
The importance of identifying false paths in a combinational
circuit cannot be overstated since they may mask
the true delay. We present a fast algorithm based on
Boolean satisfiability for solving this problem. We also
present extensions to this per-path approach to find the
critical path of a circuit in a reasonable time.
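As a rough illustration of what a false path is (not the authors' algorithm), the sketch below tests static sensitization of a single path by enumerating all input assignments, which stands in for a call to a real SAT solver. The three-gate circuit, the gate names, and the helper `is_false_path` are hypothetical and chosen only for the example.

```python
from itertools import product

# Hypothetical example circuit: gate name -> (type, list of fan-in names).
CIRCUIT = {
    "g1": ("AND", ["a", "s"]),
    "g2": ("NOT", ["s"]),
    "g3": ("AND", ["g1", "g2"]),
}
INPUTS = ["a", "s"]

# Value a side input must hold so that it does not block the path.
NON_CONTROLLING = {"AND": 1, "OR": 0}


def evaluate(assignment):
    """Evaluate every gate for a given primary-input assignment."""
    values = dict(assignment)

    def val(node):
        if node not in values:
            kind, fanin = CIRCUIT[node]
            ins = [val(x) for x in fanin]
            if kind == "AND":
                values[node] = int(all(ins))
            elif kind == "OR":
                values[node] = int(any(ins))
            else:  # NOT
                values[node] = 1 - ins[0]
        return values[node]

    for gate in CIRCUIT:
        val(gate)
    return values


def is_false_path(path):
    """Return True if no input assignment statically sensitizes the path.

    The exhaustive loop plays the role of the satisfiability check:
    the path is false exactly when the side-input constraints are
    unsatisfiable.
    """
    for bits in product([0, 1], repeat=len(INPUTS)):
        values = evaluate(dict(zip(INPUTS, bits)))
        sensitized = True
        for prev, gate in zip(path, path[1:]):
            kind, fanin = CIRCUIT[gate]
            for side in fanin:
                if side != prev and values[side] != NON_CONTROLLING[kind]:
                    sensitized = False
        if sensitized:
            return False
    return True
```

In this toy circuit the path a, g1, g3 needs s = 1 at g1 and s = 0 at g3, so it cannot be sensitized, whereas the path s, g2, g3 can. Exhaustive enumeration only scales to very small circuits; a practical tool would encode the same constraints as a CNF formula.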
-
PowerShake: A Low Power Driven Clustering and Factoring Methodology
for Boolean Expressions [p 967]
- S. Roy, H. Arts, and P. Banerjee
This paper describes algebraic techniques that target low
power consumption. A unique power cost function based on the decomposed
factored form representation of a Boolean expression
is introduced to guide the structural transformations. Circuits
synthesized by SIS [5] and POSE [1] consume 54.5% and
10.4% more power, respectively, than those obtained by our tool.
-
Power and Timing Modeling for ASIC Designs [p 969]
- W. Roethig, A.M. Zarkesh, and M. Andrews
This paper presents a unified power and timing
modeling for ASIC libraries. This ASIC library is being
standardized and targeted for a design flow, where
timing analysis is complemented by power analysis.
We show benchmark results from new industrial gate-level
power analysis tools.
-
Constraints Space Management for the Layout of Analog IC's [p 971]
- B.G. Arsintescu and R.H.J.M. Otten
An automated technique to narrow down the number of
parameters for linear constraint transformation
models of analog circuits is described. The sets
of more important circuit parameters and specifications are
confined in an efficient constraint transformation model. The
method is based on least square approximation and principal
component analysis of the sensitivity matrix of the transformation.
The resulting model encompasses the constraints confined using
designers' expertise for approximated circuit calculations.
-
A Synthesis Procedure for Flexible Logic Functions [p 973]
- I. Pomeranz and S.M. Reddy
In most applications of digital logic circuits, the circuit function
is either specified (0,1) or unspecified (don't-care) for every
input condition. However, there are also applications where any
one of a subset of functions is an acceptable solution, even
though it is not possible to represent all the functions in terms of
output don't-cares. In this case, we say that the function is flexible.
Flexible functions were considered before in [1]. In this
work, we propose a synthesis procedure for flexible functions
based on functional blocks called comparison units [2]. The
main differences between the proposed procedure and the procedures
of [1] are the following. (1) We do not require a closed-form
representation of all the flexibility that exists in specifying
the function f . We only require that a procedure would exist to
check whether a given function belongs to the class of acceptable
functions. (2) We use a specific architecture for the implementation
of flexible functions. This architecture, based on comparison
units [2], is particularly suitable for implementing flexible functions,
since the correspondence between circuit size and certain
properties of the implemented function is strong and easy to utilize
for the minimization of the implementation. The proposed
synthesis procedure starts from an acceptable function f' that
may be used to implement f . It then modifies f' so as to change
certain properties of f' that lead to smaller comparison unit
based implementations. Before any modification of f' is
accepted, a check is made to make sure that the modified function
is an acceptable implementation of f . Modifications are
made as long as it is possible to change the properties of f' that
lead to a reduction in the implementation size.
We also demonstrate that implementations using comparison
units for conventional, non-flexible functions are an effective
intermediate step for synthesis. For this purpose, we apply the
synthesis tool suite SIS from the University of California at
Berkeley in two ways. (1) To a comparison unit implementation
of a function, and (2) directly starting from the truth table of the
function. In most cases, the area of the circuit derived from a
comparison unit based implementation is smaller.
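The modify-and-check loop described above can be sketched generically as follows. This is only a minimal illustration under stated assumptions: the function is represented by its truth table, the number of runs of consecutive 1s is used as a stand-in cost (the actual comparison-unit cost is not given in the abstract), and the names `count_runs`, `acceptable`, and `minimize_flexible` are hypothetical.

```python
def count_runs(table):
    """Stand-in cost (an assumption): number of runs of consecutive 1s."""
    return sum(
        1
        for i, bit in enumerate(table)
        if bit == 1 and (i == 0 or table[i - 1] == 0)
    )


def acceptable(table):
    """Hypothetical membership check for a 3-input flexible function.

    Entry 0 must be 1, entries 3..7 must be 0, and at least one of
    entries 1 and 2 must be 1. This set cannot be written with
    output don't-cares alone.
    """
    fixed_ok = table[0] == 1 and all(bit == 0 for bit in table[3:])
    return fixed_ok and (table[1] == 1 or table[2] == 1)


def minimize_flexible(start, is_acceptable, cost=count_runs):
    """Greedily flip single truth-table entries while the cost drops.

    A modification is kept only if the modified function passes the
    acceptance check, mirroring the check-before-accept step above.
    """
    current = tuple(start)
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            candidate = current[:i] + (1 - current[i],) + current[i + 1:]
            if cost(candidate) < cost(current) and is_acceptable(candidate):
                current = candidate
                improved = True
                break
    return current
```

Starting from the acceptable table (1, 0, 1, 0, 0, 0, 0, 0), the loop merges the two runs of 1s into one. Only the membership check is consulted, so no closed-form description of the flexibility is needed.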
-
Denotational Semantics of a Behavioral Subset of VHDL [p 975]
- F. Nicoli
This paper introduces a denotational semantics of a
behavioral subset of VHDL. This subset is restricted to
basic data types only and does not allow for clauses
in wait statements. We consider the full model of
time and resolution, and we give a precise definition of the
simulation mechanism. Easy translation rules from
VHDL to Boyer-Moore logic can be derived from that
semantics.
-
Correct High-Level Synthesis: A Formal Perspective [p 977]
- J.M. Mendías, R. Hermida, and M. Fernández
This paper presents a formal synthesis system which
delegates the design space exploration to non-formal, and
potentially incorrect, high level synthesis tools. With a
quadratic complexity, our system obtains either a truly
correct-by-construction design, since the formal design
process constitutes itself the verification process, or demonstrates
that the solution found by the conventional tool was
incorrect.
-
A Bypass Scheme for Core-Based System Fault Testing [p 979]
- M. Nourani and C. Papachristou
We present a global design for test methodology for testing a
core-based system in its entirety. This is achieved by introducing a
'bypass' mode for each core by which the data can
be transferred from a core input port to the output port without
interfering with the core circuitry itself. The interconnections
are thoroughly tested since they are used to propagate test data
(patterns or signatures) in the system. The system is modeled
as a directed weighted graph in which the core accessibility is
solved as a shortest path problem.
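The shortest-path formulation can be illustrated with a standard Dijkstra search. This is a minimal sketch, not the authors' tool: the example system, the node names, and the edge weights (imagined here as the cost of moving test data through a core in bypass mode) are all assumptions.

```python
import heapq

# Hypothetical core-based system: node -> {successor: transfer cost}.
SYSTEM = {
    "PI": {"A": 1, "B": 4},
    "A": {"B": 2, "C": 6},
    "B": {"C": 1},
    "C": {"PO": 1},
    "PO": {},
}


def shortest_access_path(graph, source, target):
    """Return (cost, path) of the cheapest route from source to target.

    Returns (None, []) when the target cannot be reached.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return None, []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]
```

For this example, test data reaches core C most cheaply by bypassing cores A and B in turn. Running the same search from a core output to a primary output would give the route for propagating responses or signatures.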
-
Highly Testable and Compact 1-out-of-n Code Checker with Single Output
[p 981]
- C. Metra, M. Favalli, and B. Ricco
This paper presents a novel 1-out-of-n checker that,
compared to the other implementations presented up to now,
features the advantages of: i) satisfying the TSC or
SCD property with respect to all possible internal faults representative
of realistic failures; ii) presenting a single output
line; iii) requiring significantly lower area overhead.
-
Design-for-Testability for Synchronous Sequential Circuits Using
Locally Available Lines [p 983]
- I. Pomeranz and S.M. Reddy
We propose a non-scan design-for-testability (DFT) method to
increase the testability of synchronous sequential circuits. Non-scan
DFT allows at-speed testing, as opposed to scan or partial-scan
based DFT that normally leads to low-speed testing and
longer test application times due to scan operations. The proposed
method is based on the identification of several types of
restrictions imposed by the combinational logic of the circuit on
the values that can be assigned to the next-state variables. These
restrictions limit the set of states the circuit can reach, thus limiting
the set of input patterns that can be applied to its combinational
logic during normal operation. This in turn limits the fault
coverage that can be achieved. The proposed DFT procedure is
different from other non-scan based DFT procedures [1], [2] in
that it relies on lines available locally to drive the inserted DFT
logic, avoiding the routing of primary input lines to the flip-flops,
and the routing of internal lines to the primary outputs.
The proposed scheme uses the complement value of a next
state variable Y_i or the value of an adjacent state variable Y_j in
order to change the value of Y_i, and thus enrich the set of states
that can be reached by the circuit.
The proposed approach considers several special cases
that result in unreachable states (or states that cannot be easily
reached) to determine where the DFT logic will be placed. We
consider cases where a next-state variable always (or almost
always) carries a single value under a random sequence of input
vectors, and cases where two next-state variables carry the same
values, or complemented values. These cases have a drastic
effect on the set of state variable patterns that can be applied to
the combinational logic of the circuit in practical time, thus limiting
its testability.
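The special cases listed above can be found by simulating random input vectors, as in the following minimal sketch. The three-flip-flop machine in `next_state` and the helper `find_restrictions` are hypothetical; the abstract does not specify how the authors detect these cases.

```python
import random


def next_state(state, inputs):
    """Hypothetical machine with 3 state variables and 2 inputs."""
    y0, y1, y2 = state
    a, b = inputs
    n0 = a ^ y1
    n1 = 1 - n0      # always the complement of n0
    n2 = y0 & y1     # stays 0 because y0 and y1 never agree at 1
    return (n0, n1, n2)


def find_restrictions(step, n_vars, n_inputs, n_vectors=200, seed=0):
    """Simulate random vectors and report restrictions on state variables.

    Returns (constant, always_equal, always_complemented):
      constant            maps a variable index to its only observed value
      always_equal        pairs (i, j) that carried equal values throughout
      always_complemented pairs (i, j) that carried opposite values throughout
    """
    rng = random.Random(seed)
    state = (0,) * n_vars
    seen = [set() for _ in range(n_vars)]
    pairs = {(i, j) for i in range(n_vars) for j in range(i + 1, n_vars)}
    always_equal = set(pairs)
    always_compl = set(pairs)
    for _ in range(n_vectors):
        inputs = tuple(rng.randint(0, 1) for _ in range(n_inputs))
        state = step(state, inputs)
        for i, value in enumerate(state):
            seen[i].add(value)
        always_equal = {(i, j) for (i, j) in always_equal if state[i] == state[j]}
        always_compl = {(i, j) for (i, j) in always_compl if state[i] != state[j]}
    constant = {i: next(iter(s)) for i, s in enumerate(seen) if len(s) == 1}
    return constant, always_equal, always_compl
```

In this toy machine, variable 2 is flagged as constant and variables 0 and 1 as a complemented pair, which marks them as candidate locations for inserted DFT logic. Random simulation can only suggest such restrictions; it does not prove them.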
-
CMOS Combinational Circuit Sizing by Stage-Wise Tapering [p 985]
- S. Pullela, R. Panda, A. Dharchoudhury, G. Vijayan, and D. Blaauw
We describe a fast (linear time) procedure to optimally size
transistors in a chain of multi-input gates/stages. The fast sizing is used in a
simultaneous sizing and restructuring optimization procedure to accurately
predict the relative optimal performance of alternative circuit structures for a
given total area. The idea extends the concept of optimally sizing a buffer
chain [5], and uses tapering constants based on the position of a stage in a
circuit, and the position of a transistor in a stack.
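For orientation, the classic buffer-chain sizing that this work extends can be sketched as below. This is the textbook uniform-taper rule, not the stage-wise tapering constants of the paper: every stage is larger than the previous one by the same factor, and the default target factor of e assumes that self-loading is neglected. The function name and parameters are hypothetical.

```python
import math


def taper_buffer_chain(c_in, c_load, target_taper=math.e):
    """Size a buffer chain with a uniform taper factor.

    c_in         input capacitance of the first stage
    c_load       load capacitance driven by the last stage
    target_taper desired size ratio between consecutive stages

    Returns (n_stages, taper, sizes), where sizes are relative to the
    first stage and taper is the exact ratio that spans c_load / c_in
    in n_stages steps.
    """
    ratio = c_load / c_in
    n_stages = max(1, round(math.log(ratio) / math.log(target_taper)))
    taper = ratio ** (1.0 / n_stages)
    sizes = [taper ** k for k in range(n_stages)]
    return n_stages, taper, sizes
```

For example, a load 64 times the input capacitance with a target taper of 4 gives three stages with relative sizes of roughly 1, 4, and 16.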
-
Fault Detection for Linear Analog Circuits Using Current Injection [p 987]
- J. Velasco-Medina, T. Calin, and M. Nicolaidis
A new test technique for linear analog circuits which
employs current injection as input test stimulus is
described. Our investigations have shown that current
transitions resulting from a current injected on internal
test points are significantly different for the fault free
and faulty circuits. This can be used for fault detection
purposes. In fact, the current injection as test input
stimulus represents a powerful alternative to the test
approaches based on conventional voltage input stimulus.
The new approach improves the testability of
various faults, which are difficult to detect or are
untestable when using voltage-based test stimulus. In
addition the technique has significant advantages for
BIST testing purposes. The technique is illustrated by
means of a modern opamp circuit and by considering
catastrophic and gate-oxide-short (GOS) faults.