Title | Improving XOR-Dominated Circuits by Exploiting Dependencies between Operands |
Author | *Ajay K. Verma, Paolo Ienne (Ecole Polytechnique Federale de Lausanne, Switzerland) |
Page | pp. 601 - 608 |
Keyword | Logic Synthesis, Selective Expansion, XOR-Dominated, Parallel Multiplier |
Abstract | Logic synthesis has made impressive progress in recent times, pervading digital design and universally replacing manual techniques. A remarkable exception is computer arithmetic, an example being multiple additions performed in carry-save form: column compressors are usually built by exploiting circuit regularity and are hardly optimised further, due to the large number of XOR operations. We show a general technique to optimise XOR-dominated circuits by exploiting the dependencies among the XOR operands, and demonstrate its effectiveness on multiplier-like circuits. We show that it significantly optimises the best parallel multipliers by exploiting complex dependencies between the addenda which escape known manual optimisations. |
Title | Optimum Prefix Adders in a Comprehensive Area, Timing and Power Design Space |
Author | Jianhua Liu, Yi Zhu, Haikun Zhu (University of California, San Diego, United States), John Lillis (University of Illinois at Chicago, United States), *Chung-Kuan Cheng (University of California, San Diego, United States) |
Page | pp. 609 - 615 |
Keyword | low power, physical synthesis, prefix addition |
Abstract | The parallel prefix adder is the most flexible and widely used binary adder for ASIC designs. Many high-level synthesis techniques have been developed to find optimal prefix structures for specific applications. However, the gap between these techniques and back-end designs is increasingly large. In this paper, we propose an integer linear programming method to build minimal-power prefix adders within given timing and area constraints. It counts both gate and wire capacitances in the timing and power models, considers static and dynamic power consumption, and can handle gate sizing and buffer insertion to improve the performance further. The proposed method also adapts to non-uniform arrival and required times at each bit position. Therefore, our method produces the optimum prefix adder for realistic constraints. |
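
For readers unfamiliar with prefix addition, the following is a minimal Python sketch of one well-known prefix structure (Kogge-Stone). It is an illustration only and is not the integer linear programming method of the paper; the function name `kogge_stone_add` and the `width` parameter are hypothetical choices made for this example.

```python
def kogge_stone_add(a: int, b: int, width: int = 8) -> int:
    """Add two unsigned integers modulo 2**width using a Kogge-Stone prefix network."""
    # Per-bit generate (g) and propagate (p) signals, least significant bit first.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]

    # Group signals: after the loop, G[i] is the carry out of bits 0..i.
    G, P = g[:], p[:]
    dist = 1
    while dist < width:
        new_G, new_P = G[:], P[:]
        for i in range(dist, width):
            # Prefix operator: combine the group ending at i with the group ending at i - dist.
            new_G[i] = G[i] | (P[i] & G[i - dist])
            new_P[i] = P[i] & P[i - dist]
        G, P = new_G, new_P
        dist *= 2

    # Sum bit i is the bit propagate XOR the carry into bit i.
    result = 0
    for i in range(width):
        carry_in = G[i - 1] if i > 0 else 0
        result |= (p[i] ^ carry_in) << i
    return result
```

Each pass of the `while` loop corresponds to one level of prefix cells, so an adder of `width` bits needs about log2(`width`) levels. Other prefix structures trade this depth against cell count and wiring, which is the kind of design space the abstract refers to.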
Title | An Interconnect-Centric Approach to Cyclic Shifter Design Using Fanout Splitting and Cell Order Optimization |
Author | Haikun Zhu, Yi Zhu, *Chung-Kuan Cheng (University of California, San Diego, United States), David M. Harris (Harvey Mudd College, United States) |
Page | pp. 616 - 621 |
Keyword | cyclic shifter, interconnect, fanout splitting, permutation, integer linear programming |
Abstract | We propose two orthogonal approaches to logarithmic cyclic shifter design. The first method, called fanout splitting, replaces multiplexers in a conventional design with demultiplexers which have two fanouts driving the shifting and non-shifting paths separately. The use of demultiplexers has a two-fold effect: it cuts the accumulated wire load on the critical path from $O(N\log_2(N))$ to $O(N)$, and reduces the switching probabilities on the inter-stage long wires from 1/4 to 3/16. We then perform cell order optimization to further improve the delay, and formulate it as an integer linear programming problem. For the 64-bit case, the two approaches together reduce the total delay by 67.1% and dynamic power consumption by 17.6%. |
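
As background, the conventional multiplexer-based logarithmic cyclic shifter that the abstract takes as its starting point can be sketched behaviourally in Python. This is only a functional model under the assumption that the word length is a power of two; it does not model fanout splitting, wire load, or cell ordering, and the function name `log_rotate` is hypothetical.

```python
def log_rotate(bits, shift):
    """Rotate a list toward higher indices using log2(n) stages of 2:1 multiplexers."""
    n = len(bits)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two word length")

    stages = n.bit_length() - 1  # log2(n) stages
    current = list(bits)
    for k in range(stages):
        select = (shift >> k) & 1  # bit k of the shift amount controls stage k
        step = 1 << k              # stage k rotates by 2**k positions when selected
        # One row of 2:1 multiplexers: each output takes either the rotated
        # input (shifting path) or the input at the same position (non-shifting path).
        current = [current[(i - step) % n] if select else current[i] for i in range(n)]
    return current
```

For example, `log_rotate(list(range(8)), 3)` passes through the stages for 1 and 2 positions and skips the stage for 4 positions. Only the low log2(n) bits of `shift` are used, so the rotation amount is effectively taken modulo n.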
Title | Optimization of Robust Asynchronous Circuits by Local Input Completeness Relaxation |
Author | *Cheoljoo Jeong, Steven M. Nowick (Columbia University, United States) |
Page | pp. 622 - 627 |
Keyword | asynchronous circuits, input completeness, dual-rail encoding, relaxation |
Abstract | As process, temperature and voltage variations become significant in deep submicron design, timing closure becomes a critical challenge for synchronous CAD flows. One attractive alternative is to use robust asynchronous circuits which gracefully accommodate timing discrepancies. However, there is currently little CAD support for such robust methodologies. In this paper, optimization algorithms for a class of highly-robust asynchronous circuits are presented. Though the considered circuit style is robust to timing variation, it suffers from high area overhead inherent in the style. The proposed algorithm optimizes area and delay of these circuits by relaxing their overly-restrictive style. The algorithm was implemented and evaluated on MCNC circuits, achieving significant improvement while still preserving the same robustness property of the circuit. On average, 49.2% of the gates of the circuits could be implemented in a relaxed manner and, as a result, 34.9% area improvement was achieved, and 16.1% delay improvement was achieved using a simple heuristic for targeting the critical path in the circuit. This is the first proposed approach that systematically optimizes circuits based on the notion of local relaxation while still preserving the circuit's overall timing-robustness. |
Title | Safe Delay Optimization for Physical Synthesis |
Author | *Kai-hui Chang, Igor L. Markov, Valeria Bertacco (University of Michigan at Ann Arbor, United States) |
Page | pp. 628 - 633 |
Keyword | physical synthesis, delay optimization, safe |
Abstract | Physical synthesis is a relatively young field in Electronic Design Automation. Many published optimizations for physical synthesis end up hurting the final result, often by neglecting important physical aspects of the layout, such as long wires or routing congestion. In this work we propose SafeResynth, a safe resynthesis technique, which provides immediately-measurable delay improvement without altering the design's functionality. It can enhance circuit timing without detrimental effects on route length and congestion. When applied to IWLS'05 benchmarks, SafeResynth improves circuit delay by 11% on average after routing, while increasing route length and via count by less than 0.2%. Our resynthesis can also be used in an unsafe mode, akin to more traditional physical synthesis algorithms popular in commercial tools. Applied together, our safe and unsafe transformations achieve 24% average delay improvement for seven large benchmarks from the OpenCores suite. The relative contribution of safe and unsafe techniques varies depending on the amount of whitespace in the layout. |