# Coupling Delay Optimization by Temporal Decorrelation using Dual Threshold Voltage Technique

Ki-Wook Kim<sup>1</sup>, Seong-Ook Jung<sup>2</sup>, Prashant Saxena<sup>3</sup>, C. L. Liu<sup>4</sup> and Sung-Mo Kang<sup>5</sup>

<sup>1</sup> Pluris, Inc., Cupertino, California

<sup>2</sup> Dept. of Electrical Engineering, University of Illinois at Urbana-Champaign

<sup>3</sup> Intel Corporation, Hillsboro, Oregon

<sup>4</sup> Dept. of Computer Science, National Tsing Hua University, Taiwan

<sup>5</sup> Dept. of Computer Engineering, University of California at Santa Cruz

## ABSTRACT

Coupling due to line-to-line capacitance is a serious concern in the timing analysis of circuits in ultra deep submicron CMOS technology. Coupling delay often depends strongly on the temporal correlation of signal switching in the relevant wires. Temporal decorrelation by shifting timing windows can alleviate the performance degradation induced by tight coupling. This paper presents an algorithm for minimizing circuit delay through timing window modulation in dual $V_t$ technology. Experimental results on the ISCAS85 benchmark circuits indicate that the critical delay is reduced significantly when low $V_t$ is applied properly.

## Keywords

Coupling, Performance, Leakage power

## 1. INTRODUCTION

Coupling is becoming a major concern in high performance circuit design in ultra deep sub-micron (UDSM) technology. Coupling capacitance accounts for as much as 70% of the total wire capacitance for adjacent wires with minimum spacing [7]. Such coupling capacitance leads to significant deviations between actual and nominal timing responses, power consumption and functional behavior.

Much research has analyzed the effect of crosstalk on waveforms, peak voltage and delay variation. Effective coupling capacitance is characterized by using a lumped RC model [5], a quadratic model [2] or a general model considering fringing effect [8]. As coupling capacitance becomes the dominant part of the total wire load capacitance in UDSM technology, the interconnect delay can vary by several hundred percent depending on the switching activity of nearby lines [8]. Delay variation due to coupling effect has been studied in CTX [14], TACO for static timing analysis [1], waveform iteration strategy [9] and dynamically bounded delay model [10]. Coupling-aware delay optimization aims to reduce critical path delays by mitigating coupling effect in critical nets. This paper proposes a method to reduce signal interactions so that coupling delay in the critical path can be minimized.

Copyright 2001 ACM 1-58113-297-2/01/0006 ...\$5.00.

Figure 1: Coupling delay reduction in a critical net by timing window shifting in the potential aggressor net.

The example in Figure 1 illustrates how critical path delay can be shortened by manipulating the timing windows. Suppose that there are a critical net v driven by a weak driver and a potential aggressor net a driven by a strong driver. The timing window represents the earliest and latest transition times, where the unshaded region is the delay due to intrinsic gate and interconnection and the shaded region accounts for the delay variation due to coupling. Coupling delay in net v is reduced when the timing window of the potential aggressor net a is shifted from $\langle ta_1, ta_2 \rangle$ to $\langle ta'_1, ta'_2 \rangle$. The latest coincident switching time for the two nets changes to $ta'_2$ after accelerating the signal transitions in net a. Now that maximum coupling takes place at $ta'_2$, coupling delay in the critical net v decreases by $\Delta \tau_v$ to $tv'$. To sum up, speedup in the non-critical path leads to delay reduction in the critical path. To engineer the timing window of a net, we use the dual threshold voltage technique.

<sup>\*</sup>Supported in part by a grant from National Science Foundation under NSF MIP 96 12184, and by a grant from Intel Corporation under Grant 5414.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

DAC 2001, June 18-22, 2001, Las Vegas, Nevada, USA.

## 2. DUAL THRESHOLD VOLTAGE TECHNIQUE

Dual threshold voltage ($V_t$) technology has been developed to improve performance with two different threshold voltages, namely high-threshold and low-threshold [6]. Low $V_t$ devices switch faster than their high $V_t$ counterparts. Thus, low $V_t$ devices can be used to meet a tight delay budget. However, low $V_t$ devices are more sensitive to noise sources because their logic trip point is lower than that of high $V_t$ devices. When the gate voltage is below the threshold voltage in the weak-inversion regime, a diffusion current flows from drain to source when the drain-to-source voltage is non-zero. This current, called the subthreshold current, depends exponentially on threshold voltage. The subthreshold current of transistors is a major source of leakage power consumption in sleep mode. The leakage power of a circuit is represented by the sum of the leakage power consumption by all transistors such that

$$P_{leak} = \sum_{u \in V} I_{sub} \cdot V_{DS}$$

where $I_{sub}$ is the subthreshold leakage current and $V_{DS}$ is the drain-source voltage of the transistor. One can model the subthreshold leakage current of a MOSFET from the BSIM MOS transistor model [13] by

$$I_{sub} = \mu_0 C_{ox} \frac{W_{eff}}{L_{eff}} \left(\frac{kT}{q}\right)^2 e^{1.8} e^{\frac{q}{n'kT}(V_{GS} - V_t)} \left(1 - e^{\frac{-qV_{DS}}{kT}}\right)$$

where  $C_{ox}$  is the gate oxide capacitance per unit area,  $W_{eff}$  and  $L_{eff}$  are effective channel width and effective channel length, respectively,  $\mu_0$  is the zero bias mobility, n' is the subthreshold swing coefficient of the transistor, kT is the thermal energy, q is the magnitude of electronic charge, and  $V_{GS}$  is the gate-source voltage of the transistor.
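As a rough numerical illustration of the two expressions above, the following sketch evaluates $I_{sub}$ and $P_{leak}$ in Python. The device parameters (`mu0_cox`, `w_over_l`, `n_prime`, the temperature and the two threshold voltages of 0.45 V and 0.35 V) are hypothetical values chosen only for illustration; they are not taken from the paper or from any particular process.

```python
import math

K_BOLTZMANN = 1.380649e-23     # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19   # magnitude of electronic charge, C

def subthreshold_current(vgs, vds, vt, *, mu0_cox=2.0e-4, w_over_l=10.0,
                         n_prime=1.5, temp_k=300.0):
    """Evaluate the I_sub expression above.

    mu0_cox stands for mu_0 * C_ox (A/V^2) and w_over_l for W_eff / L_eff.
    All default values are assumed for illustration only.
    """
    v_thermal = K_BOLTZMANN * temp_k / Q_ELECTRON   # kT/q in volts
    return (mu0_cox * w_over_l * v_thermal ** 2 * math.exp(1.8)
            * math.exp((vgs - vt) / (n_prime * v_thermal))
            * (1.0 - math.exp(-vds / v_thermal)))

def leakage_power(transistors):
    """P_leak: sum of I_sub * V_DS over (vgs, vds, vt) tuples."""
    return sum(subthreshold_current(vgs, vds, vt) * vds
               for vgs, vds, vt in transistors)

# Hypothetical off-state devices (V_GS = 0) at V_DS = 1.8 V.
i_high = subthreshold_current(0.0, 1.8, 0.45)
i_low = subthreshold_current(0.0, 1.8, 0.35)
ratio = i_low / i_high   # grows exponentially with the V_t difference
p_leak = leakage_power([(0.0, 1.8, 0.45), (0.0, 1.8, 0.35)])
```

With these assumed parameters the ratio depends only on the threshold difference and on $n'kT/q$, which is the exponential dependence on $V_t$ described above.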

Therefore, low  $V_t$  devices should be appropriately deployed in order to maximize performance while minimizing significant power and noise sensitivity penalties. Backward breadth first search heuristics are employed to minimize leakage power dissipation by applying high  $V_t$  to gates off the critical paths in [16]. In order to identify a maximal feasible subset of gates that can be replaced by high threshold voltage without increasing circuit delay, a circuit graph enumeration technique was proposed in [15]. Slack equalization algorithm is used to maximize power reduction based on maximal independent set in [3].

Most of the leakage power optimization techniques are based on static timing analysis to compute slack information. Static timing analysis employs path-centric methodology where the worst case delay of all possible paths in a circuit is carefully investigated. However, conventional static timing analysis is limited to the path delay computation only, thus the impact of coupling on delay variation is not taken into account.

This paper presents an algorithm to determine which devices should be assigned low $V_t$ in order to minimize the longest path delay based on coupling-aware timing analysis. Initially, all the transistors in a circuit have the high $V_t$ configuration to minimize the leakage power consumption associated with low $V_t$ devices. We apply low $V_t$ not only to gates in the critical path but also to gates off the critical path that affect the circuit delay through coupling effect. Thus the critical path delay is reduced by using the approach shown in Figure 1. In summary, the dual $V_t$ technique is used to shift the timing window so that temporal correlations in the critical nets are minimized to optimize the circuit performance.

First, coupling effect on delay is analyzed with respect to the relative timing windows. Then, we propose an algorithm to optimize performance by using low  $V_t$  devices.

## 3. COUPLING-INDUCED DELAY CHARACTERIZATION

#### 3.1 Coupling-Aware Static Timing Analysis

To analyze the temporal behavior of a combinational circuit, we employ two graphs: a circuit graph *G* and an adjacency graph  $G_a$ . A combinational circuit is represented by a directed acyclic graph G = (V, E). A node  $v \in V$  represents a gate in the circuit. There is an edge  $(v, q) \in E$  if the output signal of node v is fed directly as an input of node q. A node v in a circuit graph *G* is weighted by the intrinsic gate delay d(v).

The adjacency graph  $G_a = (V_a, E_a)$  is an undirected graph which represents physical neighborhood of relevant wires when there is finite coupling capacitance between the wires. A node  $n_{vq} \in V_a$ represents an edge  $(v, q) \in E$  in the circuit graph, namely, a wire connecting gate v and gate q. There is an edge  $(n_{vq}, n_{ar}) \in E_a$  if the wire connecting gate v and q is physically adjacent to the wire connecting gate a and r with a certain coupling capacitance. Wire delay w(v,q) is assigned to the node  $n_{vq} \in V_a$  in the adjacency graph.

Wire delay accounts for both intrinsic wire delay  $\omega_{vq}$  due to line-to-substrate capacitance and coupling delay, namely speedup  $\kappa(n_{vq})$  or slowdown  $\tau(n_{vq})$ , due to line-to-line capacitance. We define maximum wire delay  $w_{\max}(v,q)$  and minimum wire delay  $w_{\min}(v,q)$  such that

$$w_{\max}(v,q) = \omega_{vq} + |\tau_{\max}(n_{vq})|,$$
  

$$w_{\min}(v,q) = \omega_{vq} - |\kappa_{\max}(n_{vq})|$$

where  $\tau_{max}$  ( $\kappa_{max}$ ) is the maximum slowdown (speedup) due to coupling, referred to as *maximum coupling-induced slowdown* (*speedup*), which will be defined later. We assume that intrinsic gate delay and intrinsic wire delay have been predetermined.

We define arrival time, required time and slack time with reference to clock edge as follows.

*Definition 1.* The arrival timing window  $\langle AT_{\min}(v), AT_{\max}(v) \rangle$  for node *v* in a circuit graph defines the earliest path delay and the latest path delay from the primary inputs to *v* such that

$$AT_{\min}(v) = \min_{i \in FI(v)} (AT_{\min}(i) + w_{\min}(i, v) + d_i(v))$$
  
$$AT_{\max}(v) = \max_{i \in FI(v)} (AT_{\max}(i) + w_{\max}(i, v) + d_i(v))$$

where FI(v) is a set of fanin nodes of node v and $d_i(v)$ is the pin-to-pin gate delay from the input *i* of gate v to the output of gate v. The required time RT(v) is the latest time the signal must be available at the output of v such that

$$RT(v) = \min_{j \in FO(v)} (RT(j) - w_{\max}(v, j) - d_v(j)),$$

where FO(v) is a set of fanout nodes of node v. The slack time ST(v) is defined as the time gap between required time and maximum arrival time such that

$$ST(v) = RT(v) - AT_{\max}(v).$$

*Definition 2.* If all the nodes in a circuit have non-negative slack time, then the circuit is said to be safe.
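Definitions 1 and 2 can be sketched as a forward and a backward pass over the circuit graph. The code below is a minimal illustration under stated simplifications, not the authors' implementation: it uses a single gate delay `gate_delay[v]` per gate instead of the pin-to-pin delay $d_i(v)$, assumes primary inputs arrive at time 0, and assumes primary outputs are required at a hypothetical `clock_period`. All names and numbers are illustrative.

```python
from graphlib import TopologicalSorter

def analyze(fanin, gate_delay, w_min, w_max, clock_period):
    """Return (at_min, at_max, rt, slack) dictionaries keyed by node.

    fanin[v] lists the fanin nodes FI(v); w_min and w_max map a wire
    (i, v) to its minimum and maximum delay, coupling included.
    """
    order = list(TopologicalSorter(fanin).static_order())  # fanins first
    at_min, at_max = {}, {}
    for v in order:
        preds = fanin.get(v, ())
        if not preds:                      # primary input (assumed AT = 0)
            at_min[v] = at_max[v] = 0.0
            continue
        at_min[v] = min(at_min[i] + w_min[i, v] + gate_delay[v] for i in preds)
        at_max[v] = max(at_max[i] + w_max[i, v] + gate_delay[v] for i in preds)

    fanout = {v: [] for v in order}        # FO(v), derived from FI
    for v, preds in fanin.items():
        for i in preds:
            fanout[i].append(v)

    rt = {}
    for v in reversed(order):              # fanouts are processed first
        if not fanout[v]:                  # primary output
            rt[v] = clock_period
        else:
            rt[v] = min(rt[j] - w_max[v, j] - gate_delay[j] for j in fanout[v])

    slack = {v: rt[v] - at_max[v] for v in order}
    return at_min, at_max, rt, slack

# Hypothetical two-gate circuit; all delays in picoseconds.
fanin = {"g1": ["a", "b"], "g2": ["g1", "c"]}
gate_delay = {"g1": 50, "g2": 40}
w_min = {("a", "g1"): 5, ("b", "g1"): 8, ("g1", "g2"): 10, ("c", "g2"): 4}
w_max = {("a", "g1"): 9, ("b", "g1"): 20, ("g1", "g2"): 25, ("c", "g2"): 6}

at_min, at_max, rt, slack = analyze(fanin, gate_delay, w_min, w_max, 150)
is_safe = all(s >= 0 for s in slack.values())   # Definition 2
```

In this toy circuit the path through `b`, `g1` and `g2` has the smallest slack, so any reduction of the coupling slowdown on its wires translates directly into a shorter critical delay.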



Figure 2: Equivalent circuit for coupling delay analysis.

#### 3.2 Coupling-Induced Delay Analysis

We define time skew as relative arrival time between two nodes.

*Definition 3.* The time skew window  $\langle z_{\min}(v, a), z_{\max}(v, a) \rangle$  for node *v* and node *a* defines minimal and maximal difference in the arrival time of node *v* with respect to that of node *a* as in [11] such that

$$z_{\min}(v,a) = AT_{\min}(v) - AT_{\max}(a),$$
  

$$z_{\max}(v,a) = AT_{\max}(v) - AT_{\min}(a).$$

Slowdown of signal switching in wire (v, q) due to coupling from wire (a, r) is denoted  $\tau(n_{vq}|n_{ar})$ , while  $\kappa(n_{vq}|n_{ar})$  denotes coupling-induced speedup.

Coupling-induced delay is analyzed based on the configuration shown in Figure 2. Input signals are in exponential forms. With various time skew *z*, voltage characteristic of the victim node is represented by [4]

$$V(t,z) = V_{exp}(t,z) - \frac{U(t-z)}{x} \cdot \left(\frac{b}{(w+\frac{1}{x})\cdot(w-u)} \cdot e^{w(t-z)} + \frac{b}{(u+\frac{1}{x})\cdot(u-w)} \cdot e^{u(t-z)} + \frac{b}{(u+\frac{1}{x})\cdot(w+\frac{1}{x})} \cdot e^{\frac{z-t}{x}}\right)$$

where  $V_{exp}(t,z)$  is voltage characteristic of victim node without signal interaction, U(t) is a unit step function, x and y are input waveform constants,  $b = C_m/(R_aC_t)$ ,  $f = (C_m + C_v)/(R_aC_t)$ ,  $C_t = C_mC_a + C_mC_v + C_aC_v$ , and u and w are solutions of impedance characteristic function

$$s^{2} + s\left(\frac{R_{a}(C_{m} + C_{a}) + R_{v}(C_{m} + C_{v})}{R_{a}R_{v}C_{t}}\right) + \frac{1}{R_{a}R_{v}C_{t}} = 0$$

Then, coupling-induced delay is defined as

$$\tau(z) = t(V = 0.5V_{dd}) - t(V_{exp} = 0.5V_{dd})$$
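The poles $u$ and $w$ used in the victim waveform above are the roots of the quadratic characteristic function. The short sketch below computes them for hypothetical element values (1 kΩ drivers, $C_a = C_v = 10$ fF and $C_m = 40$ fF, which matches the ratio $C_m/C_a = 4$ used for Figure 3 but is otherwise assumed); the function name and units are illustrative.

```python
import math

def coupling_poles(r_a, r_v, c_a, c_v, c_m):
    """Roots of s^2 + B*s + C = 0 for the two-wire coupling model.

    Resistances in ohms, capacitances in farads; returns (u, w) in 1/s.
    """
    c_t = c_m * c_a + c_m * c_v + c_a * c_v
    denom = r_a * r_v * c_t
    b_coef = (r_a * (c_m + c_a) + r_v * (c_m + c_v)) / denom
    c_coef = 1.0 / denom
    # The discriminant is non-negative for positive element values,
    # so both poles are real and negative.
    root = math.sqrt(b_coef ** 2 - 4.0 * c_coef)
    return (-b_coef + root) / 2.0, (-b_coef - root) / 2.0

# Hypothetical symmetric configuration.
R, C, CM = 1.0e3, 10.0e-15, 40.0e-15
u, w = coupling_poles(R, R, C, C, CM)
slow_time_constant = -1.0 / u   # seconds
fast_time_constant = -1.0 / w   # seconds
```

For this symmetric case the poles reduce to $-1/(R(C + 2C_m))$ and $-1/(RC)$, which gives a quick sanity check on the computation.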

Figure 3 shows coupling-induced slowdown as a function of the time skew z. The ratio between the coupling capacitance and the parallel-plate capacitance is set to four in $0.18\mu$m technology. There are several important observations. Maximum coupling-induced slowdown happens when the signal in the aggressor node switches almost at the same time as the transition in the victim node. Coupling effect decreases gradually with lagging aggressor signal switching, while it decreases sharply with leading aggressor signal switching. Coupling-induced slowdown is so significant that the peak coupling-induced slowdown of 43ps is comparable to the intrinsic delay of 50ps without any coupling.

Figure 3: Delay variation associated with relative arrival time, with $C_a = C_v$ and $C_m/C_a = 4$. The turn-on resistances $R_a$ and $R_v$ for the drivers are determined such that the rise time or the fall time is 50ps without coupling effect.

The maximum slowdown value is determined by both the coupling-induced slowdown function and the time skew window. Based on the HSPICE simulation in Figure 3, we assume that the coupling-induced delay over the time skew is uni-modal. A binary semaphore $\alpha$ denotes whether a given time skew window includes a time skew $z_M$ corresponding to maximum coupling-induced slowdown such that

$$\alpha = \begin{cases} 1 & z_{\min} \le z_M \text{ and } z_{\max} \ge z_M \\ 0 & \text{otherwise} \end{cases}$$

Now suppose that two wires, (v, q) and (a, r), have certain cross-coupling such that $(n_{vq}, n_{ar}) \in E_a$.

*Definition 4.* Maximum coupling-induced slowdown $\tau_{\max}(n_{vq}|n_{ar})$ for a signal in wire (v,q) due to signal switching in wire (a,r) over a time skew window $\langle z_{\min}(v,a), z_{\max}(v,a) \rangle$ is defined as

$$\tau_{\max}(n_{vq}|n_{ar}) = \max[\tau(z_{\min}(v,a)), \ \tau(z_{\max}(v,a)), \ \alpha\tau_M(v,a)]$$

where  $\tau_M$  is the maximum delay degradation which is predetermined by physical, electrical and topological factors of effective coupling capacitance.

Figure 4 shows examples of timing windows and their associated coupling delay. There are two nets neighboring each other. Suppose that net v is a critical net, and its slack time is zero. Figure 4(a) illustrates that the two timing windows are overlapped and  $\alpha = 1$ , namely  $z_{\min}$  is less than  $z_M$  while  $z_{\max}$  is greater than  $z_M$ . Because the relative window spectrum includes  $z_M$ , maximum coupling delay in critical net v is  $\tau_M$ . Meanwhile, Figure 4(b) shows a left-shifted timing window of non-critical net a. Associated time skew window results in  $\alpha = 0$ . Thus the maximum coupling delay in critical net v is now  $\tau(z_{\max})$  which is smaller than  $\tau_M$ .
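Definitions 3 and 4 can be illustrated with a small Python sketch. The uni-modal slowdown curve used here is a hypothetical piecewise-linear stand-in for the simulated curve of Figure 3 (peak of 43 ps at $z_M = 0$, with assumed widths of 100 ps and 25 ps on the two sides); the arrival windows are likewise invented for illustration.

```python
def skew_window(at_v, at_a):
    """Definition 3: at_v and at_a are (AT_min, AT_max) pairs."""
    return at_v[0] - at_a[1], at_v[1] - at_a[0]

def make_triangular_tau(z_m, tau_m, lag_width, lead_width):
    """Hypothetical uni-modal tau(z): gradual decay below z_m, sharp above."""
    def tau(z):
        width = lag_width if z < z_m else lead_width
        return max(0.0, tau_m * (1.0 - abs(z - z_m) / width))
    return tau

def tau_max(tau, z_min, z_max, z_m, tau_m):
    """Definition 4: worst slowdown over the skew window."""
    alpha = 1 if z_min <= z_m <= z_max else 0   # window contains z_M?
    return max(tau(z_min), tau(z_max), alpha * tau_m)

Z_M, TAU_M = 0.0, 43.0                          # picoseconds, assumed
tau = make_triangular_tau(Z_M, TAU_M, lag_width=100.0, lead_width=25.0)

victim = (100.0, 140.0)                         # arrival window of net v
# Overlapping aggressor window: the skew window contains z_M.
before = tau_max(tau, *skew_window(victim, (90.0, 150.0)), Z_M, TAU_M)
# Aggressor sped up so that the windows no longer overlap.
after = tau_max(tau, *skew_window(victim, (60.0, 80.0)), Z_M, TAU_M)
```

Under these assumptions the worst-case slowdown falls from the peak value $\tau_M$ to the value of the curve at the nearer end of the shifted skew window, which is the effect Figure 4 describes.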

THEOREM 1. With an assumption that coupling effect on timing due to multiple aggressors is linearly additive<sup>1</sup>, maximum coupling-induced slowdown in wire (v, q) is bounded by

$$\begin{aligned} &\max_{(n_{vq},n_{ar})\in E_a} \tau_{\max}(n_{vq}|n_{ar}) \leq \tau_{\max}(n_{vq}) \\ &\tau_{\max}(n_{vq}) \leq \sum_{(n_{vq},n_{ar})\in E_a} \tau_{\max}(n_{vq}|n_{ar}) \end{aligned}$$

<sup>1</sup>In practice, the coupling-induced slowdown from multiple aggressors can be represented by a supra-linear function, which can be readily applied to this theorem.



Figure 4: Timing windows and coupling delay: (a) maximum coupling delay = $\tau_M$ with $\alpha = 1$, (b) maximum coupling delay = $\tau(z_{\max})$ with $\alpha = 0$.

where the signal in wire (a, r) is temporally correlated to the signal in wire (v, q) which is represented by  $(n_{vq}, n_{ar}) \in E_a$ .

We adopt a conservative upper limit to compute max-delay. Similarly, maximum coupling-induced speedup  $\kappa_{max}$  is resolved to compute minimum delay.
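The bounds of Theorem 1 and the conservative choice of the upper limit can be written out in a few lines. This is only a sketch: the function names and the per-aggressor slowdown values are hypothetical, and the inputs are assumed to be the pairwise values $\tau_{\max}(n_{vq}|n_{ar})$ computed beforehand.

```python
def slowdown_bounds(pairwise):
    """Return (lower, upper) bounds on tau_max(n_vq) from Theorem 1."""
    values = list(pairwise)
    if not values:                 # wire without neighbors: no coupling
        return 0.0, 0.0
    return max(values), sum(values)

def conservative_w_max(intrinsic, pairwise):
    """Maximum wire delay using the upper bound as tau_max(n_vq)."""
    _, upper = slowdown_bounds(pairwise)
    return intrinsic + abs(upper)

# Hypothetical victim wire with three neighbors (picoseconds).
lower, upper = slowdown_bounds([40.0, 10.0, 0.0])
w_max_vq = conservative_w_max(50.0, [40.0, 10.0, 0.0])
```

Using the sum rather than the maximum can overestimate the delay when the aggressors cannot all align with the victim at once, which is why the choice is described as conservative.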

To guarantee the safety of a circuit after delay variation, effective delay variation in both the victim node and the transitive fanout nodes should be within their slack time. Upon decreasing the gate delay in the aggressor node, the neighbor node is insensitive to coupling if coupling delay variation  $\Delta \tau = 0$ . If  $\Delta \tau = 0$ , then two nodes are temporally exclusive to each other. In other words, such nodes are immune to crosstalk due to temporal independence. Negative coupling delay variation is favorable for performance improvement. Particularly, if a neighbor node is in a critical path of which slack time is less than or equal to zero, effective delay reduction can speed up the circuit. It is noteworthy that the longest path delay in a circuit can be shortened by augmenting a specific delay component in a part of the circuit. This is true because the mitigation of effective coupling by reducing the temporal correlation between signals can accelerate signal transitions in the critical path.

## 4. PERFORMANCE OPTIMIZATION

The main idea for performance improvement is to minimize the coupling effect in critical nets from adjacent non-critical nets. We achieve our goal of coupling minimization by mitigating temporal correlation between critical nets and potential aggressor nets. Temporal correlation is represented by the time skew window. If inertial gate delay decreases by lowering the threshold voltage, timing window can be shifted accordingly. Minimization of timing-window overlaps can lead to temporal decorrelation and to reduction in coupling delay eventually.

To formulate the problem of performance optimization, we assume that gate u has delay $d_h(u)$ for a high threshold device and delay $d_l(u)$ for a low threshold device. The inertial delay difference is referred to as *delay gap* denoted $d_{hl}(u) \equiv d_h(u) - d_l(u)$.

Initially, all the gates in the circuit are configured with high threshold devices to have minimum leakage power consumption. Let  $N_f$  denote a set of *feasible* nodes with high threshold devices. Gates corresponding to feasible nodes can be switched to low threshold devices. Our problem is to identify a subset of feasible nodes such that threshold switching maximizes the minimum slack time of the circuit by minimizing the coupling delay in the critical paths.

For each feasible node *u*, the performance sensitivity $\sigma(u)$ represents the amount of reduction in critical path delay as a result of low $V_t$ application, which is formally defined as

$$\sigma(u) = \Delta ST_{\min}|_{d_{hl}(u)}$$

where  $\Delta ST_{\min}$  denotes the variation in minimum slack time of a circuit.

Given a combinational circuit, a circuit graph *G* is built. We assume that the given circuit has already been placed and routed, thus concrete information for computing coupling effect between wires is available and is represented by an adjacency graph  $G_a$ . For each iteration, we identify a candidate set  $N'_f \subseteq N_f$  associated with positive performance sensitivity.

Only a subset of the candidate node set can be simultaneously switched to low threshold voltage, because the delay variation of certain nodes depends on other nodes. Identifying a subset of candidate nodes that can be processed at the same time is accomplished using a sensitivity graph. A sensitivity graph $G_s = (V, E_s)$ is an undirected graph in which each node corresponds to a gate in the combinational circuit. Suppose that there exists a node w such that timing variation in node u affects the timing characteristics of node w through coupling. If node w is also sensitive to timing change in node v, then there is an edge $(u, v) \in E_s$. An edge (u, v) in a sensitivity graph implies that the two nodes u and v are temporally correlated.

To reflect the coupling effect on the relative timing window accurately, we switch threshold voltages in the same iteration only for nodes that are temporally independent, even though this is a somewhat conservative strategy. Weighted by performance sensitivity, a set of such nodes is identified from the sensitivity graph $G_s$ such that the weighted sum of the selected nodes is maximum. The threshold voltages for the selected nodes are updated to low. Delay and slack updates are carried out for the timing window sensitive node set and slack time sensitive node set, respectively. The procedure terminates when there is no gain in performance improvement.
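One iteration of the selection step can be sketched as follows. The paper does not state how the maximum-weight set is found, so the exhaustive search below is an assumption that is only practical for small candidate sets; the `affects` mapping, the node names and the sensitivity values are all hypothetical.

```python
from itertools import combinations

def sensitivity_edges(affects):
    """Build E_s: u and v are adjacent if some node w is sensitive to both.

    affects[u] is the set of nodes whose timing depends on u via coupling.
    """
    return {(u, v) for u, v in combinations(sorted(affects), 2)
            if affects[u] & affects[v]}

def best_independent_set(sigma, edges):
    """Exhaustively find the independent set with maximum total sigma."""
    nodes = sorted(sigma)
    conflict = {frozenset(e) for e in edges}
    best, best_weight = (), 0.0
    for k in range(1, len(nodes) + 1):
        for subset in combinations(nodes, k):
            # Skip subsets that contain two temporally correlated nodes.
            if any(frozenset(p) in conflict for p in combinations(subset, 2)):
                continue
            weight = sum(sigma[u] for u in subset)
            if weight > best_weight:
                best, best_weight = subset, weight
    return set(best), best_weight

# Hypothetical candidates with positive performance sensitivity (ps).
sigma = {"g1": 5.0, "g2": 3.0, "g3": 4.0, "g4": 1.0}
affects = {"g1": {"w1"}, "g2": {"w1", "w2"},
           "g3": {"w2", "w3"}, "g4": {"w3"}}

edges = sensitivity_edges(affects)
selected, gain = best_independent_set(sigma, edges)
```

For realistic circuit sizes a heuristic for the weighted independent set problem would be needed in place of the exhaustive loop, since the number of subsets grows exponentially with the size of the candidate set.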

## 5. EXPERIMENTAL RESULTS

The proposed algorithm for coupling-aware performance optimization has been implemented and tested for the ISCAS85 benchmark circuits. The synthesis flow is shown in Figure 5. Each circuit netlist given by a truth table format (BLIF format) was processed based on technology independent optimization, then mapped using the delay optimal option of SIS [12]. After placement and routing have been carried out, accurate coupling information is available for the estimation of circuit delay using a timing analysis tool. In practice, threshold voltage switching can be realized by tagging the devices for low  $V_t$  process. Thus, extracted physical configuration such as net spacing between adjacent nets remains unchanged through low  $V_t$  application.

Table 1 shows the normalized values for the delay time and subthreshold leakage current of the low  $V_t$  transistors with respect to the high  $V_t$  transistors. Our experiment is based on these characteristics. It needs to be noted that if all the transistors are changed into low  $V_t$  devices, maximum circuit delay decrease is limited to 25% according to Table 1.



Figure 5: Performance-driven synthesis flow.

Table 1: Delay and leakage current for low and high threshold voltage transistors (both delay and leakage current values are normalized with respect to high  $V_t$  transistors).

| Transistor type | Delay time | Sub-threshold leakage current |
|-----------------|------------|-------------------------------|
| High $V_t$      | 1.00       | 1.00                          |
| Low $V_t$       | 0.85       | 4.10                          |
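The leakage penalty implied by Table 1 can be estimated with simple arithmetic. The sketch below assumes, purely for illustration, that every transistor contributes equally to leakage in the all-high-$V_t$ circuit; the function name and the 10% example fraction are hypothetical.

```python
LOW_VT_LEAKAGE = 4.10   # normalized leakage of a low V_t transistor (Table 1)

def relative_leakage(fraction_low, low_ratio=LOW_VT_LEAKAGE):
    """Leakage relative to the all-high-V_t circuit.

    fraction_low is the fraction of transistors switched to low V_t.
    Assumes equal per-transistor leakage in the high V_t configuration.
    """
    if not 0.0 <= fraction_low <= 1.0:
        raise ValueError("fraction_low must be between 0 and 1")
    return (1.0 - fraction_low) + fraction_low * low_ratio

# Switching a hypothetical 10% of the transistors to low V_t.
leak_10_percent = relative_leakage(0.10)
```

This estimate covers subthreshold leakage only, so it is not directly comparable to a total power figure that also includes dynamic power.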

Table 2 shows the relative leakage power consumption and circuit delay for the ISCAS85 benchmark circuits. Initially, all the circuits are implemented using high  $V_t$  devices only. Results for power and delay shown in Table 2 are percentage values with respect to those of the initial circuits. The power values account for the total power consumption including dynamic power and leakage power. The percentage of low  $V_t$  transistors after optimization is shown in the column "Low  $V_t$ " and the results in column "CPU" are computation time in seconds.

There are two sets of results. The results in column "Dual $V_t$" correspond to circuit realizations with low $V_t$ devices in the critical paths only. During $V_t$ substitution, coupling delay is not considered, as in conventional approaches. For example, the circuit C1355 has a longer critical path than the all-high-$V_t$ configuration (100.2% in Table 2), because gate delay reduction that ignores coupling effect results in stronger temporal correlation between critical nets and adjacent nets.

The other set of results, in column "Coupling-aware dual $V_t$", corresponds to the new approach we propose. For each candidate gate for performance improvement, it is guaranteed that $V_t$ substitution does not increase circuit delay via coupling. Thus, some gates with zero slack time are not converted to low $V_t$ devices, while some gates adjacent to critical nets are switched to low $V_t$ devices. On average, delay reduction is about 12.6% compared to the all-high-$V_t$ configuration.

## 6. CONCLUSION

Secondary effect on delay due to coupling becomes an important issue in ultra deep submicron technology. As coupling capacitance increases, coupling delay takes a significant portion of chip delay. Coupling delay is strongly dependent on temporal correlation of signal switching in relevant wires. Temporal decorrelation using dual $V_t$ technology was proposed in this paper. Low $V_t$ devices are faster than high $V_t$ devices, but have higher leakage current in standby mode. Appropriate threshold switching in gates in both critical path and non-critical path is carried out to minimize coupling delay in critical path. We proposed a new algorithm to contain coupling delay by using timing window shifting technique. Experimental results for ISCAS85 benchmark circuits show that proper low $V_t$ application reduces the circuit delay significantly.

In this paper, our approach focused on maximum circuit delay. However, minimum delay for hold time constraint becomes important in high performance circuit design. Our algorithm can be modified and readily applied to maximize the minimum delay.

## 7. ACKNOWLEDGMENTS

The authors thank Priyadarshan Patra, Timothy Kam and Steve Burns with Intel Corporation for technical discussion and suggestions.

## 8. REFERENCES

- [1] R. Arunachalam, K. Rajagopal, and L. T. Pileggi. TACO: Timing analysis with coupling. In *Proc. ACM/IEEE Design Automation Conf.*, pages 266–269, 2000.
- [2] M. Bohr. Interconnect scaling: The real limiter to high performance ULSI. In *IEEE Int. Electronic Device Meeting*, pages 241–244, 1995.
- [3] C. Chen and M. Sarrafzadeh. Slack equalization algorithm: Precise slack distribution for low-level synthesis and optimization. In *Proc. Int. Workshop Logic Synthesis*, pages 1–3, 1999.
- [4] W. Chen, S. K. Gupta, and M. A. Breuer. Analytic models for crosstalk delay and pulse analysis under non-ideal inputs. In *Proc. Int. Test Conf.*, pages 809–818, 1997.
- [5] W. Chen, S. K. Gupta, and M. A. Breuer. Test generation for crosstalk-induced delay in integrated circuits. In *Proc. Int. Test Conf.*, pages 191–200, 1999.
- [6] Z. Chen, C. Diaz, J. Plummer, M. Cao, and W. Greene. 0.18-μm dual Vt MOSFET process and energy-delay measurement. In *Int. Electron Devices Meeting*, pages 851–854, 1996.
- [7] J. Cong. Challenges and opportunities for design innovations in nanometer technologies. In SRC Design Science Concept, 1997.
- [8] F. Dartu and L. T. Pileggi. Calculating worst-case gate delays due to dominant capacitance coupling. In *Proc. ACM/IEEE Design Automation Conf.*, pages 46–51, June 1997.
- [9] P. D. Gross, R. Arunachalam, K. Rajagopal, and L. T. Pileggi. Determination of worst-case aggressor alignment for delay calculation. In *Proc. IEEE/ACM Int. Conf. Computer Aided Design*, pages 212–219, Nov. 1998.
- [10] S. Hassoun. Critical path analysis using a dynamically bounded delay model. In *Proc. ACM/IEEE Design Automation Conf.*, pages 260–265, 2000.
- [11] Y. Sasaki and G. De Micheli. Crosstalk delay analysis using relative window method. In *Proc. IEEE ASIC/SOC Conf.*, pages 9–13, 1999.
- [12] E. M. Sentovich, J. K. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha, H. Savoj, P. R. Stephan, R. K. Brayton, and A. L. Sangiovanni-Vincentelli. SIS: A system for sequential circuit synthesis. Technical Report UCB/ERL M92/41, Electronics Research Laboratory, College of Engineering, University of California, Berkeley, May 1992.
- [13] B. J. Sheu, D. L. Scharfetter, P. K. Ko, and M. C. Teng. BSIM: Berkeley short-channel IGFET model for MOS transistors. *IEEE Journal of Solid State Circuits*, 22(4):558–566, 1987.

| Circuit | Dual $V_t$: Power (%) | Dual $V_t$: Delay (%) | Dual $V_t$: Low $V_t$ (%) | Coupling-aware dual $V_t$: Power (%) | Coupling-aware dual $V_t$: Delay (%) | Coupling-aware dual $V_t$: Low $V_t$ (%) | CPU (sec) |
|---------|-------|-------|------|-------|------|------|-------|
| C432    | 103.5 | 83.3  | 17.4 | 107.3 | 75.3 | 30.5 | 0.4   |
| C880    | 100.6 | 96.5  | 2.5  | 102.7 | 86.6 | 10.1 | 2.6   |
| C1355   | 100.0 | 100.2 | 0.8  | 100.3 | 95.9 | 1.8  | 1.6   |
| C1908   | 100.7 | 96.1  | 2.3  | 102.4 | 87.0 | 8.7  | 6.9   |
| C2670   | 100.9 | 83.7  | 3.0  | 101.1 | 81.6 | 3.8  | 3.1   |
| C3540   | 100.6 | 92.8  | 1.9  | 101.1 | 88.4 | 3.8  | 56.8  |
| C5315   | 100.1 | 97.7  | 0.4  | 100.6 | 91.1 | 2.0  | 112.0 |
| C6288   | 100.1 | 99.0  | 0.4  | 100.3 | 96.2 | 1.2  | 679.8 |
| C7552   | 100.3 | 89.0  | 1.2  | 100.5 | 86.4 | 1.9  | 172.3 |
| Avg.    | 100.7 | 92.9  | 1.3  | 101.8 | 87.4 | 4.1  | -     |

Table 2: Performance optimization results using dual threshold voltage.

- [14] P. F. Tehrani, S. W. Chyou, and U. Ekambaram. Deep sub-micron timing analysis in presence of crosstalk. In *Proc. Int. Symp. Quality Electronic Design*, pages 505–512, 2000.
- [15] Q. Wang and B. K. Vrudhula. Static power optimization of deep submicron CMOS circuits for dual Vt technology. In *Proc. IEEE/ACM Int. Conf. Computer Aided Design*, pages 490–496, Nov. 1998.
- [16] L. Wei, Z. Chen, M. Johnson, K. Roy, and V. De. Design and optimization of low voltage high performance dual threshold CMOS circuits. In *Proc. ACM/IEEE Design Automation Conf.*, pages 489–494, 1998.