# Time-domain Analysis Methodology for Large-scale RLC Circuits and Its Applications<sup>\*</sup>

Zuying Luo<sup>®, ®</sup>, Yici Cai<sup>®</sup>, Sheldon X.-D Tan<sup>®</sup>, Xianlong Hong<sup>®</sup>, Xiaoyi Wang<sup>®</sup>, Zhu Pan<sup>®</sup>, Jingjing Fu<sup>®</sup>

(①Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, P.R.China; ②College of Information Science and Technology, Beijing Normal University, Beijing, 100875, P.R.China; ③Department of Electrical Engineering, University of California at Riverside, Riverside CA, USA)

Abstract: With rising operating frequencies and shrinking feature sizes, VLSI circuits with RLC parasitic components behave more like analog circuits and must be analyzed carefully in physical design. However, the number of extracted RLC components is typically too large to be analyzed efficiently with existing analog circuit simulators such as SPICE. To speed up simulation without loss of accuracy, this paper proposes a novel methodology to compress the time-discretized circuits that result from numerical integration approximation at every time step. The main contribution of the methodology is the efficient structure-level compression of DC circuits containing many current sources, which is an important complement to existing circuit analysis theory. The methodology consists of the following parts: 1. An approach is proposed to delete all intermediate nodes of RL branches. 2. An efficient approach is proposed to compress and back-solve parallel and serial branches; it is error-free and of linear complexity for circuits of tree topology. 3. The Y to  $\pi$  transformation is used to reduce and back-solve the intermediate nodes of ladder circuits without error and with linear complexity, so that the whole simulation method is accurate and of linear complexity for circuits of chain topology. Based on the methodology, we propose several novel algorithms for efficiently solving transient RLC-model power/ground (P/G) networks. Among them, the linear-complexity EQU-ADI algorithm is proposed to solve RLC P/G networks with mesh-tree or mesh-chain topologies. Experimental results show that the proposed method is at least two orders of magnitude faster than SPICE and scales linearly in both time and memory when solving very large P/G networks.

Keywords: RLC circuits, Analog circuit analysis, Time-domain analysis, P/G networks, Algorithm complexity.

As VLSI technology scales into the nanometer regime, IC feature sizes continue to shrink and operating frequencies continue to rise. Thus, parasitic capacitances and inductances have a significant impact on signal integrity<sup>[1-4]</sup>, so that signals in digital circuits behave more like analog ones. Recently, RLC models have been proposed to analyze digital signals accurately<sup>[1-3]</sup>. As billions of transistors are integrated into a high-end chip, traditional circuit simulators such as SPICE are inefficient for VLSI signal analysis owing to their prohibitive complexity. As a result, a number of methods have recently been proposed for efficient signal analysis, for instance the wavelet analysis method<sup>[4]</sup> and s-domain circuit reduction methods<sup>[1-3]</sup>. Among them, [1] compacts tree-structured circuits while [2-3] can compact general-topology circuits.

In physical design, power/ground (P/G) networks are a special kind of analog circuit that is very large in terms of RLC components, because all transistors must get their power supply from P/G networks<sup>[5]</sup>. Meanwhile, due to rising power consumption and clock frequency, P/G networks are more susceptible to current-induced reliability and functional failures owing to excessive IR drops, *Ldi/dt* noise, electro-migration, and resonance effects<sup>[5]</sup>. Because the signal integrity of P/G networks decides whether transistors get a supply voltage high enough to drive correct logic transitions, the study of efficient P/G network analysis algorithms is important in both theory and application. Research along this line has become an intensive research area in VLSI physical design<sup>[6-21]</sup>. Many efficient simulation techniques have been proposed for fast P/G grid analysis. These include frequency-domain analysis methods<sup>[6-7]</sup>, model reduction methods<sup>[8-10]</sup>, hierarchical methods<sup>[10-11]</sup>, the preconditioned conjugate gradient method (PCG)<sup>[12]</sup>, the alternating-direction-implicit method (ADI)<sup>[13]</sup>, multi-grid methods (MG)<sup>[14-16]</sup>, equivalent circuit methods (EQU)<sup>[7,17]</sup>, and, most recently, the random walk method (RW)<sup>[18]</sup>.

<sup>\*</sup>This work is funded by the National Science Foundation of China (NSFC) project No.60176016, the Chinese 973 project under Grant No. 2005CB321604, and the UC Senate Research Fund of America.

However, all the existing methods have drawbacks that limit their applications. For instance, PCG-based methods <sup>[12]</sup> are sensitive to the preconditioner used. MG methods <sup>[14-16]</sup> are typically efficient for mesh-structured circuits and are less memory efficient, because several grid structures must be stored for the multiple grid approximations. Model reduction methods <sup>[8-10]</sup> require the current waveforms of all current sources before the simulation, which may not be possible for P/G-device co-simulation, and they are less accurate for circuits with many (coupled) inductors due to one-point expansion. The random walk based method in [18] obtains its high efficiency from the localized waveform property of P/G circuits with center-bumped VDD/GND pads, but it cannot handle general RLC circuit analysis. No method has exploited the special structure of typical linear RLC circuits, especially VLSI on-chip P/G networks.

We have previously proposed four efficient transient analysis algorithms for RLC P/G networks that exploit the special topologies of typical VLSI P/G networks<sup>[19-22]</sup>. Papers [19-20] propose two equivalent circuit algorithms for error-free compression of RLC trees and chains, respectively. Paper [21] presents a geometric multi-grid based algorithm for P/G networks of strict mesh topology. Paper [22] combines the advantages of the EQU<sup>[19-20]</sup> and ADI<sup>[13]</sup> methods and proposes a hybrid algorithm, EQU-ADI, for P/G networks of several topologies. These papers present only specific algorithms; the whole picture of the time-domain analysis theory behind them for large-scale RLC circuits has not been given.

This paper systematically describes a novel time-domain analysis methodology for large-scale RLC circuits. First it introduces a trapezoidal approximation technique, which is highly accurate  $(O(h^2))$  and unconditionally stable, to transform transient RLC circuit analysis into quasi-static R-only circuit analysis. Then it proposes a methodology to compress the resulting DC circuits without error, which increases the efficiency of large-scale RLC circuit analysis. The methodology consists of the following three parts: 1. An approach is proposed to delete all intermediate nodes of RL branches. 2. An efficient approach is presented to compress and back-solve parallel and serial branches; it is error-free and of linear complexity for circuits of tree topology. 3. The Y to  $\pi$  transformation is used to reduce and back-solve the intermediate nodes of ladder circuits without error and with linear complexity, so that the whole simulation method is accurate and of linear complexity for circuits of chain topology. After RLC circuit discretization, the quasi-static R-only circuits contain many additional equivalent current sources. Existing circuit analysis theory lacks efficient methods to simplify such circuits further. Thus, our circuit-simplifying methodology is an important complement to existing circuit analysis theory.

A number of applications follow from our methodology <sup>[19-22]</sup>. To show its generality and efficiency, this paper presents only one typical application: the EQU-ADI algorithm in [22]. The linear-complexity EQU-ADI algorithm is proposed to solve RLC P/G networks of mesh-tree or mesh-chain topologies for ASIC chips. The algorithm first compresses tree and chain circuits with linear complexity; then it uses the improved ADI algorithm to solve the remaining mesh circuit with linear complexity; finally it back-solves the leaf nodes of tree circuits and the internal (intermediate) nodes of chain circuits with linear complexity. The main advantages of the proposed EQU-ADI algorithm are its strict linear complexity and unconditional stability.

This paper is organized as follows. Section 1 describes the trapezoidal technique for transient circuit approximation, which is highly accurate  $(O(h^2))$  and unconditionally stable. Section 2 describes the novel time-domain analysis methodology for large-scale RLC circuits. Section 3 presents its applications, including our latest research work, the EQU-ADI algorithm. Section 4 gives conclusions and future work.

# 1. Trapezoidal Approximation Technique for RLC Circuit Discretization

For RLC circuits driven by ideal voltage and current sources, let V(t) and I(t) be the node voltage vector and branch current vector, respectively. Their transient analysis can be formulated using modified nodal analysis (MNA) as follows:

$$\begin{pmatrix} C & 0 \\ 0 & L \end{pmatrix} \begin{pmatrix} \dot{V}(t) \\ \dot{I}(t) \end{pmatrix} + \begin{pmatrix} G & -A_l^T \\ A_l^T & 0 \end{pmatrix} \begin{pmatrix} V(t) \\ I(t) \end{pmatrix} = \begin{pmatrix} U(t) \\ 0 \end{pmatrix}$$
(1)

where *C*, *L* and *G* are the coefficient matrices for capacitors, inductors and conductors, while U(t) is the input vector generated by the ideal voltage and current sources. As the equation shows, the parasitic *C* and *L* greatly increase the complexity of circuit analysis. Up to now, the TLM-ADI method in [13] is the only method that directly solves EQ(1) with low complexity, but it can solve only circuits of strict mesh topology, which limits its application. To solve RLC circuits efficiently, the general idea is first to use a numerical integration method such as Backward Euler<sup>[11]</sup> or trapezoidal approximation<sup>[15]</sup> to transform RLC transient circuit analysis into R-only quasi-static DC circuit analysis. In the same way, we use the trapezoidal approximation technique for RLC circuit discretization. This paves the way for our time-domain analysis methodology, which simplifies, without error, quasi-static circuits that contain far more current sources. The trapezoidal approximation technique for RLC circuit discretization is introduced below.

Assume h is the time step. We first use the trapezoidal approximation method for capacitance branch discretization as follows.

$$I_{c,k+1} = \frac{2C}{h} V_{c,k+1} - (\frac{2C}{h} V_{c,k} + I_{c,k})$$
(2)

where  $V_{c,k}$ ,  $I_{c,k}$ ,  $V_{c,k+1}$ ,  $I_{c,k+1}$  denote the branch voltages and branch currents of the capacitor at step k and step k+1, respectively, and C is the value of the capacitor. EQ(2) can be rewritten as:

$$V_{c,k+1} - V_{c,k} = \frac{h}{2C} \left( I_{c,k+1} + I_{c,k} \right)$$
(3)

If  $V_{c,k+0.5}$  is expanded forward from step k and backward from step k+1 using Taylor expansion, we get the following equations:

$$\begin{cases} V_{c,k+0.5} = V_{c,k} + \frac{h}{2} \times \dot{V}_{c,k} + \frac{h^2}{8} \times \ddot{V}_{c,k} + O1(h^3) \\ V_{c,k+0.5} = V_{c,k+1} - \frac{h}{2} \times \dot{V}_{c,k+1} + \frac{h^2}{8} \times \ddot{V}_{c,k+1} + O2(h^3) \end{cases}$$
(4)

Taking the difference of the two equations above and using  $I_c = C\dot{V}_c$  gives

$$\begin{aligned} V_{c,k+1} - V_{c,k} &= \frac{h}{2} (\dot{V}_{c,k+1} + \dot{V}_{c,k}) - \frac{h^2}{8} (\ddot{V}_{c,k+1} - \ddot{V}_{c,k}) + O(h^3) \\ &= \frac{h}{2C} (I_{c,k+1} + I_{c,k}) - \frac{h^3}{8} \dddot{V}_{c,k+0.5} + O(h^3) \\ &= \frac{h}{2C} (I_{c,k+1} + I_{c,k}) + \overline{O}(h^3) \end{aligned}$$
(5)

From EQ(5), we can derive the following equation, which shows the  $O(h^2)$  error of EQ(2).

$$I_{c,k+1} = \frac{2C}{h} V_{c,k+1} - \left(\frac{2C}{h} V_{c,k} + I_{c,k}\right) + \overline{O}(h^2)$$

Reference [23] defines the trapezoidal method as a single-step implicit method and proves that it is unconditionally stable. For reasons of space, we do not repeat the proof here.
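The second-order accuracy claimed for EQ(2) can be checked numerically. The following is a minimal sketch, not part of the original method: it assumes a hypothetical test case (a capacitor discharging through a resistor, with illustrative values `R`, `C`, `v0`), steps it with EQ(2), and compares the result with the analytic solution. Halving the time step should reduce the error by roughly a factor of four.

```python
import math

def rc_discharge_trapezoidal(R, C, v0, h, steps):
    """Discharge of a capacitor through a resistor, stepped with Eq. (2)."""
    g_c = 2.0 * C / h           # companion conductance 2C/h of the capacitor
    v, i_c = v0, -v0 / R        # capacitor voltage and current at t = 0
    for _ in range(steps):
        hist = g_c * v + i_c    # history term (2C/h) V_k + I_k of Eq. (2)
        # KCL at the node: I_{c,k+1} = -V_{k+1} / R, with I_{c,k+1} from Eq. (2)
        v = hist / (g_c + 1.0 / R)
        i_c = g_c * v - hist    # Eq. (2) evaluated at the new voltage
    return v

def error_at(h, R=1.0, C=1.0, v0=1.0, t_end=1.0):
    """Absolute error against the analytic solution v0 * exp(-t / RC)."""
    steps = round(t_end / h)
    exact = v0 * math.exp(-t_end / (R * C))
    return abs(rc_discharge_trapezoidal(R, C, v0, h, steps) - exact)

# Error ratio between step h and step h/2; close to 4 for a second-order method.
ratio = error_at(0.1) / error_at(0.05)
print(ratio)
```

The test circuit and its parameter values are assumptions chosen only to make the convergence order visible; any smooth linear circuit would serve the same purpose.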

In the same way, we apply the trapezoidal approximation to inductance branch discretization, which is also  $O(h^2)$  accurate and unconditionally stable.

$$I_{L,k+1} = \frac{h}{2L} V_{L,k+1} + \left(\frac{h}{2L} V_{L,k} + I_{L,k}\right)$$
(6)

where  $V_{L,k}$ ,  $I_{L,k}$ ,  $V_{L,k+1}$ ,  $I_{L,k+1}$  denote, respectively, the branch voltages and branch currents of the inductor at step k and step k+1 while L is the value of the inductor.





With EQ(2) and EQ(6), we apply the trapezoidal approximation to capacitance and inductance branches. As shown in Fig.1, RLC circuits are thereby transformed into R-only DC circuits with many equivalent current sources for quasi-static circuit analysis. The MNA equation for the discretized circuit is as follows.

$$A \times V = B$$
(7)

where  $A$ ,  $V$  and  $B$  are the conductance coefficient matrix, the nodal voltage vector and the current stimuli vector, respectively. The PCG algorithm can be used directly to solve EQ(7) for quasi-static circuits. As shown in Fig.1, because each capacitance or inductance branch generates one additional equivalent current source after discretization, quasi-static circuits include a large number of current sources. Existing circuit analysis theory lacks methods to simplify circuits with so many current sources. Thus, the methodology in this paper for error-free compression of such circuits is an important complement to existing circuit analysis theory.
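To make EQ(7) concrete, the sketch below assembles and solves  $A \times V = B$  at every time step for a small hypothetical circuit: a supply `vdd` behind resistor `r0` feeding node 0, resistor `r1` from node 0 to node 1, a capacitor `cap` at each node, and a constant load current `i_load` at node 1. All names and values are assumptions for illustration. A small dense Gaussian elimination stands in for the PCG solver mentioned above, which is only reasonable at this toy size.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (stand-in for PCG)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def step(v, i_c, h, vdd=1.0, r0=0.1, r1=0.2, cap=1e-3, i_load=0.5):
    """One quasi-static solve of Eq. (7) for the two-node example circuit."""
    g0, g1, gc = 1.0 / r0, 1.0 / r1, 2.0 * cap / h
    # Conductance matrix A: resistors plus capacitor companion conductances.
    A = [[g0 + g1 + gc, -g1],
         [-g1, g1 + gc]]
    # History terms (2C/h) V_k + I_k of Eq. (2) act as equivalent current sources.
    hist = [gc * v[n] + i_c[n] for n in range(2)]
    B = [vdd * g0 + hist[0], hist[1] - i_load]
    v_new = solve_dense(A, B)
    i_new = [gc * v_new[n] - hist[n] for n in range(2)]  # Eq. (2)
    return v_new, i_new

v, i_c = [0.0, 0.0], [0.0, 0.0]
for _ in range(2000):
    v, i_c = step(v, i_c, h=1e-5)
print(v)  # approaches the DC operating point of the resistive network
```

After the transient has died out, the node voltages settle at the DC values `vdd - i_load * r0` and `vdd - i_load * (r0 + r1)`, which gives a simple sanity check on the stamping.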

# 2. A Novel Time-Domain Analysis Methodology for RLC Circuits

The methodology consists of the following three parts: 1. A method is proposed to delete all intermediate nodes of RL branches according to Norton's theorem. 2. According to Norton's theorem, methods are proposed to compress and back-solve parallel and serial branches; they are error-free and of linear complexity for circuits of tree topology. 3. Based on KCL, a Y to  $\pi$  transformation is used to eliminate and back-solve the intermediate nodes of Y circuits without error and with linear complexity. As a result, the method is error-free and of linear complexity for circuits of chain topology. Applying the methodology to the quasi-static circuits described above is therefore both error-free and of linear complexity. We first describe how to simplify simple circuit branches in the following subsection.

### 2.1 Simplification Method for Simple Circuit Branches

For the RL branch shown in fig.2(a), there is still an internal node between the two resistors after RLC circuit discretization. To delete this internal node, we use Norton's theorem to merge the two resistors, one being the static R and the other the equivalent resistor of L, together with the equivalent current source. This operation can be formulated as follows.

$$\begin{cases} R^* = \frac{2L}{h} + R\\ el_{k+1} = \left(\frac{h}{2L}V_{L,k} + I_{L,k}\right) \times \frac{2L/h}{2L/h + R} \end{cases}$$
(8)

where  $I_{L,k}$  is the RL branch current at the  $k^{th}$  step and  $V_{L,k}$  is the inductance voltage at the  $k^{th}$  step, obtained from  $L \, dI_{L,k} / dt$ .
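As an illustration of EQ(8), the sketch below steps a series RL branch driven by a constant voltage and compares the branch current with the analytic step response. It assumes that the merged branch current is  $V_{branch} / R^* + el_{k+1}$ , where  $V_{branch}$  is the voltage across the whole RL branch; this follows from combining EQ(6) with the series resistor. The function name and the component values are hypothetical.

```python
import math

def rl_branch_step(v_branch, v_l, i_l, R, L, h):
    """One trapezoidal step of a series RL branch using Eq. (8)."""
    r_l = 2.0 * L / h                        # equivalent resistor 2L/h of the inductor
    r_star = r_l + R                         # merged resistor R*
    el = (v_l / r_l + i_l) * r_l / r_star    # merged equivalent current el_{k+1}
    i_new = v_branch / r_star + el           # branch current at step k+1
    v_l_new = v_branch - R * i_new           # inductor voltage at step k+1
    return v_l_new, i_new

# Hypothetical example: 1 V step applied to R = 2 ohm in series with L = 1 mH.
V0, R, L, h, steps = 1.0, 2.0, 1e-3, 1e-6, 1000
v_l, i_l = V0, 0.0                           # state just after the step is applied
for _ in range(steps):
    v_l, i_l = rl_branch_step(V0, v_l, i_l, R, L, h)

i_exact = V0 / R * (1.0 - math.exp(-R * steps * h / L))
print(i_l, i_exact)
```

Because the internal node between R and L never appears in the update, the branch is handled as a single two-terminal Norton element, which is the point of the merge.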



Figure.2 Simple Branch Incorporation according to Norton Theory: (a) Merging Resistors; (b) Merging Currents

As shown in fig.2(b), a node of the discretized circuit usually carries two currents: the current drawn by a circuit device and the equivalent current of the capacitance linked to the node. The following equation merges the two currents.

$$\begin{cases} ec_{k+1} = e_{k+1} - \left(\frac{2C}{h}V_k + I_k\right) \\ r = \frac{h}{2C} \end{cases}$$
(9)

### 2.2 Simplification Methods for Serial Branches and Parallel Branches

Layouts based on BBL floor-planning always contain a number of RLC trees in their P/G networks <sup>[1,7,19]</sup>. As shown in fig.3(a), after discretization, P/G circuits include R-only trees with many current branches. To delete the leaf nodes of the trees, we alternately merge serial branches and parallel branches as shown in fig.3. Fig.3(a) and 3(c) merge serial branches while fig.3(b) merges parallel branches.



Figure.3 Flowchart of Tree Simplification

We take the left subtree of the *middle* node in fig.3(a) as an example to explain how to merge two serial branches without error. In the left subtree, the leaf node *Leaf-l* has its load resistor  $r_{snl}$  and load current  $ec \Big|_{snl}^{k+1}$ , while a resistor  $R_{snl}^*$  and a current  $el \Big|_{snl}^{k+1}$  link the leaf node to its parent node *middle*. The superscript in  $\Big|_{snl}^{k+1}$  denotes the  $(k+1)^{th}$  step, and it keeps this meaning in the remainder of this paper. According to Norton's theorem, the following equation merges the two serial branches and yields the equivalent resistor  $r'_{snl}$  and the equivalent current  $ec' \Big|_{snl}^{k+1}$ .

$$\begin{cases} r'_{snl} = r_{snl} + R^*_{snl} \\ ec'\Big|_{snl}^{k+1} = \dfrac{r_{snl}}{r_{snl} + R^*_{snl}}\, ec\Big|_{snl}^{k+1} - \dfrac{R^*_{snl}}{r_{snl} + R^*_{snl}}\, el\Big|_{snl}^{k+1} \end{cases}$$
(10)

As shown in fig.3(b), the following equation merges the three parallel resistors and currents into the equivalent resistor  $r'_{mdl}$  and the equivalent current  $ec'\Big|_{mdl}^{k+1}$  according to Norton's theorem.

$$\begin{cases} r'_{mdl} = \dfrac{r_{mdl} \times r'_{snl} \times r'_{snr}}{r_{mdl} \times r'_{snl} + r_{mdl} \times r'_{snr} + r'_{snl} \times r'_{snr}} \\ ec'\Big|_{mdl}^{k+1} = ec\Big|_{mdl}^{k+1} + ec'\Big|_{snl}^{k+1} + ec'\Big|_{snr}^{k+1} \end{cases}$$
(11)

Because EQ(10) and EQ(11) are formulated strictly according to Norton theory, the merging operations of serial branches and parallel branches are error-free.
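The two merging rules can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: a load `(r, ec)` is taken to draw the current `V / r + ec` from its node, and the branch source `el` is taken to be oriented from the child toward the parent, which is the orientation under which EQ(10) holds. The helper `leaf_current_direct` and all numeric values are hypothetical and exist only to cross-check the merge against a direct solution of the two-node circuit.

```python
def merge_serial(r, ec, r_star, el):
    """Eq. (10): fold a leaf load (r, ec) through its branch (R*, el)."""
    total = r + r_star
    return total, (r * ec - r_star * el) / total

def merge_parallel(loads):
    """Eq. (11): combine Norton loads (r, ec) attached to the same node."""
    g = sum(1.0 / r for r, _ in loads)
    return 1.0 / g, sum(ec for _, ec in loads)

def leaf_current_direct(v_parent, r, ec, r_star, el):
    """Current entering the branch, found by solving the leaf node directly."""
    # KCL at the leaf: (v_parent - v_leaf) / R* - el = v_leaf / r + ec
    v_leaf = (v_parent / r_star - el - ec) / (1.0 / r + 1.0 / r_star)
    return v_leaf / r + ec

# Serial merge of a hypothetical leaf, checked against the direct solution.
r_eq, ec_eq = merge_serial(r=4.0, ec=0.5, r_star=1.0, el=0.2)
v_parent = 1.0
merged = v_parent / r_eq + ec_eq
direct = leaf_current_direct(v_parent, 4.0, 0.5, 1.0, 0.2)
print(merged, direct)

# Parallel merge of the middle node's own load with two merged subtrees.
r_mdl, ec_mdl = merge_parallel([(10.0, 0.1), (r_eq, ec_eq), (2.5, 0.26)])
print(r_mdl, ec_mdl)
```

Each merge is a constant number of arithmetic operations per branch, which is where the linear complexity of the tree compression comes from.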

Unlike the simplification of simple branches in the previous subsection, once the base nodes of the trees are known, a back-solving operation is required to compute the voltages of the leaf nodes. We take fig.3(c) and 3(d) as an example to explain how to back-solve the internal node voltage of serial branches. Let the known voltage of the base node *root* in fig.3(d) be  $V \Big|_{rt}^{k+1}$ . The following equation computes  $e \Big|_{rt}^{k+1}$ , the current flowing from the base node into the tree.

$$e\Big|_{rt}^{k+1} = V\Big|_{rt}^{k+1} \div r'_{rt} + ec'\Big|_{rt}^{k+1}$$
(12)

Thus, we can further formulate the following equation to compute  $V\Big|_{mdl}^{k+1}$ , the *middle* node voltage, and  $e\Big|_{mdl}^{k+1}$ , the current flowing from the node.

$$\begin{cases} V\Big|_{mdl}^{k+1} = V\Big|_{rt}^{k+1} - \left( e\Big|_{rt}^{k+1} + el\Big|_{mdl}^{k+1} \right) \times R^*_{mdl} \\ e\Big|_{mdl}^{k+1} = e\Big|_{rt}^{k+1} + el\Big|_{mdl}^{k+1} \end{cases}$$
(13)

We further take fig.3(b) as an example to explain how to distribute currents among parallel branches. The following equation computes  $e\Big|_{snl}^{k+1}$  and  $e\Big|_{snr}^{k+1}$ , the currents of the left branch and the right branch.

$$\begin{cases} e\Big|_{snl}^{k+1} = ec'\Big|_{snl}^{k+1} + \left( e\Big|_{mdl}^{k+1} - ec\Big|_{mdl}^{k+1} - ec'\Big|_{snl}^{k+1} - ec'\Big|_{snr}^{k+1} \right) \times r'_{mdl} \div r'_{snl} \\ e\Big|_{snr}^{k+1} = ec'\Big|_{snr}^{k+1} + \left( e\Big|_{mdl}^{k+1} - ec\Big|_{mdl}^{k+1} - ec'\Big|_{snl}^{k+1} - ec'\Big|_{snr}^{k+1} \right) \times r'_{mdl} \div r'_{snr} \end{cases}$$
(14)

Because EQ(12)-EQ(14) are formulated strictly according to circuit laws, the back-solving operations in this paper are error-free.

From EQ(10)-EQ(14), we conclude that the simplification and back-solving operations for serial and parallel branches are error-free and of linear complexity. Thus, this subsection gives an error-free, linear-complexity algorithm for compressing and back-solving tree-topology circuits.
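The compression and back-solving passes can be put together for a small tree shaped like fig.3. This is a minimal sketch with hypothetical element values, not the authors' code. It compresses from the leaves to the root with EQ(10) and EQ(11), then back-solves from the root to the leaves by applying EQ(12) to each branch and the first line of EQ(13) for the child voltage. As before, it assumes that a load `(r, ec)` draws `V / r + ec` and that each branch source `el` points from child to parent.

```python
def merge_serial(r, ec, r_star, el):
    """Eq. (10): Norton equivalent of a load seen through its branch."""
    total = r + r_star
    return total, (r * ec - r_star * el) / total

def merge_parallel(loads):
    """Eq. (11): Norton equivalent of several loads on one node."""
    g = sum(1.0 / r for r, _ in loads)
    return 1.0 / g, sum(ec for _, ec in loads)

def child_voltage(v_parent, r_eq, ec_eq, r_star, el):
    """Back-solve the child node at the far end of a branch."""
    e = v_parent / r_eq + ec_eq            # Eq. (12): current entering the branch
    return v_parent - (e + el) * r_star    # first line of Eq. (13)

# Hypothetical tree: root - middle - {left leaf, right leaf}.
leaf_l = dict(r=4.0, ec=0.5, r_star=1.0, el=0.2)
leaf_r = dict(r=2.0, ec=0.3, r_star=0.5, el=-0.1)
mid = dict(r=10.0, ec=0.1, r_star=0.25, el=0.05)

# Compression pass, leaves to root.
eq_l = merge_serial(**leaf_l)
eq_r = merge_serial(**leaf_r)
r_mid, ec_mid = merge_parallel([(mid["r"], mid["ec"]), eq_l, eq_r])
eq_root = merge_serial(r_mid, ec_mid, mid["r_star"], mid["el"])

# Back-solving pass, root to leaves, once the root voltage is known.
v_root = 1.0
v_mid = child_voltage(v_root, *eq_root, mid["r_star"], mid["el"])
v_l = child_voltage(v_mid, *eq_l, leaf_l["r_star"], leaf_l["el"])
v_r = child_voltage(v_mid, *eq_r, leaf_r["r_star"], leaf_r["el"])
print(v_mid, v_l, v_r)
```

Both passes visit every branch exactly once, so the cost grows linearly with the number of tree nodes. The recovered voltages can be validated by checking KCL at each node of the original, uncompressed tree.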

### 2.3 Error-free Circuit Transformation Model from Y to $\pi$

Cell-based layouts generate a large number of RLC chains <sup>[17,20]</sup>. After discretization, there are a large number of additional equivalent currents in the quasi-static circuits. As shown in fig.4(a), Y-based circuit cells consisting of resistors and currents are connected end to end to form a chain. To delete the middle node of a Y-based cell, we transform the Y-based cell into a  $\pi$ -based cell. If we perform *n* rounds of transformation starting from the left terminal of the chain to delete all *n* internal nodes, the reduced chain consists only of its two terminal nodes, as shown in fig.4(b).



Figure.4 Transformation from Y-based cell to π-based cell: (a) Y-based cell; (b) π-based cell

In fig.4(a),  $E_a$  and  $E_b$  are the currents flowing from the Y-based cell into nodes *a* and *b*, respectively. Current directions are marked with arrows. The following equation is formulated for  $V_{ba}$ , the voltage difference between nodes *b* and *a*.

$$\begin{aligned} V_{ba} = V_b - V_a &= R_a (E_a - I_a) + R_b \left[ E_a + I_c + \frac{V_a + R_a (E_a - I_a)}{R_c} + I_b \right] \\ &= \left( R_a + R_b + \frac{R_a R_b}{R_c} \right) E_a + R_b I_c + R_b I_b + \frac{R_b}{R_c} V_a - R_a I_a - \frac{R_a R_b}{R_c} I_a \end{aligned}$$
(15)

According to the standard Y- $\pi$  transformation, the three equivalent resistors are given by the following equations.

$$\begin{cases} R_{ab} = (R_a R_b + R_a R_c + R_b R_c) \div R_c \\ R_{ac} = (R_a R_b + R_a R_c + R_b R_c) \div R_b \\ R_{bc} = (R_a R_b + R_a R_c + R_b R_c) \div R_a \end{cases}$$
(16)

Substituting EQ(16) into EQ(15) and solving for the terminal currents gives  $E_a$  and, by symmetry,  $E_b$ .

$$\begin{cases} E_a = \frac{V_b - V_a}{R_{ab}} - \frac{V_a}{R_{ac}} + \left(\frac{R_a}{R_{ab}}I_a - \frac{R_b}{R_{ab}}I_b\right) - \left(\frac{R_c}{R_{ac}}I_c - \frac{R_a}{R_{ac}}I_a\right) \\ E_b = \frac{V_a - V_b}{R_{ab}} - \frac{V_b}{R_{bc}} - \left(\frac{R_a}{R_{ab}}I_a - \frac{R_b}{R_{ab}}I_b\right) - \left(\frac{R_c}{R_{bc}}I_c - \frac{R_b}{R_{bc}}I_b\right) \end{cases}$$
(17)

We then define three equivalent currents as follows.

$$\begin{cases} I_{ab} = \frac{R_a}{R_{ab}} I_a - \frac{R_b}{R_{ab}} I_b \\ I_{ac} = \frac{R_c}{R_{ac}} I_c - \frac{R_a}{R_{ac}} I_a \\ I_{bc} = \frac{R_c}{R_{bc}} I_c - \frac{R_b}{R_{bc}} I_b \end{cases}$$
(18)

Thus,  $E_a$  and  $E_b$  can be written in the following form, which exactly matches the circuit in fig.4(b).

$$\begin{cases} E_{a} = \frac{V_{b} - V_{a}}{R_{ab}} - \frac{V_{a}}{R_{ac}} + I_{ab} - I_{ac} \\ E_{b} = \frac{V_{a} - V_{b}}{R_{ab}} - \frac{V_{b}}{R_{bc}} - I_{ab} - I_{bc} \end{cases}$$
(19)

The above equations transform a Y-based cell into a  $\pi$ -based cell. As a result, the internal node *c* of the Y-based cell is deleted, which simplifies the circuit. For a chain of *n* internal nodes, only *n* rounds of Y- $\pi$  transformation are needed to compress the chain into the simplest circuit shown in fig.4(b). Because the above equations are derived strictly from circuit theory, the compression of chains is error-free.

We now describe how to back-solve the unknown internal nodes of the chain once the two terminal nodes are known. With the known  $V_a$  and  $V_b$ , we can compute  $E_a$  according to EQ(19). Then, the following equations compute  $V_c$ , the voltage of node c, and  $E_c$ , the current flowing in from its right branch.

$$\begin{cases} V_c = V_a + (E_a - I_a)R_a\\ E_c = E_a + V_c \div R_c + I_c \end{cases}$$
(20)

If the chain has more than one internal node, we can apply EQ(20) repeatedly to back-solve all unknown neighboring nodes to the right. Because EQ(20) is derived from circuit laws, the back-solving process is error-free.

This subsection therefore gives a novel method to compress and back-solve chain circuits without error. EQ(16) and EQ(18)-EQ(20) guarantee that the compression and back-solving of chain circuits are of linear complexity. In summary, we propose an error-free, linear-complexity method to compress and back-solve chain circuits that include many current sources.
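The chain procedure can be sketched as follows. This is an illustrative reading of EQ(16)-EQ(20) under stated assumptions, not the authors' code. The sign conventions are inferred from EQ(15): the series source of the leftmost arm points toward terminal *a*, the series sources of all other arms point toward terminal *b*, and a shunt `(Rc, Ic)` draws `V / Rc + Ic` from its node. After each transformation, the  $\pi$  link to *a* becomes the left arm of the next Y cell and the shunt  $(R_{bc}, I_{bc})$  is merged into the next internal node. The back-solve here walks the stored stages from the last to the first and uses the first line of EQ(20) at each stage. All function names and values are hypothetical.

```python
def y_to_pi(Ra, Rb, Rc, Ia, Ib, Ic):
    """Eqs. (16) and (18): resistors and equivalent currents of the pi cell."""
    s = Ra * Rb + Ra * Rc + Rb * Rc
    Rab, Rac, Rbc = s / Rc, s / Rb, s / Ra
    Iab = (Ra * Ia - Rb * Ib) / Rab
    Iac = (Rc * Ic - Ra * Ia) / Rac
    Ibc = (Rc * Ic - Rb * Ib) / Rbc
    return Rab, Rac, Rbc, Iab, Iac, Ibc

def current_into_a(Va, Vb, pi):
    """First line of Eq. (19): current from the cell into terminal a."""
    Rab, Rac, _, Iab, Iac, _ = pi
    return (Vb - Va) / Rab - Va / Rac + Iab - Iac

def compress_chain(arms, shunts):
    """Eliminate internal nodes left to right; one stage per internal node."""
    stages = []
    Ra, Ia = arms[0]
    extra_g, extra_i = 0.0, 0.0              # pi shunt pushed onto the next node
    for k, (Rc, Ic) in enumerate(shunts):
        Rc_eff = 1.0 / (1.0 / Rc + extra_g)  # node shunt in parallel with Rbc
        Ic_eff = Ic + extra_i
        Rb, Ib = arms[k + 1]
        pi = y_to_pi(Ra, Rb, Rc_eff, Ia, Ib, Ic_eff)
        stages.append((Ra, Ia, pi))
        Ra, Ia = pi[0], pi[3]                # link (Rab, Iab) is the next left arm
        extra_g, extra_i = 1.0 / pi[2], pi[5]
    return stages

def back_solve_chain(stages, Va, Vb):
    """Recover internal node voltages, last eliminated node first."""
    volts, v_right = [], Vb
    for Ra, Ia, pi in reversed(stages):
        Ea = current_into_a(Va, v_right, pi)
        v_right = Va + (Ea - Ia) * Ra        # first line of Eq. (20)
        volts.append(v_right)
    return volts[::-1]

# Hypothetical chain with two internal nodes between terminals a and b.
arms = [(1.0, 0.1), (2.0, 0.2), (1.5, -0.05)]
shunts = [(4.0, 0.3), (5.0, 0.25)]
Va, Vb = 1.0, 1.2
stages = compress_chain(arms, shunts)
v_int = back_solve_chain(stages, Va, Vb)
print(v_int)
```

Each internal node costs one `y_to_pi` call during compression and one evaluation of EQ(19) and EQ(20) during back-solving, so both passes are linear in the chain length.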

# 3. Theoretic Applications in P/G Network Analysis

Because our research focuses on P/G network design and verification, we mainly apply our methodology to speed up P/G grid analysis, although it can also be applied to analog signal analysis and clock network verification. Up to now, we have proposed the following four efficient algorithms for transient P/G network analysis<sup>[19-22]</sup>.

- 1. Based on the methodology in subsections 2.1-2.2, we propose an equivalent circuit algorithm for efficiently analyzing transient P/G networks of mesh-tree topology in [19]. Experiments demonstrate that our algorithm is one order of magnitude faster than the algorithm in [7]. Our algorithm first uses the methodology in subsections 2.1-2.2 to compress, without error, all trees planted on the mesh, then uses the PCG algorithm <sup>[12]</sup> to analyze the reduced mesh circuit, and finally uses the method in subsection 2.2 to back-solve the unknown leaf nodes of the tree circuits.
- 2. Based on the methodology in subsections 2.1 and 2.3, we propose an equivalent circuit algorithm for efficiently analyzing transient P/G networks of mesh-chain topology in [20]. Experiments demonstrate that our algorithm is two orders of magnitude faster than SPICE. Our algorithm first uses the methodology in subsections 2.1 and 2.3 to compress all chains without error, then uses the PCG algorithm<sup>[12]</sup> to analyze the reduced mesh circuit, and finally uses the methodology in subsection 2.3 to back-solve the unknown internal nodes of the chain circuits.
- 3. Based on the methodology in subsections 2.1 and 2.3, we propose a geometric multi-grid based algorithm for efficiently analyzing transient P/G networks of strict mesh topology in [21]. Experiments demonstrate that our algorithm is two orders of magnitude faster than SPICE. Our algorithm combines the multi-grid algorithm with our methodology in subsections 2.1 and 2.3 to construct multi-level coarse grids that speed up mesh circuit analysis.
- 4. We propose the linear-complexity EQU-ADI algorithm to solve RLC P/G networks of mesh-tree or mesh-chain topologies in [22]. The algorithm is two orders of magnitude faster than HSPICE. It first compresses tree and chain circuits based on the methodology in subsections 2.1-2.3. Then it uses the improved ADI algorithm to solve the remaining mesh circuit. Finally it back-solves the leaf nodes of tree circuits and the internal (intermediate) nodes of chain circuits according to our methodology in subsections 2.2 and 2.3. The main advantages of the EQU-ADI algorithm are high accuracy, strict linear complexity, and unconditional stability.

By combining our methodology with the hierarchical method, the random walk method and the ADI method, we are also studying more efficient algorithms for transient P/G network analysis and have made visible progress. To show the generality and efficiency of our methodology, we present only one typical application, the EQU-ADI algorithm, in the following parts.

### 3.1 EQU-ADI Algorithm

Because ASIC chips have a shorter time to market, their P/G networks differ from the double-mesh P/G networks of CPUs: they use a hybrid topology whose coarse upper-level grid is a mesh for reliability and whose fine lower-level grid is a tree or chain for design flexibility.

For P/G networks of this hybrid topology, the EQU-ADI algorithm is of linear complexity. The flow of the EQU-ADI algorithm is as follows.

- 1. Based on the trapezoidal approximation technique in Section 1 and the methodology in subsection 2.1, EQU-ADI refreshes all circuit parameters.
- 2. Based on the methodology in subsections 2.2-2.3, it compresses tree and chain circuits without error and transforms the original circuit of hybrid topology into a reduced circuit of strict mesh topology with linear complexity. The reduced mesh circuit is shown in fig.5.
- 3. It uses the improved ADI method to solve the quasi-static mesh circuit with linear complexity. The improved ADI method has the advantage of unconditional stability and the disadvantage of low O(h) accuracy.
- 4. Based on the methodology in subsections 2.2-2.3, it back-solves, without error, the unknown leaf nodes of tree circuits and the unknown internal nodes of chain circuits with linear complexity.

As the flow above shows, the EQU-ADI algorithm is of strict linear complexity, which makes it an effective tool for analyzing huge P/G networks of very large-scale ASIC chips. Although it is unconditionally stable, it needs a short time step *h* owing to its low O(h) accuracy. In this paper, we set *h* = *T*/500 for accurate results.

### 3.2 Improved ADI Method

After the compression of tree circuits and chain circuits, the original P/G circuit of hybrid topology has been reduced to a strict mesh topology, as shown in fig.5. The ADI method was originally proposed to solve the PDEs formulated from RLC-mesh P/G grids. Since, after the reduction described above, we work directly on the resistor-only quasi-static mesh circuit at each time step, the TLM-ADI method in [13] cannot be used directly. In the following, we propose a novel improved ADI method for quasi-static mesh circuit analysis.



Figure.5 The reduced P/G circuit of strict mesh topology

In fig.5,  $r_{i,j}, R_{i,j,H}^*, R_{i-1,j,H}^*, el_{i,j,V}^{k+1}, ec_{i,j}^{k+1}$  are the known circuit parameters after circuit compression. To solve the quasi-static mesh circuit with the improved ADI method, the horizontal implicit operation at the  $(k+0.5)^{th}$  step uses the  $(k+1)^{th}$  currents for horizontal branches and the  $k^{th}$  currents for vertical branches to formulate the following equation according to KCL.

$$V\Big|_{i,j}^{k+0.5} \div r_{i,j} + ec\Big|_{i,j}^{k+0.5} = \delta I_H\Big|_{i,j}^{k+1} + \delta I_V\Big|_{i,j}^k$$
(21)

where $V_{i,j}^{k+0.5}$ and $ec_{i,j}^{k+0.5}$ are the voltage of node (i,j) and the equivalent current connected to the node at the $k+0.5^{th}$ step. Meanwhile, $\delta I_{H}|_{i,j}^{k+1}$ and $\delta I_{V}|_{i,j}^{k}$ are the current differences between two branches in the horizontal direction at the $k+1^{th}$ step and between two branches in the vertical direction at the $k^{th}$ step. As the values at the $k^{th}$ step are known, we only need to compute the values at the $k+0.5^{th}$ step and the $k+1^{th}$ step as follows.

$$\begin{aligned} V\Big|_{i,j}^{k+0.5} \div r_{i,j} + ec\Big|_{i,j}^{k+0.5} &= 0.5V\Big|_{i,j}^{k+1} \div r_{i,j} + 0.5V\Big|_{i,j}^{k} \div r_{i,j} + 0.5\left(ec\Big|_{i,j}^{k} + ec\Big|_{i,j}^{k+1}\right) \\ &= 0.5g_{i,j}V\Big|_{i,j}^{k+1} + 0.5\left(g_{i,j}V\Big|_{i,j}^{k} + ec\Big|_{i,j}^{k} + ec\Big|_{i,j}^{k+1}\right) \end{aligned}$$
(22)

$$\begin{aligned} \delta I_{H}\Big|_{i,j}^{k+1} &= I_{H}\Big|_{i,j}^{k+1} - I_{H}\Big|_{i-1,j}^{k+1} \\ &= \left[\left(V\Big|_{i+1,j}^{k+1} - V\Big|_{i,j}^{k+1}\right) \div R_{i,j,H}^{*} + el\Big|_{i,j,H}^{k+1}\right] - \left[\left(V\Big|_{i,j}^{k+1} - V\Big|_{i-1,j}^{k+1}\right) \div R_{i-1,j,H}^{*} + el\Big|_{i-1,j,H}^{k+1}\right] \\ &= G_{i,j,H}^{*}V\Big|_{i+1,j}^{k+1} - \left(G_{i,j,H}^{*} + G_{i-1,j,H}^{*}\right)V\Big|_{i,j}^{k+1} + G_{i-1,j,H}^{*}V\Big|_{i-1,j}^{k+1} + \left(el\Big|_{i,j,H}^{k+1} - el\Big|_{i-1,j,H}^{k+1}\right) \end{aligned}$$
(23)

where $g_{i,j}, G_{i,j,H}^*, G_{i-1,j,H}^*$ are the conductances corresponding to the resistors $r_{i,j}, R_{i,j,H}^*, R_{i-1,j,H}^*$ shown in Fig.5. Substituting EQ(22) and EQ(23) into EQ(21) gives the following equation.

$$\begin{aligned} 0.5g_{i,j}V\Big|_{i,j}^{k+1} + 0.5\left(g_{i,j}V\Big|_{i,j}^{k} + ec\Big|_{i,j}^{k} + ec\Big|_{i,j}^{k+1}\right) &= \delta I_{V}\Big|_{i,j}^{k} + G_{i,j,H}^{*}V\Big|_{i+1,j}^{k+1} - \left(G_{i,j,H}^{*} + G_{i-1,j,H}^{*}\right)V\Big|_{i,j}^{k+1} \\ &\quad + G_{i-1,j,H}^{*}V\Big|_{i-1,j}^{k+1} + \left(el\Big|_{i,j,H}^{k+1} - el\Big|_{i-1,j,H}^{k+1}\right) \end{aligned}$$
(24)

Then, EQ(24) can be further simplified into the following equation.

$$\begin{aligned} &-G_{i-1,j,H}^{*}V\Big|_{i-1,j}^{k+1} + \left(G_{i,j,H}^{*} + G_{i-1,j,H}^{*} + 0.5g_{i,j}\right)V\Big|_{i,j}^{k+1} - G_{i,j,H}^{*}V\Big|_{i+1,j}^{k+1} \\ &\quad = \delta I_{V}\Big|_{i,j}^{k} + \left(el\Big|_{i,j,H}^{k+1} - el\Big|_{i-1,j,H}^{k+1}\right) - 0.5\left(g_{i,j}V\Big|_{i,j}^{k} + ec\Big|_{i,j}^{k} + ec\Big|_{i,j}^{k+1}\right) \end{aligned}$$
(25)

The right-hand side of EQ(25) is a known current, while the three voltages on the left-hand side are unknown. With EQ(25), the nodes of each row form a tri-diagonal, diagonally dominant system that can be solved with linear complexity.
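The row solve described above can be sketched as follows. This is a minimal illustration rather than the authors' C++ implementation: it uses the standard Thomas algorithm for tri-diagonal systems, and the names `solve_tridiagonal` and `row_system`, the list-based data layout, and the treatment of the row ends (no branch beyond the first and last node) are assumptions made for the example.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tri-diagonal system with the Thomas algorithm in O(n).

    lower[i] multiplies x[i-1] (lower[0] is ignored), diag[i] multiplies x[i],
    and upper[i] multiplies x[i+1] (upper[-1] is ignored).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = d[:]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def row_system(G_H, g):
    """Assemble the left-hand coefficients of EQ(25) for one row of n nodes.

    G_H[i] is the conductance of the horizontal branch between nodes i and
    i+1 (length n-1); g[i] is the conductance of r at node i.
    Assumption: there is no branch beyond either end of the row.
    """
    n = len(g)
    left = [0.0] + list(G_H)    # G*_{i-1,j,H}
    right = list(G_H) + [0.0]   # G*_{i,j,H}
    lower = [-left[i] for i in range(n)]
    upper = [-right[i] for i in range(n)]
    diag = [left[i] + right[i] + 0.5 * g[i] for i in range(n)]
    return lower, diag, upper


# Illustrative values only; rhs stands for the known right-hand side of EQ(25).
G_H = [2.0, 1.0, 4.0]
g = [0.5, 0.2, 0.2, 0.5]
rhs = [1.0, 0.0, -0.5, 2.0]
lower, diag, upper = row_system(G_H, g)
V_row = solve_tridiagonal(lower, diag, upper, rhs)
```

Because each diagonal entry exceeds the sum of the magnitudes of its two off-diagonal entries by $0.5g_{i,j}$, the elimination needs no pivoting, and the cost per row grows linearly with the number of nodes.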

Then, at the $k+1.5^{th}$ step, we switch the implicit direction from horizontal to vertical. In order to use the improved ADI method to solve the quasi-static mesh circuit, the vertical implicit operation at the $k+1.5^{th}$ step uses the $k+2^{th}$ currents for vertical branches and the $k+1^{th}$ currents for horizontal branches to formulate the following equation according to KCL.

$$V\Big|_{i,j}^{k+1.5} \div r_{i,j} + ec\Big|_{i,j}^{k+1.5} = \delta I_H\Big|_{i,j}^{k+1} + \delta I_V\Big|_{i,j}^{k+2}$$
(26)

Similarly, EQ(26) can be transformed into the following equation. The nodes of each column also form a tri-diagonal system, so they can be solved with linear complexity.

$$\begin{aligned} &-G_{i,j-1,V}^{*}V\Big|_{i,j-1}^{k+2} + \left(G_{i,j,V}^{*} + G_{i,j-1,V}^{*} + 0.5g_{i,j}\right)V\Big|_{i,j}^{k+2} - G_{i,j,V}^{*}V\Big|_{i,j+1}^{k+2} \\ &\quad = \delta I_{H}\Big|_{i,j}^{k+1} + \left(el\Big|_{i,j,V}^{k+2} - el\Big|_{i,j-1,V}^{k+2}\right) - 0.5\left(g_{i,j}V\Big|_{i,j}^{k+1} + ec\Big|_{i,j}^{k+1} + ec\Big|_{i,j}^{k+2}\right) \end{aligned}$$
(27)
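The intermediate steps behind EQ(27) are not written out in the text. A sketch of them, assuming that the vertical branch current is defined in the same way as the horizontal one in EQ(23) and that the left side of EQ(26) is averaged between the $k+1^{th}$ and $k+2^{th}$ steps as in EQ(22), would be:

$$\begin{aligned} V\Big|_{i,j}^{k+1.5} \div r_{i,j} + ec\Big|_{i,j}^{k+1.5} &= 0.5g_{i,j}V\Big|_{i,j}^{k+2} + 0.5\left(g_{i,j}V\Big|_{i,j}^{k+1} + ec\Big|_{i,j}^{k+1} + ec\Big|_{i,j}^{k+2}\right) \\ \delta I_{V}\Big|_{i,j}^{k+2} &= I_{V}\Big|_{i,j}^{k+2} - I_{V}\Big|_{i,j-1}^{k+2} \\ &= G_{i,j,V}^{*}V\Big|_{i,j+1}^{k+2} - \left(G_{i,j,V}^{*} + G_{i,j-1,V}^{*}\right)V\Big|_{i,j}^{k+2} + G_{i,j-1,V}^{*}V\Big|_{i,j-1}^{k+2} + \left(el\Big|_{i,j,V}^{k+2} - el\Big|_{i,j-1,V}^{k+2}\right) \end{aligned}$$

Substituting both expressions into EQ(26) and collecting the unknown $k+2^{th}$ voltages on the left gives the tri-diagonal form of EQ(27), in which all three coefficients are built from the vertical conductances.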

With EQ(25) and EQ(27), the improved ADI method simulates, quasi-statically and with linear complexity, the remaining circuit of pure mesh topology shown in Fig.5. The improved ADI method is also unconditionally stable: the resistor-only network matrix is symmetric positive definite, for which the ADI algorithm is always convergent<sup>[23]</sup>.
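The two alternating sweeps can be sketched on a small mesh as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names `delta_I` and `sweep`, the nested-list layout `V[i][j]`, and the convention that branches beyond the mesh boundary are absent are all hypothetical. The sketch only forms and solves EQ(25) and EQ(27) line by line; it makes no claim about accuracy or stability.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm, O(n); lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = d[:]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def delta_I(V, G, el, i, j, di, dj):
    """Current difference of EQ(23) at node (i, j) along direction (di, dj).

    G[i][j] and el[i][j] belong to the branch from (i, j) to (i+di, j+dj).
    Assumption: branches that would leave the mesh do not exist.
    """
    nx, ny = len(V), len(V[0])
    total = 0.0
    if i + di < nx and j + dj < ny:    # branch towards the next node
        total += (V[i + di][j + dj] - V[i][j]) * G[i][j] + el[i][j]
    if i - di >= 0 and j - dj >= 0:    # branch from the previous node
        total -= (V[i][j] - V[i - di][j - dj]) * G[i - di][j - dj] + el[i - di][j - dj]
    return total


def sweep(V, g, ec_old, ec_new, G_imp, el_imp, G_exp, el_exp, horizontal):
    """One implicit sweep: EQ(25) if horizontal is True, EQ(27) otherwise.

    G_imp/el_imp describe the implicit direction (new time level) and
    G_exp/el_exp the explicit direction (known time level).
    """
    nx, ny = len(V), len(V[0])
    di, dj = (1, 0) if horizontal else (0, 1)
    n_lines, n = (ny, nx) if horizontal else (nx, ny)
    V_new = [[0.0] * ny for _ in range(nx)]
    for line in range(n_lines):
        # Nodes of one row (horizontal sweep) or one column (vertical sweep).
        nodes = [(s, line) if horizontal else (line, s) for s in range(n)]
        lower, diag, upper, rhs = [], [], [], []
        for s, (i, j) in enumerate(nodes):
            G_prev = G_imp[i - di][j - dj] if s > 0 else 0.0
            G_next = G_imp[i][j] if s < n - 1 else 0.0
            el_prev = el_imp[i - di][j - dj] if s > 0 else 0.0
            el_next = el_imp[i][j] if s < n - 1 else 0.0
            lower.append(-G_prev)
            upper.append(-G_next)
            diag.append(G_prev + G_next + 0.5 * g[i][j])
            rhs.append(delta_I(V, G_exp, el_exp, i, j, dj, di)
                       + (el_next - el_prev)
                       - 0.5 * (g[i][j] * V[i][j] + ec_old[i][j] + ec_new[i][j]))
        for (i, j), v in zip(nodes, solve_tridiagonal(lower, diag, upper, rhs)):
            V_new[i][j] = v
    return V_new
```

A horizontal step followed by a vertical step would then be `V1 = sweep(V0, g, ec0, ec1, GH, elH, GV, elV, True)` and `V2 = sweep(V1, g, ec1, ec2, GV, elV, GH, elH, False)`, where `GH` has one fewer entry along `i` and `GV` one fewer entry along `j` than the node arrays. Each sweep visits every node once and solves one tri-diagonal system per row or column, so its cost is proportional to the number of mesh nodes.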

### **3.3 Experimental Results**

The EQU-ADI algorithm has been implemented in C++. All experimental results were collected on a *SUN V880* workstation with a 750MHz *Ultra Sparc* CPU and 2GB of memory. One clock cycle is divided into 500 time steps, and VDD is 2.5V. Although EQU-ADI can solve P/G networks of mesh-tree, mesh-chain, and mesh-(tree+chain) hybrid topologies with linear complexity, owing to space limitations we take only P/G networks of mesh-tree topology as test cases to show the linear complexity of the EQU-ADI algorithm. For the same reason, we report only the experimental results on complexity and omit the results on stability and accuracy, which are similar to their counterparts in [13].





In Fig.6(a), the horizontal unit is 200K P/G circuit nodes and the vertical unit is 500 seconds of running time. Although HSPICE is mainstream, general-purpose commercial software that is constantly optimized by Synopsys, it takes an intolerably long running time (nearly 5000 s) to simulate a P/G circuit of fewer than 100K nodes, which means that HSPICE cannot handle large-scale practical circuits of more than 1M nodes. Thus, in studies on P/G circuit simulation, HSPICE is used as the reference for testing the accuracy and efficiency of new algorithms. An algorithm that is two orders of magnitude faster than HSPICE can be regarded as successful. As shown in Fig.6(a), the EQU-ADI algorithm speeds up P/G circuit simulation by two orders of magnitude. Meanwhile, the strictly linear time complexity of the EQU-ADI algorithm demonstrates the theoretical correctness of our methodology.

In Fig.6(b), the vertical unit is 50 MB of memory. As shown in Fig.6(b), HSPICE has a high memory complexity and consumes an intolerable amount of memory (more than 200 MB) to simulate a P/G circuit of fewer than 100K nodes. In contrast, our EQU-ADI algorithm is of strict linear memory complexity and consumes far less memory, which again demonstrates the correctness of our methodology.

Summing up the results of the two figures above, the EQU-ADI algorithm is indeed of linear complexity with a low coefficient, which shows that our methodology can compress P/G circuits for efficient simulation. Because the EQU-ADI algorithm simulates P/G circuits of mesh-tree, mesh-chain, and mesh-(tree+chain) topologies with linear complexity, it can efficiently solve huge practical P/G networks for very large-scale ASIC chips.

## 4. Conclusion

In this paper, we have proposed a novel time-domain analysis methodology for general large RLC circuits and have applied it to the transient analysis of large-scale VLSI on-chip power delivery networks. We described a trapezoidal approximation technique for RLC circuit discretization that is highly accurate and unconditionally stable. The large number of current sources in the discretized circuits makes these quasi-static circuits difficult to reduce further. We proposed a novel methodology that reduces such quasi-static circuits with linear complexity and without error, which is an important complement to present circuit analysis theory. As its application to P/G network simulation shows, our methodology is important in both theory and practice for simulating large-scale RLC networks. The EQU-ADI algorithm, the only application described in detail, also shows the superiority of our methodology.

In the future, we will extend the methodology to other research fields such as clock network simulation and static timing analysis. In P/G network design and verification, we want to combine the methodology with the hierarchical method to further speed up the simulation and optimization of very large-scale P/G networks with complex topologies.

#### References

- 1. Yang X D, Cheng C K, Ku W H, et al. Hurwitz stable reduced order modeling for RLC interconnect trees. IEEE Journal of Analog Integrated Circuits and Signal Processing, 2002, 31(3): 222-228
- 2. Qin Z, Cheng C K. RCLK-VJ network reduction with Hurwitz polynomial approximation. In: Proc IEEE Asia and South Pacific Design Automation Conf (ASPDAC), 03EX627. Piscataway: IEEE Press, 2003. 283-291
- 3. Tan X D. A general s-domain hierarchical network reduction algorithm. In: Proc IEEE/ACM Int Conf Computer Aided Design, 477033. New York: ACM Press, 2003. 650-657
- 4. Li X, Zeng X, Zhou D, et al. Behavioral modeling of analog circuits by wavelet collocation method. In: Proc IEEE/ACM Int Conf Computer Aided Design, 478010. New York: ACM Press, 2001. 65-69
- 5. Dharchoudhury A, Panda R, Blaauw D, et al. Design and analysis of power distribution networks in power PC microprocessors. In: Proc IEEE/ACM Design Automation Conf, 477980. New York: ACM Press, 1998. 738-743
- 6. Bai G, Bobba S, Hajj I N. Simulation and optimization of the power distribution network in VLSI circuits. In: Proc IEEE/ACM Int Conf Computer Aided Design, 477006. New York: ACM Press, 2000.12: 481-486
- 7. Su H H, Gala K H, Sapatnekar S S. Fast analysis and optimization of power/ground networks. In: Proc IEEE/ACM Int Conf Computer Aided Design, 477006. New York: ACM Press, 2000. 477-482
- 8. Wang J M, Nguyen T V. Extended Krylov subspace method for reduced order analysis of linear circuits with multiple sources. In: Proc IEEE/ACM Design Automation Conf, 477000. New York: ACM Press, 2000. 247-252
- 9. Odabasioglu A, Celik M, Pileggi L T. PRIMA: passive reduced-order interconnect macromodeling algorithm. IEEE Trans Computer Aided Design, 1998, 17(8): 645-654
- 10. Cao Y, Lee Y, Chen T, et al. HiPRIME: hierarchical and passivity reserved interconnect macromodeling engine for RLKC power delivery. In: Proc IEEE/ACM Design Automation Conf, 477021. New York: ACM Press, 2002. 379-384
- 11. Zhao M, Panda R V, Sapatnekar S S, et al. Hierarchical analysis of power distribution networks. In: Proc IEEE/ACM Design Automation Conf, 477000. New York: ACM Press, 2000. 150-155
- 12. Chen T, Chen C C. Efficient large-scale power grid analysis based on preconditioned Krylov-subspace iterative method. In: Proc IEEE/ACM Design Automation Conf, 477011. New York: ACM Press, 2001. 559-562
- 13. Lee Y M, Chen C P. Power grid transient simulation in linear time based on transmission-line-modeling alternating direction implicit method. In: Proc IEEE/ACM Int Conf Computer Aided Design, 478010. New York: ACM Press, 2001. 75-80
- 14. Nassif S R, Kozhaya J N. Fast power grid simulation. In: Proc IEEE/ACM Design Automation Conf, 477000. New York: ACM Press, 2000. 156-161
- 15. Zhu Z, Yao B, Cheng C K. Power network analysis using an adaptive algebraic multigrid approach. In: Proc IEEE/ACM Int Conf Computer Aided Design, 477033. New York: ACM Press, 2003. 105-108
- 16. Su H H, Sani E A, Nassif R. Power grid reduction based on algebraic multigrid principles. In: Proc IEEE/ACM Design Automation Conf, 477031. New York: ACM Press, 2003. 109-112
- 17. Tan X D, Shi C J. Fast power-ground network optimization using equivalent circuit modeling. In: Proc IEEE/ACM Design Automation Conf, 477011. New York: ACM Press, 2001. 550-554
- 18. Qian H F, Nassif R, Sapatnekar S S. Random walks in a supply network. In: Proc IEEE/ACM Design Automation Conf, 477031. New York: ACM Press, 2003. 93-98
- 19. Cai Y C, Pan Z, Luo Z Y, et al. Fast reduction and reconstruction strategy in analyzing power/ground network with mesh and tree structure. Journal of Computer Science and Technology, 2005, 20(2): 224-230
- 20. Pan Z, Cai Y C, Luo Z Y, et al. Transient analysis of on-chip power distribution networks using equivalent circuit modeling. In: Proc ACM Int Symposium On Quality Electronic Design, PR01881. Piscataway: IEEE Press, 2004. 63-68
- 21. Cai Y C, Pan Z, Luo Z Y, et al. Geometric multigrid based algorithm for transient RLC power/ground (P/G) grids analysis. Journal of Computer-Aided Design & Computer Graphics (In Chinese), 2005, 17(4): 33-38
- 22. Wang X Y, Luo Z Y, Tan X D, et al. EQUADI: A linear complexity algorithm on transient power/ground (P/G) network analysis for ASICs. In: Proc *IEEE* Int Conf on Solid-state Integrated Circuit Technology, 04EX863. Piscataway: IEEE Press, 2004. 1952-1955
- 23. Ye Guan, Jinliang Chen. Digital Computing Methodology. 1st Ed, 1990, Beijing: Qinghua University Press, 245-246