

## Optimization Technique for Performance driven Cell Generator

Nadine Azemard, Vincent Bonzom, Sylvie Amat, Daniel Auvergne

### ▶ To cite this version:

Nadine Azemard, Vincent Bonzom, Sylvie Amat, Daniel Auvergne. Optimization Technique for Performance driven Cell Generator. EuroASIC 1992 - European Conference on Application Specific Integrated Circuits, Jun 1992, Paris, France. pp.152-155, 10.1109/EUASIC.1992.228032 . limm-00239390

### HAL Id: lirmm-00239390 https://hal-lirmm.ccsd.cnrs.fr/lirmm-00239390

Submitted on 19 Mar 2022

**HAL** is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L'archive ouverte pluridisciplinaire **HAL**, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d'enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

#### Optimization technique for performance driven cell generator

N.AZEMARD, V.BONZOM, S.AMAT, D.AUVERGNE

LIRMM : LABORATOIRE D'INFORMATIQUE DE ROBOTIQUE ET DE MICROÉLECTRONIQUE DE MONTPELLIER URA CNRS DO 1480 UNIVERSITÉ MONTPELLIER II, SCIENCES ET TECHNIQUES DU LANGUEDOC Place Eugène Bataillon, 34095 Montpellier cedex 5 FRANCE Telex: 490944F, Fax: (+33) 67 52 14 96, Phone: (+33) 67 14 32 08

#### Abstract

We present in this paper the definition of explicit sizing laws allowing fast optimization of data path. Validation of these laws is obtained with respect to mathematical global optimization method (Rosenbrock's method). Example of application to real data path is given by integration of these sizing facilities in flexible cell generator.

#### 1. Summary

Although semicustom approaches such as standard cell and gate array allow ASIC design with a fast turn around time, they are not suitable for advanced performance design techniques which require customized transistor sizes with respect to physical design issues such as layout, interconnection routing and timing. As an attempt to overcome these limitations, we have proposed an adaptative cell generator approach. This generator [1], developed as part of the design automation project of the Microelectronics Department (LIRMM) of Montpellier, generates cells for custom design with post layout evaluation of the performances. The transistor netlist is defined from a post layout extraction or stick input prediction. We present in this paper the integration of a local heuristic based sizing aid, allowing automatic sizing of the transistors from the extracted netlist, taking into account the individual transistor's capacitive loads and the global logic structure.

#### 2. Prior work

Many researchers have addressed the transistor sizing problem. The primary concern is to optimize the delays of a circuit according to given constraints imposed on area or power. Sometimes, the constraints imposed on delays result in area constraints; the final area gives the power performances and yield. There are generally three techniques to determine global solutions to a constrained optimization problem. One is the mathematical approach [2,3,4,5,6], applying the classical augmented Lagrangian multiplier, a feasible direction method. This method requires the derivatives with respect to the design parameters. If the objective and constraints are convex, this algorithm converges to the global minimum. However it is quickly limited to low complexity (divergence branches) structures and needs good initial solutions. The second approach is an heuristic or incremental method [7,8,9] in which sensitive transistors on critical paths are sized until the specifications are met. The third approach [10,11] is a combination of the two preceding ones in which an initial sizing solution is defined from an heuristic algorithm and a global solution is obtained from a mathematical optimization technique. These methods are usually time consuming, limited to identified data path and have the accuracy of the delay model used to elaborate the cost function. In fact, they are " ad posteriori " methods which never give information for mapping selection or physical layout alternatives.

#### 3. Optimal transistor sizing

In order to circumvent the limitations due to the number of parameters to be optimized, the convergence difficulties induced by the divergence nodes and prohibitive CPU time, we present here a new attempt in defining explicit sizing law of complex gates. We showed in preceding paper that critical delay evaluation of complex gate [12] can be obtained from the study of the charging or discharging action of the last N or P transistor connected to the supply rail. As a result delay evaluation on complex gate can be obtained from an equivalent macrocell constituted of a N or a P transistor, loaded by the serial transistor array which connects this transistor to the output load, and controlled by a reference inverter. These results in an expression of delays (for rising Tlh, and falling Thl edges) which is always convex in the size of the transistors [13].

Taking the size of the N transistor of the controlling device as a reference, the solutions of the sizing problem involve the determination of three parameters: the size of the P transistor of the controlling devices and the sizes of the N and P transistors of the complex gate under consideration.

Usual cancellation of the derivatives of the delays with respect to the size of the transistors does not permit to fix all the parameters. Explicit solution to this sizing problem has been obtained by reducing the solution space to be explored to the following set :

- P transistor sizes are imposed proportional to the size of the N transistor, resulting in regular implementation which gives always layout solution with minimum parasitic capacitances [14],

- then cancellation of the partial derivatives of the rise and fall times with respect to the reduced size  $X_N$  of the N transistor of the gate under consideration, results into two solutions that we imposed to be identical,

- at last, for this sizing solution, we impose identical delay value (Thl = Tlh) resulting in symmetrical edges and safe propagation of delays on data path.

We have to note here that this constitutes a reasonable assumption in the way that there is no interest to get a very fast falling edge for example, if the rising one is very slow, except if at any time we can control what happens on the data path. This would impose to master the complexity of the system which becomes analog in time space. Without practical use this solution is never explored.

Solving for the derivatives of the delay with the preceding conditions, we obtain the generalized sizing equations for complex gates, as follows :

$$X_{N} = \frac{\sqrt{Y'} \cdot (1 + 6K'(n_{2} - 1))}{\left(1 + \frac{\mu_{N}}{\mu_{P}} + 6K'[(n_{2} - 1) + \frac{\mu_{N}}{\mu_{P}}(n_{1} - 1)]\right)^{\frac{1}{2}}}$$

$$X_{P} = \frac{\frac{\mu_{N}}{\mu_{P}}\sqrt{Y'} \cdot (1 + 6K'(n_{1} - 1))}{\left(1 + \frac{\mu_{N}}{\mu_{P}} + 6K'[(n_{2} - 1) + \frac{\mu_{N}}{\mu_{P}}(n_{1} - 1)]\right)^{\frac{1}{2}}}$$

where:  $X_N$ ,  $X_P$  are the respective widths of the N, P transistors normalized with respect to the width of the N transistor of the controlling device,

Y<sup>\*</sup>, is the load normalized with respect to the input active capacitance of the N reference transistor,

K', is the propagation constant, as given in [13],

 $n_1, n_2$  respectively label the number of serial transistors in a P and N array.

 $\mu_N/\mu_P$  and K' represent the technological parameters,  $n_1$  and  $n_2$  specify the structure, note that  $n_1 = n_2 = 1$ , gives the general sizing law for inverters.

# 4. Validation and application to post layout optimization

Validation of these expressions has been done on different gate configurations using ROSENBROCK's rotating coordinating scheme [15] as a robust algorithm to set up a global mathematical optimization method as an element of reference.

We compare in Table I the optimized sizing solutions from eq. 1 to the ROSENBROCK's one, for different gates controlled by an identical inverter (this guarantees real control situations and convexity of the sizing problem [12]). As shown, the good level of agreement obtained (less than 10% discrepancy), confirms the validity of our approach in defining sizing laws and mostly in reducing the solution space to get an explicit expression.

| Y*  | Structure | n1 | n <sub>2</sub> | Mather<br>Optim<br>(Roser | matical<br>ization<br>ibrock) | Explicit<br>Optimization<br>(eq. 1) |                |  |
|-----|-----------|----|----------------|---------------------------|-------------------------------|-------------------------------------|----------------|--|
|     |           |    |                | X <sub>N</sub>            | X <sub>P</sub>                | X <sub>N</sub>                      | X <sub>P</sub> |  |
|     | Inverter  | 1  | 1              | 1.44                      | 3.58                          | 1.31                                | 3.3            |  |
|     | Nand 3e   | 1  | 3              | 2.4                       | 3.1                           | 2.19                                | 2.95           |  |
|     | Nand 6e   | 1  | 6              | 3.7                       | 3.02                          | 3.27                                | 2.58           |  |
| 10  | Nor 3e    | 3  | 1              | 1.14                      | 5.3                           | 1.13                                | 4.82           |  |
|     | Nor 6e    | 6  | 1              | 1.1                       | 7.75                          | 0.81                                | 6.52           |  |
|     | Andori    | 6  | 3              | 1.4                       | 7.6                           | 1.48                                | 6.2            |  |
|     | Andori    | 3  | 6              | 3.1                       | 4.6                           | 2.78                                | 4.1            |  |
|     | Inverter  | 1  | 1              | 3.2                       | 8.01                          | 2.9                                 | 7.3            |  |
|     | Nand 3e   | 1  | 3              | 5.4                       | 7                             | 4.9                                 | 6.6            |  |
|     | Nand 6e   | 1  | 6              | 8.1                       | 6.3                           | 7.3                                 | 5.8            |  |
| 50  | Nor 3e    | 3  | 1              | 2.5                       | 12                            | 2.5                                 | 10.8           |  |
|     | Nor 6e    | 6  | 1              | 2.2                       | 17                            | 1.8                                 | 14.6           |  |
|     | Andori    | 6  | 3              | 3                         | 16                            | 3.3                                 | 13.9           |  |
|     | Andori    | 3  | 6              | 6.9                       | 9.8                           | 6.2                                 | 9.2            |  |
|     | inverter  | 1  | 1              | 4.5                       | 11.3                          | 4.14                                | 10.4           |  |
|     | Nand 3e   | 1  | 3              | 7.6                       | 10                            | 6.9                                 | 9.3            |  |
|     | Nand 6e   | 1  | 6              | 11.4                      | 8.8                           | 10.3                                | 8.2            |  |
| 100 | Nor 3e    | 3  | 1              | 3.5                       | 16.6                          | 3.6                                 | 15.2           |  |
|     | Nor 6e    | 6  | 1              | 2.96                      | 23.2                          | 2.6                                 | 20.6           |  |
|     | Andori    | 6  | 3              | 4.3                       | 2.3                           | 4.7                                 | 19.6           |  |
|     | Andori    | 3  | 6              | 9.7                       | 14                            | 8.8                                 | 13             |  |

<u>Table 1</u>: Comparison of the sizing solutions obtained from eq.1 to the value calculated with the global optimization method used as a reference :  $X_{N,P}$  and Y are, respectively, the reduced values of transistor widths and load, defined with respect to 3  $\mu$ m and 5.16 fF, corresponding to the minimum values of the transistor width and the transistor capacitance used in the target technology.

We applied these sizing laws to optimize real data paths, using the industrial HSPICE's optimizer as a reference for multipath optimization. Layouts of circuits, indicated in Table II, have been generated through the flexible cell generator (PRINT) [1], in linear matrix style (figure 1) and extracted under a CADENCE environment.

The size of the extracted transistors was optimized with

the three following methods :

- ROSENBROCK's procedure (global mathematical optimization) for simple data paths,

- HSPICE optimizer (incremental procedure),

- with eq.1 implemented as an automatic sizing module (P.Size) in the PRINT generator.

Note that the HSPICE optimizer always needs a good initial solution which was fed from the first iteration defined through this equation.

The delay time and power values shown in Table II were obtained from electrical simulations with HSPICE from the different optimized netlists.

For each configuration we can observe that the values obtained for delay and power solutions, with the different methods, are very similar. However the areas obtained with the calculated solutions are smaller, this is particularly true for the CPU time values which are very much lower (5 ms for P.Size compared to 60 h with HSPICE on a 4 MIPS VAX station). This result is promising regarding the possibility offered by eq.1 in treating complex circuits in a reasonable time. Note that eq.1 can always be used to obtained realistic initial sizing solutions for any global optimization method.

| INITIAL SIZING – BEFORE LAYOUT EXTRACTION |     |                                       |      |      |                       |      |      |                          |      |      |                          |      |      |
|-------------------------------------------|-----|---------------------------------------|------|------|-----------------------|------|------|--------------------------|------|------|--------------------------|------|------|
|                                           |     | Minimal Sizing                        |      |      | Local Sizing (P.Size) |      |      | Rosenbrock Sizing        |      |      | HSPICE Sizing            |      |      |
| Circuit                                   | Ntr | Tđ                                    | Pd   | Tcpu | Td                    | Pd   | Tcpu | Td                       | Pd   | Тсри | Td                       | Pd   | Тсри |
| INV4                                      | 8   | 13.6                                  | 0.66 | 0    | 2.0                   | 0.85 | 2ms  | 1.75                     | 1.1  | 0.4s | 1.75                     | 1.03 | 41mn |
| CarryGen                                  | 52  | 4.1                                   | 0.26 | 0    | 3.5                   | 0.54 | 5ms  | Out of application range |      |      | 3.0                      | 0.51 | 60h  |
| ALU4                                      | 418 | 8.1                                   | 0.42 | 0    | 4.7                   | 0.51 | 0.3s | Out of application range |      |      | 4.4                      | 0.53 | 127h |
|                                           | :   | FINAL SIZING - POST LAYOUT EXTRACTION |      |      |                       |      |      |                          |      |      |                          |      |      |
| INV4                                      | 8   | 15.3                                  | 0.75 | 0    | 2.95                  | 1.07 | 2ms  | 2.9                      | 1.64 | 0.5s | NO CONVERGENCE           |      |      |
| CarryGen                                  | 52  | 17.2                                  | 0.98 | 0    | 7.2                   | 2.14 | 5ms  | Out of application range |      | 7.0  | 2.23                     | 60h  |      |
| ALU4                                      | 418 | 75                                    | 1.02 | 0    | 19                    | 1.43 | 0.3  | Out of application range |      |      | Out of application range |      |      |

<u>Table.II</u>: Comparison of the different sizing solutions (initial, local, global solutions) for delay (Td in ns), power dissipated (Pd in mW for a 10 MHertz clock frequency), silicium area (Sc in  $\mu m^2$ ) and CPU time (Tcpu required to size different circuits). For complex circuits (ALU4, Carry Gen.) global sizing didn't allow convergence solution.

#### 5. Conclusion

A big effort, in the design of ICs, is made to evaluate circuit performances and to look for ways to satisfy the constraints.

We have presented in this paper an optimization technique, embedded (P.Size) in a flexible cell generator, to optimize real data paths in their full environment. The optimization procedure has been proved to be as efficient as global optimization procedures implemented with mathematical (ROSENBROCK) or incremental (HSPICE) methods. The sizing equations we present allow to optimize data paths from a layout representation with very small computational requirements, opening the way to automatic sizing of very large circuits.

The integration of these equations in a flexible cell generator gives, to the complete system, the ability to speed circuits up by optimal sizing through automatic layout–evaluation–sizing cycles.

The availability of explicit sizing laws opens the way to predict the performances of complex structures at the layout level, allowing bottom up design information to be used at higher levels of the design process.

#### 6. References

- [1] M.ROBERT, D.DESCHACHT, G.CATHEBRAS, S.PRAVOSSOUDOVITCH, D.AUVERGNE : "PRINT methodology: a compilation approach for cell library generation", 'ISCAS'88, Vol.2, p965–968, FINLAND, 1988.
- [2] C.LEE, M.SOUKUP: "An Algorithm for CMOS Timing and Area Optimization", IEEE Journal of Solid States Circuits, vol. SC-19, nº5, p781-787, October 1984.
- [3] K.S.HEDLUND: "AESOP : A tool for Automated Transistor Sizing", 24th ACM/IEEE Design Automation Conference, p43–48, 1987.
- [4] D.MARPLE : "Transistor Size Optimization in the Tailor Layout System" Proceedings of the IEEE Design Automation Conference, p43–48, 1989.
- [5] M.R.C.M.BERKELAAR, J.A.G.JESS: "Gate sizing in MOS digital circuits with linear programming", Proceedings of the IEEE Design Automation Conference, p217-221, 1990.

- [6] H.Y.CHEN, S.M.KANG : "ICOACH : A circuit optimization aid for CMOS high-performance circuits", INTEGRATION, the VLSI journal 10 (1991), p185-212.
- [7] J.P.FISHBURN, A.E.DUNLOP : "TILOS : a Posynomial Programming Approach to Transistor Sizing", Proceeding IEEE International Conference on CAD, p326–328, nov 1985.
- [8] M.HOFMANN, J.K.KIM: "Delay Optimization of Combinational Static CMOS Logic"24<sup>th</sup> ACM/IEEE Design Automation Conference, p125–132, 1987.
- [9] J. YUAN, C.H. SVENSSOM : "CMOS Circuits Speed Optimization based on Switch level simulation", IEEE Proceedings International Symposium on Circuits and Systems (ISCAS'88), p2109–2112, Helsinki, Finland, June 1988.
- [10] J.M.SHYU, A.SANGIOVANNI-VINCENTELLI, J.P.FISHBURN and A.E. DUNLOP: "Optimization based transistor sizing", IEEE J. of Solid State Circuits, vol.23, n°2, p400-409, april 1988.

- [11] F.W.OBERMEIER, R.KATZ : "An Electrical Optimizer that consider Physical Layout", 25<sup>th</sup> ACM/IEEE DAC, p 453–459, 1988.
- [12] D.AUVERGNE, N.AZEMARD, D.DESCHACHT, M. ROBERT: "Evaluation dynamique et optimisation des structures CMOS et VLSI", TSI, vol.8, nº6, p593-607, Decembre 1989.
- [13] D.DESCHACHT, M.ROBERT, D.AUVERGNE: "Explicit Formulation of delays in CMOS data paths", IEEE Journal of Solid States Circuits, vol. 23, nº5, p1257-1264, Oct 1988.
- [14] M.ROBERT, G.CATHEBRAS, D.DESCHACHT, J.TRAUCHESSEC, D.AUVERGNE: "Linear matrix oriented optimal automatic layout of digital cells", The European Design Automatic Conference, Amsterdam, The Netherlands, 25–28 February 1991.
- [15] D.M.HIMMELBLAU, Applied non linear programming : "Rosenbrock's method", chp.4, Mc Graw Hill ed. 1972.



Figure 1: Example of layout, for the Carry Gen. cell of Table 2, which has been automatically sized and generated with the P.SIZE and the PRINT tools respectively.