# Hybrid Logical Effort for Hybrid Logic Style Full Adders in Multistage Structures

Hareesh-Reddy Basireddy, Karthikeya Challa, and Tooraj Nikoubin, Senior Member, IEEE

*Abstract*: One of the critical issues in the advancement of very-large-scale integration (VLSI) circuit design is estimating the timing behavior of arithmetic circuits. The concept of logical effort provides an efficient way to understand and assess the timing behavior of circuits with a conventional CMOS (C-CMOS) structure. However, this technique does not work for circuits with a hybrid structure. At the same time, numerous hybrid-structure circuits that are faster and consume less power than their C-CMOS counterparts have been proposed for applications such as portable and IoT devices. A simple and efficient timing analysis method, comparable to conventional logical effort, is therefore needed for hybrid adder circuits. This paper proposes an efficient analysis and modeling technique that enables designers to assess the timing behavior of hybrid full adder circuits at the block level and anticipate their performance in multistage circuits. The gain and the selection factor, both measurable on a single test bench, are introduced as criteria for accurately selecting and optimizing hybrid adder cells and for managing the tradeoff between energy efficiency and performance. The proposed method is investigated using 32-nm CMOS and FinFET technologies.

*Index Terms*: Drivability, hybrid CMOS, hybrid logical effort, input capacitance, logical effort, timing behavior.

## I. INTRODUCTION

Full adders, as the core of arithmetic building blocks, have a fundamental role in the operation of any computer system, from the simplest controller to the most complex processors [1]. For several years, the full adder has been a focal point of work on arithmetic blocks, especially for applications such as portable devices and IoT [2], [3].
A full adder can be designed using several logic styles, each with its own advantages and disadvantages. The most popular logic style in very-large-scale integration circuit design is conventional CMOS (C-CMOS), which has pull-up and pull-down transistor networks. The pull-up network is built from series and parallel pMOS transistors, and the pull-down network from series and parallel nMOS transistors. The C-CMOS full adder is the general CMOS structure with conventional pull-up and pull-down networks, providing good drive capability and full output swing. On the other hand, C-CMOS circuits draw more short-circuit current and more dynamic current at switching time, and therefore consume more power than hybrid logic circuits. In general, hybrid-structure circuits have fewer connections to the power supply and ground than C-CMOS circuits. Also, the C-CMOS full adder has a large input capacitance because of the number of pMOS and nMOS transistors connected to each input, which decreases its speed. The hybrid logic style combines the characteristics of distinct logic styles to enhance the performance of the design [3]-[6]. Notably, the most popular circuits reported with a hybrid structure are full adders, because circuit complexity grows dramatically as the block size and the number of inputs and outputs increase, as in compressors, carry-save adders, and so on.

Manuscript received May 31, 2018; revised November 2, 2018; accepted December 18, 2018. (Corresponding author: Tooraj Nikoubin.)

The authors are with the Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX 79409 USA (e-mail: hareesh-reddy.basireddy@ttu.edu; karthikeya.challa@ttu.edu; t.nikoubin@gmail.com).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TVLSI.2018.2889833
At the gate level, the most popular hybrid-structure circuits are two- and three-input XOR and XNOR circuits, which are also the core of full adders. Therefore, full adder blocks with different input-output drive conditions are considered in this investigation. Despite the many advantages reported for hybrid full adders, such as area, power, and energy efficiency, noise tolerance, and high speed, the main obstacle to their use is the irregularity and complexity of their structure. To address this, the cell design methodology (CDM) and systematic CDM [6]-[9] have been proposed as systematic approaches to designing hybrid-structure circuits. These methods preserve the advantages of hybrid circuits by focusing on key points of the circuit design: a minimum number of transistors on the critical path, splitting the circuit into a logic part and a drive part, a powerless and groundless design of the logic part, the use of different amend mechanisms for various characteristics, and so on.

For the hybrid logic style to be as reliable as C-CMOS, an efficient transistor sizing algorithm is also needed. The sizing method must be flexible enough to cover the complexity of hybrid-structure circuits, because conventional logical effort does not work for them. For example, in conventional logical effort, parallel transistors get the same size based on the transistor unit size, and for series transistors, the size of each transistor depends on the number of transistors on the drive path and the transistor unit size. For hybrid-structure circuits this technique does not work, because the circuit structure is more complicated. The simple exact algorithm (SEA) [10] has been proposed as a heuristic method for optimizing and sizing hybrid-structure circuits. It is an efficient sizing algorithm that is flexible enough to cover both structural complexity and the desired circuit characteristics in single-objective and multiobjective optimization, but it does not support delay measurement or timing behavior analysis.

Fig. 1. (a) TGA with input drive paths. (b) TFA. (c) New-14T. (d) C-CMOS. (e) New-HPSC. (f) $C_{in}$ measurement test bench. (g) Delay versus $C_{L}$ for $C_{in}$ calculation.

The concept of logical effort was first presented by Sutherland *et al.* [11]. Their report clearly explains how to model C-CMOS circuits for both circuit optimization through sizing and delay modeling of single-stage and multistage structures. More recently, Lin *et al.* [12] proposed an improved logical effort model for circuits operating in multiple supply voltage regimes. All of these works focus on the logical effort of C-CMOS circuits because of their fixed and regular structure. For hybrid logic circuits, however, a new method is required to analyze circuit behavior because of their complicated structure [13].

Another contribution of this paper is to clarify how to select and use adder blocks with a hybrid structure. For example, some full adders are low power or energy efficient when they operate on a single test bench with specific input-output drive conditions. However, they do not work properly in multistage structures because of their lack of drivability and their large input capacitance. Adding a buffer to compensate for the lack of drive makes them function properly, but the result is no longer an efficient cell, because the extra buffers add area, power, and energy overhead to the whole circuit. This paper helps build a deeper understanding of hybrid-structure adder circuits, so that they can be reliable blocks in multistage structures while remaining the best choice for the target parameters such as power, energy, and so on.
This paper also has the potential to improve industrial timing analysis tools. The central question is: which critical parameters must the designer consider on a single test bench to extract the complete timing behavior of hybrid-structure adder cells? To answer it, a new method is proposed to dissect the timing behavior of hybrid full adders with irregular structure, with simplicity as a guiding principle. The transistor function full adder (TFA), transmission gate full adder (TGA), New-HPSC, and New-14T (Fig. 1), the most popular hybrid full adders [10] with different input-output drive conditions, are compared with the C-CMOS full adder using this method. Because propagation delay is test bench dependent and varies with the input and output drive conditions, it is not by itself sufficient to represent the timing behavior of a cell. Therefore, the proposed method uses three parameters to estimate the timing behavior of the full adders: switching speed, drive capability, and input capacitance. Together they help the designer predict the performance of the full adder cells in multistage structures from a few measurements on a single test bench [Fig. 2(c)]. The ripple carry adder (RCA) and the 6:2 compressor are used as test benches for the multistage analysis of the carry and sum output signals, respectively. The RCA was selected because it has a multistage drive path suitable for extracting the timing behavior of the carry signal [Fig. 3(b)]. For the sum signal, 6:2 compressors cascaded in a five-stage structure were selected [Fig. 3(a)]. This analysis is done for all selected full adders.
To study the timing behavior, the full adders are first sized to optimize the power delay product (PDP) using the SEA [10]. When the conventional MOSFET is scaled below 20 nm, its electrical parameters start to degrade. FinFET technology is selected as a replacement for Bulk CMOS because it allows transistors to be scaled down further, with promising advantages over Bulk CMOS such as high drain current, lower switching voltage, and significantly less static leakage current [14]. FinFETs have proved to perform better in terms of speed and power because of their ability to produce more current at smaller dimensions and lower input voltages, which leads to higher drive capability than Bulk CMOS technology. Another question this paper answers is how the timing analysis method, built on the three parameters above, behaves across technologies. Therefore, full adders in FinFET technology are also analyzed, and their timing behavior is compared with that of the Bulk CMOS full adders. All simulations in this paper are performed in HSPICE using 32-nm Bulk CMOS and 32-nm shorted-gate FinFET models.

## II. PROPOSED TIMING BEHAVIOR ANALYSIS

This section details the proposed logical effort analysis of hybrid-structure circuits, which comprises the following items.

### A. Switching Time

The switching time of the circuits is measured on a single test bench [Fig. 2(c), with the output connected to the buffer] under two specific input-output conditions: unit drivability of the previous stage and a unit output capacitor. The switching time is the propagation delay under this standard input-output drive condition. It is a necessary but not sufficient parameter for understanding the timing behavior of the full adder in multistage structures.
Hence, the drive capability and input capacitance are considered essential parameters alongside switching time, and together they form an integral part of our timing behavior analysis. The minimum-size inverter (W/L = 2 for pMOS and W/L = 1 for nMOS), subject to technology limits, is taken as both the unit drivability of the previous stage and the unit output load on the single test bench. All drivability and input capacitance values are expressed relative to this standard. These unit values are technology dependent, but the designer is free to choose any specific W/L aspect ratio as the unit. The minimum size of the technology is one of the best choices, because all other values are then normalized to the minimum value.

Fig. 2. Single stage analysis. (a) Switching time. (b) Drive capability. (c) Single test bench used in simulation.

### B. Drive Capability

Circuit drivability is the ability of the circuit output to drive a given load properly. The less time the circuit takes to charge the output capacitor, the greater its drive capability. For this analysis, the output is connected to a capacitor, and the delay is measured while sweeping the load capacitance from 0 to 2 fF in steps of 0.4 fF [Fig. 2(c), with the output connected to the variable $C_{\rm L}$]. The delay is then plotted against the load capacitance. For example, Fig. 1(g) depicts the delay versus load capacitance for the unit inverter. The slope of this line is calculated [$\alpha$ in Fig. 1(g)]. The reciprocal of the slope is defined as the circuit drive capability ($Dr_o = 1/\alpha$, in fF/ps) in this paper. The same method is used to measure the drive capability of all the full adders considered.

Fig. 3. Multistage test benches for sum and carry signals.
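The slope extraction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function names `fit_delay_line` and `drive_capability` are hypothetical, the delay values are invented for the example, and an ordinary least-squares fit is assumed as the way to obtain the slope $\alpha$ from the swept points. The intercept of the fitted line is the delay at zero load.

```python
def fit_delay_line(loads_fF, delays_ps):
    """Least-squares fit of delay = td0 + alpha * load.

    Returns (alpha, td0): alpha in ps/fF, td0 (zero-load delay) in ps.
    """
    if len(loads_fF) != len(delays_ps) or len(loads_fF) < 2:
        raise ValueError("need at least two (load, delay) pairs")
    n = len(loads_fF)
    mean_c = sum(loads_fF) / n
    mean_t = sum(delays_ps) / n
    sxx = sum((c - mean_c) ** 2 for c in loads_fF)
    if sxx == 0:
        raise ValueError("load values must not all be equal")
    sxy = sum((c - mean_c) * (t - mean_t) for c, t in zip(loads_fF, delays_ps))
    alpha = sxy / sxx
    td0 = mean_t - alpha * mean_c
    return alpha, td0


def drive_capability(alpha_ps_per_fF):
    """Drive capability in fF/ps: the reciprocal of the delay-line slope."""
    if alpha_ps_per_fF <= 0:
        raise ValueError("slope must be positive")
    return 1.0 / alpha_ps_per_fF


# Load sweep from the text: 0 to 2 fF in 0.4 fF steps.
loads = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0]
# Hypothetical delays in ps, invented for illustration only.
delays = [5.0, 6.6, 8.2, 9.8, 11.4, 13.0]

alpha, td0 = fit_delay_line(loads, delays)
print(f"alpha = {alpha:.3f} ps/fF, zero-load delay = {td0:.3f} ps, "
      f"drive capability = {drive_capability(alpha):.3f} fF/ps")
```

A fit over all six sweep points is used here instead of a two-point slope so that small simulation noise in any single point has less influence on $\alpha$.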
In conventional logical effort, the gate delay of a C-CMOS circuit is modeled as follows:

$$\text{Delay} = g \cdot h \cdot b + p. \tag{1}$$

In this delay equation, $g$ is the logical effort: the ratio of the input capacitance of the template gate to that of the minimum-size reference inverter. The parameter $h$ is the electrical effort, which is the ratio of the output capacitance to the input capacitance. The parameter $b$ is the branching factor, which represents the number of fanouts that occur within a logic network [15], [16], and $p$ is the parasitic delay, which is the delay of the gate when it drives no output load [8]. In the proposed analysis, the delay of the sum and carry outputs of a hybrid full adder can be expressed as follows:

$$TD = TD_0 + (\alpha \times \text{C-Load}). \tag{2}$$

In this delay equation, $TD$ is the delay time and $TD_0$ is the delay time at zero load capacitance (the intrinsic delay), which corresponds to $p$ in conventional logical effort. The parameters $h$ and $b$ are not used in the proposed method. C-Load is the load capacitance, and $\alpha$ is the slope of the delay line versus C-Load. The drive capability plays a vital role in estimating full adder performance in a multistage structure: it determines how fast a particular cell charges the input capacitance of the next stage, like a current source, and it can model the *RC* of the cell together with C-Load. Fig. 4 shows the timing behavior of all the full adders considered in this paper for five different stages, based on (2).

### C. Input Capacitance

Input capacitance is an essential parameter for timing analysis and corresponds to $g$ in conventional logical effort. The test bench uses an inverter to drive each input of the full adder, with varying load capacitors connected to the sum and carry terminals.
The delay that the full adder input imposes on the driving inverter is measured, and the physical capacitor value that causes the same delay is taken as the input capacitance ($C_{in}$) [Fig. 1(f) and (g)]. The largest of the input capacitances is taken as the input capacitance of the full adder.

## III. GRAPHICAL ANALYSIS

The graphical analysis is presented to cope with the complexity of hybrid circuits and to give a better understanding of their structure. By categorizing circuits according to the proposed logical effort, designers get a clearer idea of which circuits have the potential to work faster in a multistage structure and which may lack drive and fail to work without extra buffers or inverters. To better understand the drive parameters between blocks in a multistage structure, the adder blocks selected in this paper are hybrid full adders with different input-output drive conditions. The main question of this section is therefore which circuit parameters the designer should consider to predict the speed of a hybrid-structure circuit efficiently before simulation or physical measurement. The proposed graphical analysis allows the necessary information about the drive capability and input capacitance of a circuit to be predicted by looking at the circuit structure. For example, it can be anticipated that both the C-CMOS and New-HPSC full adders offer good and equal drive capability because of the inherent inverter at the output. The input capacitance is predicted by drawing the drive path from the inputs toward the outputs (Fig. 1).
In some full adders, the inputs are connected directly to the gate terminals of the transistors, or reach transistor gates through source-drain connections, which fixes the input capacitance. In others, the inputs end up connected to the load capacitor, which makes the input capacitance vary with the load. Fig. 1 depicts the full adders with the drive paths drawn from the inputs toward the outputs. For TGA, in the drive path from the inputs to the sum output, inputs A and B are connected directly to the gates of the transistors or reach the gate terminals through source-drain connections. Input C, on the other hand, is connected to the output load capacitor through source-drain connections in addition to its connections to some gate terminals. Therefore, it can be predicted that inputs A and B will have a fixed, constant input capacitance, while the input capacitance of input C will vary with the load. Similarly, for the carry output, it can be predicted that input B will have a fixed, constant input capacitance, while the input capacitance of inputs A and C will vary with the load capacitor.

Fig. 4. Multistage analysis of full adders. (a)-(h) CM1-type full adders. (i)-(n) CM2-type full adders. (o), (p) CM3-type full adders. (q)-(s) CM4-type full adders. (t)-(v) Compressor analysis after connecting a buffer at the output.

Based on the drive paths from the inputs to the outputs, circuits are grouped into three categories.

1) CS1: Inputs are directly connected only to gate terminals of the transistors.
2) CS2: Inputs are directly connected to gate terminals or reach gate terminals through source-drain connections.
3) CS3: Inputs are connected to the load capacitor through source-drain connections, possibly in addition to connections to gate terminals.
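The three categories above can be written as a small decision rule. The sketch below is illustrative only: the `InputPath` record and both function names are hypothetical, and rating an adder by its worst-case input is an assumption made here for convenience. The connection counts for TGA are the ones listed in Table I, and the `reaches_load` flags follow the description of the TGA sum path given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputPath:
    """Summary of one input's drive path toward one output (hypothetical record)."""
    gate_connections: int          # direct connections to transistor gates
    source_drain_connections: int  # connections through source/drain terminals
    reaches_load: bool             # True if the path ends at the load capacitor


def classify_input(path: InputPath) -> str:
    """Return 'CS1', 'CS2', or 'CS3' for a single input path."""
    if path.reaches_load:
        return "CS3"  # input capacitance varies with the load
    if path.source_drain_connections > 0:
        return "CS2"  # reaches gates through source-drain connections
    return "CS1"      # gate terminals only


def classify_adder(paths: dict) -> str:
    """Assumption: an adder is rated by its worst-case input category."""
    return max(classify_input(p) for p in paths.values())


# TGA, sum output: A and B end at gate terminals, C reaches the load capacitor.
tga_sum = {
    "A": InputPath(gate_connections=3, source_drain_connections=4, reaches_load=False),
    "B": InputPath(gate_connections=4, source_drain_connections=0, reaches_load=False),
    "C": InputPath(gate_connections=2, source_drain_connections=4, reaches_load=True),
}

for name, path in tga_sum.items():
    print(name, classify_input(path))
print("TGA (sum path):", classify_adder(tga_sum))
```

Taking the maximum works because the labels sort in the order CS1 < CS2 < CS3, so a single load-connected input is enough to place the whole cell in CS3.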
The drive path for each of the inputs toward the output is shown in three different colors in Fig. 1. To support the graphical analysis of the circuits, the number of connections to the gate and drain-source terminals of the transistors, and the connections to C-Load, are listed in Table I. The simulation results of the $C_{\rm in}$ measurement for each input are tabulated in Table II.

TABLE I. INPUT CAPACITANCE: INPUT CONNECTIONS TO THE GATE, DRAIN/SOURCE, AND C-LOAD

| Full adder | # of connections to gate (A / B / C) | # of connections to drain-source (A / B / C) | Connection to C-Load |
|------------|--------------------------------------|----------------------------------------------|----------------------|
| C-CMOS     | 8 / 8 / 6                            | 0 / 0 / 0                                    | no                   |
| New-HPSC   | 14 / 14 / 6                          | 4 / 4 / 2                                    | no                   |
| TFA        | 5 / 7 / 2                            | 5 / 2 / 4                                    | yes                  |
| TGA        | 3 / 4 / 2                            | 4 / 0 / 4                                    | yes                  |
| New-14T    | 10 / 10 / 2                          | 4 / 4 / 4                                    | yes                  |

For the CS1-type full adders (C-CMOS), the input capacitance is fixed and can be determined using a simple method like conventional logical effort. For CS2-type full adders (New-HPSC), the input capacitance is also constant and can be determined using the proposed technique. For CS3-type full adders (TFA, TGA, and New-14T), the input capacitance is not constant and varies with the load capacitance. In this case, $C_{\rm in}$ is a function of the output load capacitor and has the following equation:

$$C_{\rm in} = C_{\rm in0} + \beta \cdot C_{\rm L}. \tag{3}$$

In this equation, $C_{\rm in0}$ is the value of $C_{\rm in}$ when the output load is zero ($C_{\rm L}=0$), and $\beta$ is the transformation coefficient from the output C-Load to the input capacitance ($C_{\rm in}$). This coefficient is measured with the same method described earlier for finding the equivalent $C_{\rm in}$ from the corresponding delay [Fig. 1(g)].
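As a small illustration of (3), the sketch below evaluates a load-dependent input capacitance and recovers $\beta$ from two operating points. The function names are hypothetical, and treating $\beta$ as the slope between two ($C_{\rm L}$, $C_{\rm in}$) points is an assumption about how the two-point measurement is carried out. The coefficients 0.44 fF and 0.2 are the Bulk CMOS sum-path values reported in Table II for input C of the TGA.

```python
def input_capacitance(cin0_fF, beta, load_fF):
    """Equation (3): C_in = C_in0 + beta * C_L, all capacitances in fF."""
    return cin0_fF + beta * load_fF


def beta_from_two_points(load1_fF, cin1_fF, load2_fF, cin2_fF):
    """Slope of C_in versus C_L through two measured points (assumed procedure)."""
    if load1_fF == load2_fF:
        raise ValueError("the two load values must differ")
    return (cin2_fF - cin1_fF) / (load2_fF - load1_fF)


# TGA input C, sum path, Bulk CMOS (Table II): C_in = 0.44 + 0.2 * C_L
CIN0, BETA = 0.44, 0.2

cin_no_load = input_capacitance(CIN0, BETA, 0.0)
cin_2fF = input_capacitance(CIN0, BETA, 2.0)
recovered_beta = beta_from_two_points(0.0, cin_no_load, 2.0, cin_2fF)

print(f"C_in at 0 fF load: {cin_no_load:.2f} fF")
print(f"C_in at 2 fF load: {cin_2fF:.2f} fF")
print(f"beta recovered from the two points: {recovered_beta:.2f}")
```

For CS1 and CS2 inputs the same function applies with `beta = 0`, which reduces (3) to a constant input capacitance.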
$\beta$ can be measured from two points on the delay curve. For CS3 category full adders, the timing behavior can be predicted by making the input capacitance constant, which can be done by connecting a buffer to the output. In the multistage analysis, the performance of any number of stages can be predicted by analyzing the simulation results of only 3 or 4 stages (see the graphs in Fig. 4). Fig. 1(g) shows the delay versus $C_{\rm L}$. In this graph, $\alpha$ is the slope of the delay versus $C_{\rm L}$ line and $TD_0$ is the intrinsic delay. Fig. 1(f1) and (f2) show the test bench for the $C_{\rm in}$ measurement: Fig. 1(f1) shows the delay measurement of the circuit under test, and Fig. 1(f2) shows the delay measurement with the $C_{\rm L}$ variation.

TABLE II. INPUT CAPACITANCE SIMULATION RESULTS (fF)

| Full adder | Input | Bulk CMOS, sum      | Bulk CMOS, carry    | FinFET, sum         | FinFET, carry       |
|------------|-------|---------------------|---------------------|---------------------|---------------------|
| C-CMOS     | A     | 0.857               | 0.857               | 1.01                | 1.01                |
|            | B     | 0.815               | 0.815               | 0.94                | 0.94                |
|            | C     | 0.618               | 0.618               | 0.82                | 0.820               |
| New-HPSC   | A     | 0.653               | 0.653               | 0.707               | 0.707               |
|            | B     | 0.406               | 0.406               | 0.156               | 0.156               |
|            | C     | 0.275               | 0.275               | 0.362               | 0.362               |
| TFA        | A     | 0.668               | $0.59 + 0.3\,C_L$   | 0.975               | $0.621 + 0.3\,C_L$  |
|            | B     | $0.25 + 0.2\,C_L$   | 0.421               | $0.16 + 0.4\,C_L$   | 1.12                |
|            | C     | $0.4 + 0.25\,C_L$   | $0.43 + 0.15\,C_L$  | $0.12 + 0.45\,C_L$  | $0.12 + 0.45\,C_L$  |
| TGA        | A     | 0.953               | $0.89 + 0.2\,C_L$   | 0.870               | $0.73 + 0.22\,C_L$  |
|            | B     | 0.566               | 0.566               | 0.240               | 0.240               |
|            | C     | $0.44 + 0.2\,C_L$   | $0.51 + 0.2\,C_L$   | $0.27 + 0.45\,C_L$  | $0.27 + 0.45\,C_L$  |
| New-14T    | A     | 0.595               | $0.6 + 0.5\,C_L$    | 0.715               | $0.54 + 0.45\,C_L$  |
|            | B     | 0.595               | 0.595               | 0.418               | 0.418               |
|            | C     | $0.4 + 0.5\,C_L$    | $0.46 + 0.5\,C_L$   | $0.15 + 0.55\,C_L$  | $0.15 + 0.55\,C_L$  |

Based on the analysis of the delay for the first four stages, circuits are grouped into the following four categories.

1) CM1: Constant $\alpha$ and constant $TD_0$ for all stages.
2) CM2: Constant $\alpha$ with predictable variation in $TD_0$ from stage to stage.
3) CM3: Predictable variation in $\alpha$ and $TD_0$ from stage to stage.
4) CM4: Unpredictable variation in $\alpha$ and $TD_0$ from stage to stage.

The best circuit is clearly the one with constant $\alpha$ and constant $TD_0$ across stages. Such a circuit keeps its drivability constant in a multistage structure and has stable timing behavior.

## IV. SIMULATION AND RESULTS ANALYSIS

### A. Experimental Setup

HSPICE is used as the circuit simulator for all simulations in this paper. The 32-nm MOS predictive technology model (PTM) and the 32-nm shorted-gate FinFET PTM are considered as two different technologies for comparison. For the single-stage analysis, all full adders are simulated on the single test bench shown in Fig. 2(c), which includes input and output buffers. All 56 possible transitions are applied for the delay and power measurements. The rise and fall times of the input pulses are set to 10 ps, and the power supply is set to 0.9 V.

### B. Switching Time

The switching time results for the different full adders are shown in Fig. 2(a). The New-HPSC full adder designed in FinFET technology offers better switching speed for the sum output, and the C-CMOS full adder designed in FinFET technology offers better switching speed for the carry output. Full adders designed in FinFET technology provide more than 50% improvement in switching speed over the full adders designed in Bulk CMOS technology because of their higher $I_{\rm ON}/I_{\rm OFF}$ ratio.

### C. Drive Capability

Fig. 2(b) confirms that full adders with inverters inherent in their structure, such as C-CMOS and New-HPSC, have greater drive capability, provided by the inverter itself, whereas the other hybrid full adders have lower drive capability. Fig. 2(b) also confirms that FinFET full adders show more than 40% improvement in drive capability over Bulk CMOS full adders because of their ability to produce more current at smaller dimensions and lower input voltage.

### D. Input Capacitance

The predictions in Table I and the simulation results in Table II show that, for some inputs, the input capacitance does not change with the load, because those inputs are connected directly to the gates of the transistors or reach the gates through source or drain terminals. For other inputs, the input capacitance varies with the load capacitor because of their direct connection to the load capacitor through drain-source terminals. Table I confirms that the C-CMOS (CS1) and New-HPSC (CS2) full adders have fixed input capacitance because no input is connected to the load capacitor. The TGA, TFA, and New-14T full adders (CS3) have variable input capacitance for some of the inputs because of their connection to the load capacitor through source-drain connections. The predictions match the simulation results. CS3 category full adders are therefore expected to behave differently at each stage in the multistage analysis.

## V. MULTISTAGE PERFORMANCE PREDICTION PARAMETERS

### A. Gain

High drive capability and low input capacitance on the single test bench are the desired parameters for good performance in a multistage structure. Gain is the multistage performance prediction parameter, defined as the ratio of the output drive capability to the input capacitance. The better the Gain, the better the circuit's performance in the multistage structure.
Notably, increasing the transistor sizes increases the drivability but has a negative effect on the input capacitance. Since there is no way to control the drivability without affecting the input capacitance, Gain is the best parameter for capturing the effect of sizing on both drivability and input capacitance simultaneously:

$$\text{Gain}(G) = \frac{Dr_o}{C_{\rm in}} \tag{4}$$

where $Dr_o$ is the drive capability of each output terminal and $C_{\rm in}$ is the input capacitance of each input.

Fig. 5. (a) Gain versus K for sum signal. (b) Gain versus K for carry signal. (c) SF normalized with $10^{30}$.

The gain simulation results for FinFET technology in Fig. 5(a) and (b) confirm that the performance prediction factor $G$ increases with the transistor sizes, and that the rate of variation is greater for the carry signal than for the sum signal in all full adders. In other words, the carry outputs are more sensitive to transistor size variation than the sum outputs. In this method, the coefficient $K$ scales the size of all circuit transistors simultaneously, with the technology limit as the base. The designer can therefore use this parameter to manage the tradeoff between speed and power. Regardless of other cell characteristics, such as energy consumption, Gain is the crucial parameter for the proper operation of the cell in a multistage structure. A better cell needs minimum input drive and provides maximum output drivability. Fig. 5(a) and (b) show that the New-HPSC circuit has the better Gain in both the sum and carry cases.

### B. Selection Factor

The selection factor (SF) is introduced in this paper as a new parameter for the informed selection of a cell. It is defined as follows:

$$SF = \frac{\text{Gain}}{\text{PDP}} = \frac{Dr_o/C_{\rm in}}{\text{Power} \times Td} \tag{5}$$

where Gain is the multistage performance prediction parameter and PDP is the energy consumption. The PDP is used for transistor sizing, to optimize the circuit for both power and delay. The SF covers both Gain and PDP, and serves as a criterion for picking the best circuit based on minimum energy consumption and maximum gain. The SF can differ between the sum and carry outputs depending on the circuit structure. The simulation results for the SF of the sum and carry signals in both CMOS and FinFET technologies are shown in Fig. 5(c).

For some full adders, such as TGA and TFA, the Gain can be controlled by inverters at the input of the cell. In others, such as C-CMOS and New-HPSC, the Gain can be controlled by inverters at the output of the cell. Still others, such as New-14T, have an inverter at neither the input port nor the output port. The simulation results show that this circuit does not operate properly in a multistage structure because of its lack of gain. A buffer is used at the output port to improve the performance of the New-14T full adder, and the simulation results in Fig. 5 confirm that, with the output buffer, the New-14T full adder has the maximum SF and is the best circuit for both the sum and carry outputs. Equation (5) shows that SF is a function of four different parameters, and the designer can control the effect of each parameter for different applications. Note, however, that Fig. 5(c) shows New-14T as the best circuit when SF is formed by (5), in which all four parameters carry equal weight. By increasing or decreasing the weight of each parameter, SF can be adjusted to suit the application.

### C. Cell Characterization With SF

In semicustom design with a standard cell library, SF is a crucial parameter for cell characterization and library generation. In this paper, SF is defined as a selection or optimization parameter that is a function of the four variables in (5). These four variables are controllable through cell transistor sizing, but logical effort, the conventional transistor sizing method, does not work for hybrid-structure adders because of the irregularity and complexity of their structure. More flexible optimization algorithms, such as the SEA [10], should be used for transistor sizing. The flexibility is needed for target metric selection, device-circuit co-optimization, different variables for different technologies, multiobjective optimization, and circuits with hybrid structures [6], [8], [10]. Traditional standard cell libraries are characterized for five different modes, which are five options for the tradeoff between power and speed. These cells are characterized as Typical, Fast-Fast, Slow-Slow, Fast-Slow, and Slow-Fast, controllable through transistor sizing [17]. In contrast, all four variables of the SF in (5) ($Dr_o$, $C_{\rm in}$, Power, and $Td$) have equal weight in regular optimization. In this situation, optimization through transistor sizing may yield a kind of typical cell, balanced between power and speed and operating properly in a multistage structure. Optimization based on the selection factor gives the designer the flexibility to tune the optimization through the weight of each SF parameter. The selection factor can be redefined as (6), which has $n_1$, $n_2$, $n_3$, and $n_4$ as the effect factors of the four SF parameters. Each effect factor $n_i$ ($i$ = 1, 2, 3, and 4) can take a value from 0 up to a reasonable maximum such as 3.
For example, when ni = 1, the related variable has a regular impact on the optimization process; when ni > 1 (for example, 1.2 or 1.4), that variable has more effect on the optimization process and the cell characteristics. The minimum value of the effect factor ni is "0," which removes the related variable from the SF and from the optimization process completely. In this situation, the optimization is based on the other three parameters of the SF equation. For example, if n3 for power is set to "0" and n4 = 1, the optimization moves toward the Fast-Fast corner because power is not considered in the optimization process

$$SF = \frac{Gain}{PDP} = \frac{Dr_o^{n1}/C_{in}^{n2}}{Power^{n3} \times Td^{n4}}.$$ (6)

In traditional standard cell libraries, in addition to the typical and corner characterizations, cells may be categorized by fan-out, such as 1, 2, 4, 8, and so on. In the proposed method for cells with a hybrid structure, however, $Dr_o$ and $C_{in}$ in the SF equation are the parameters that characterize the switching behavior of the cell in a multistage structure and thereby account for fan-out. More drivability and less input capacitance make the cell switch faster in a multistage structure. The effect factors n1 and n2 control the cell drivability and input capacitance in the transistor sizing process. For example, if n3 = 1, n4 = 2, and $Dr_o = C_{in} = 1$, the optimization is set for the energy-delay product (EDP), with the output drivability and input capacitance treated as typical. In this situation, the cell may have the best sizes for optimum EDP, but it does not have the best switching behavior in a multistage structure, and this unreliable operation worsens as the cell fan-out in the multistage structure increases.
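The weighting in (6) can be illustrated with a short Python sketch. The function name `selection_factor`, its argument names, the non-negativity check, and the numeric values are hypothetical and chosen only for illustration; they are not measurements or tooling from this paper.

```python
def selection_factor(dr_o, c_in, power, td, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted selection factor following the form of (6).

    weights = (n1, n2, n3, n4) are the effect factors applied to
    Dr_o, C_in, Power, and Td. All-ones weights reduce to (5).
    """
    n1, n2, n3, n4 = weights
    if min(weights) < 0:
        # Assumption: effect factors start at 0, as described in the text.
        raise ValueError("effect factors are assumed to be non-negative")
    gain = dr_o ** n1 / c_in ** n2   # weighted Gain term
    cost = power ** n3 * td ** n4    # weighted PDP-like term
    return gain / cost


# Hypothetical, unitless values for a single cell (not from the paper).
cell = dict(dr_o=2.0, c_in=4.0, power=0.5, td=2.0)

sf_regular = selection_factor(**cell)                         # equal weights, as in (5)
sf_no_power = selection_factor(**cell, weights=(1, 1, 0, 1))  # n3 = 0 removes Power
sf_edp = selection_factor(**cell, weights=(1, 1, 1, 2))       # n4 = 2 weights delay as in EDP
```

With n3 = 0 the result no longer depends on `power`, which mirrors the Fast-Fast example above. In a sizing loop, a function of this kind would be evaluated for each candidate sizing and the candidate with the largest SF retained.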
Increasing the transistor sizes in the cell yields more drivability and more speed, but the added area also increases $C_{in}$, which in turn slows the cell. Transistor sizing can manage this effect by controlling $C_{in}$ in the SF along with the other parameters. In the proposed method, transistor sizing based on SF controls $Dr_o$ and $C_{in}$ together for different input pulse transitions. Transition time, one of the cell library characteristics, is thereby covered, because it is directly connected to the charging and discharging of the input capacitances for triggering. A smaller $C_{in}$ and a larger $Dr_o$ give a strong input trigger for fast switching, which is like applying input pulses with a fast transition time. Under this condition, cell switching improves for high fan-out structures.

TABLE III
DRIVABILITY AND SWITCHING TIME VARIATION

| Full Adder Type | Bulk CMOS sum Δα | Bulk CMOS sum ΔTD<sub>o</sub> | Bulk CMOS carry Δα | Bulk CMOS carry ΔTD<sub>o</sub> | FinFET sum Δα | FinFET sum ΔTD<sub>o</sub> | FinFET carry Δα | FinFET carry ΔTD<sub>o</sub> |
|-----------------|----|----|------|----|-------|----|-------|-----|
| C-CMOS | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| New-HPSC | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| TFA | 5 | 0 | 12.5 | 6 | -0.75 | 0 | -0.75 | 0.5 |
| TGA | 4 | 0 | 15 | 0 | -1 | 0 | -1.25 | 0 |
| New-14T | NA | NA | NA | NA | NA | NA | NA | NA |

### VI. MULTISTAGE ANALYSIS

The multistage analysis is performed using a 6:2 compressor [Fig. 3(a)] and a 4-bit RCA [Fig. 3(b)]. In the 6:2 compressor the sum signal is on the critical path, and in the 4-bit RCA the carry signals are. To better understand the timing behavior of the cells as the number of stages increases, the transition time of all input pulses is fixed. An input buffer is used for all input pulses, and the capacitive load is swept from 0 to 2 fF for each output stage.
For single test bench measurements, such as the $C_{in}$ measurement, the unit inverter is used. For the multistage analysis, which needs more drivability at the first stage, a buffer with W/L sizes of 5/3 and 12/5 [10] is used, and the delay of the first stage includes the buffer's delay as well.

### A. Full Adders in Bulk CMOS Technology

The simulation results for the sum and carry outputs of the full adders are shown in Fig. 4(a), (b), (e), (f), (i), (k), (l), (o), and (q). Fig. 4 shows that the C-CMOS and New-HPSC full adders have the same slopes on their output delay lines. This constant slope results from the output inverters, which set a specific drivability. From stage 2 to stage 4, all lines coincide because $\alpha$ and TD<sub>o</sub> are constant for all stages, so the same behavior can be predicted for any number of stages. In the compressor analysis of the TGA and TFA full adders, all four stages clearly behave differently, but there is a predictable variation in $\alpha$ and a constant TD<sub>o</sub> from stage to stage. In the RCA analysis of the TGA full adder, there is likewise a predictable variation in $\alpha$ and a constant TD<sub>o</sub> from stage to stage. In the RCA analysis of the TFA full adder, there is a predictable variation in both $\alpha$ and TD<sub>o</sub> from stage to stage. Therefore, the behavior of the C-CMOS, New-HPSC, TFA, and TGA full adders can be predicted for any number of stages.

Because of its poor drive and high input capacitance, the New-14T full adder performs extremely poorly and shows improper outputs in the compressor analysis. In the RCA analysis of the New-14T full adder, there is an unpredictable variation in $\alpha$ and TD<sub>o</sub> from stage to stage. The behavior of the New-14T full adder becomes predictable when an output buffer is connected, but the buffer adds area and power overhead to the cell.
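As an illustration of how the slope $\alpha$ and TD<sub>o</sub> of each stage might be extracted from a load sweep, the sketch below fits a straight line to delay versus load capacitance and reports the stage-to-stage change. It assumes that the stage delay is approximately linear in the load, Td ≈ α·C_L + TD_o. This linear form, the function names, the units, and the data are assumptions for illustration, not values or procedures taken from the simulations.

```python
def fit_delay_line(loads, delays):
    """Least-squares fit of delay = alpha * load + td0.

    Returns (alpha, td0): the slope of the delay line and its intercept
    at zero load. Needs at least two distinct load values.
    """
    n = len(loads)
    mean_x = sum(loads) / n
    mean_y = sum(delays) / n
    sxx = sum((x - mean_x) ** 2 for x in loads)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(loads, delays))
    alpha = sxy / sxx
    td0 = mean_y - alpha * mean_x
    return alpha, td0


def stage_variation(fits):
    """Differences (d_alpha, d_td0) between consecutive stage fits."""
    return [(a2 - a1, t2 - t1) for (a1, t1), (a2, t2) in zip(fits, fits[1:])]


# Hypothetical sweep: load from 0 to 2 fF, delays in arbitrary time units.
loads_fF = [0.0, 0.5, 1.0, 1.5, 2.0]
stage_delays = [
    [10.0, 12.0, 14.0, 16.0, 18.0],  # stage 1 (made-up data)
    [10.0, 12.5, 15.0, 17.5, 20.0],  # stage 2 (made-up data)
]

fits = [fit_delay_line(loads_fF, d) for d in stage_delays]
variation = stage_variation(fits)
```

A cell whose `variation` entries are all zero would correspond to the constant-slope behavior described for C-CMOS and New-HPSC, while a steady nonzero change in slope would correspond to the predictable variation described for TGA and TFA.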
The variation in slope and TD<sub>o</sub> from stage to stage for all of the full adders is listed in Table III. Table III shows that C-CMOS and New-HPSC are the two circuits with no variation in drivability or inherent delay, which indicates consistent timing behavior in a multistage structure. They function well in different structures and keep their speed across stages. Adding buffers not only improves the output drive capability but also fixes the input capacitance of the full adders. Fig. 3(t)–(v) affirms that $\alpha$ and TD<sub>o</sub> can be made constant across all stages for any circuit by connecting an output buffer.

# B. Full Adders in FinFET Technology

The compressor and RCA analysis plots [Fig. 4(c), (d), (g), (h), (j), (m), (n), (p), (r), and (s)] show that all the full adders built in FinFET technology exhibit the same behavior as in Bulk CMOS technology. For the TGA, TFA, and New-14T full adders, the delay decreases as the number of stages increases because of their excellent drive capability and low input capacitance. The analysis in Table II demonstrates that the slope decreases predictably from stage to stage for the TFA and TGA full adders.

# C. Comparison of Bulk CMOS and FinFET Technologies

The comparison shows that FinFET full adders outperform full adders designed in Bulk CMOS technology. FinFET full adders offer shorter switching times than Bulk CMOS full adders because of their high $I_{\rm ON}/I_{\rm OFF}$ ratio. They also offer more than 40% greater drive capability, because they can produce more current at smaller dimensions with a lower input voltage.
Although the input capacitance is higher for FinFET full adders than for Bulk CMOS full adders, both $Dr_o/C_{in}$ and Gain/PDP are better for the FinFET full adders.

### VII. CONCLUSION

In this paper, a new logical effort analysis is proposed for hybrid-structure adder circuits. The central question answered in this paper is: if hybrid full adders are faster and more energy- and area-efficient than C-CMOS ones, which parameters can the designer consider and control, with a method as simple as conventional logical effort, to make sure they are reliable blocks for use in a multistage structure? This analysis is essential for estimating the performance of full adders in larger structures such as wide adders, compressors, and multipliers. Based on the simulation results from the single test bench and their comparison with the multistage analysis of the full adders, their timing behavior can be modeled using three quantities: switching time, input capacitance, and output drivability. All three are measurable on the single test bench, and with them it is possible to predict the timing behavior of the circuit in multistage structures. The Gain is defined as a new parameter, the ratio of output drivability to input capacitance, which can be controlled by transistor sizing. By controlling the Gain, the designer can control the tradeoff between the energy and the performance of the circuit. The selection factor is then introduced as a design and optimization parameter, the ratio of the Gain to the energy consumption. In addition, a graphical classification of the hybrid cells, based on the proposed timing analysis method, indicates which circuit has more potential for high performance simply by inspecting the circuit, whatever its complexity.
The predictive analysis is confirmed by HSPICE simulation of the selected adder cells in both CMOS and FinFET technologies.

#### REFERENCES

- [1] C.-H. Chang, J. Gu, and M. Zhang, "A review of 0.18-μm full adder performances for tree structured arithmetic circuits," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 13, no. 6, pp. 686–695, Jun. 2005.
- [2] C.-K. Tung, Y.-C. Hung, S.-H. Shieh, and G.-S. Huang, "A low-power high-speed hybrid CMOS full adder for embedded system," in *Proc. IEEE Design Diagnostics Electron. Circuits Syst. (DDECS)*, Krakow, Poland, Apr. 2007, pp. 1–4.
- [3] C.-K. Tung, S.-H. Shieh, and C.-H. Cheng, "Low-power high-speed full adder for portable electronic applications," *Electron. Lett.*, vol. 49, no. 17, pp. 1063–1064, Aug. 2013.
- [4] S. Goel, A. Kumar, and M. Bayoumi, "Design of robust, energy-efficient full adders for deep-submicrometer design using hybrid-CMOS logic style," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 14, no. 12, pp. 1309–1321, Dec. 2006.
- [5] P. Bhattacharyya, B. Kundu, S. Ghosh, V. Kumar, and A. Dandapat, "Performance analysis of a low-power high-speed hybrid 1-bit full adder circuit," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 23, no. 10, pp. 2001–2008, Oct. 2015.
- [6] K. Haghshenas, M. Hashemi, and T. Nikoubin, "Fast and energy-efficient CNFET adders with CDM and sensitivity-based device-circuit co-optimization," *IEEE Trans. Nanotechnol.*, vol. 17, no. 4, pp. 783–794, Jul. 2018.
- [7] Y. S. Mehrabani and M. Eshghi, "Noise and process variation tolerant, low-power, high-speed, and low-energy full adders in CNFET technology," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 24, no. 11, pp. 3268–3281, Nov. 2016.
- [8] T. Nikoubin, M. Grailoo, and C. Li, "Energy and area efficient three-input XOR/XNORs with systematic cell design methodology," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 24, no. 1, pp. 398–402, Jan. 2016.
- [9] T. Nikoubin, M.
Grailoo, and S. H. Mozafari, "Cell design methodology based on transmission gate for low-power high-speed balanced XOR-XNOR circuits in hybrid-CMOS logic style," *J. Low Power Electron.*, vol. 6, no. 4, pp. 503–512, 2010.
- [10] T. Nikoubin, P. Bahrebar, S. Pouri, K. Navi, and V. Iravani, "Simple exact algorithm for transistor sizing of low-power high-speed arithmetic circuits," in *Proc. VLSI Design*, Jan. 2010, p. 3.
- [11] I. Sutherland, B. Sproull, and D. Harris, *Logical Effort: Designing Fast CMOS Circuits*. San Francisco, CA, USA: Morgan Kaufmann, 1999.
- [12] X. Lin, Y. Wang, S. Nazarian, and M. Pedram, "An improved logical effort model and framework applied to optimal sizing of circuits operating in multiple supply voltage regimes," in *Proc. 15th Int. Symp. Quality Electron. Design*, Santa Clara, CA, USA, Mar. 2014, pp. 249–256.
- [13] R. M. Anacan and J. L. Bagay, "Logical effort analysis of various VLSI design algorithms," in *Proc. IEEE Int. Conf. Control Syst., Comput. Eng. (ICCSCE)*, George Town, Malaysia, Nov. 2015, pp. 19–23.
- [14] A. B. A. Tahrim and M. L. P. Tan, "Design and implementation of a 1-bit FinFET full adder cell for ALU in subthreshold region," in *Proc. IEEE Int. Conf. Semiconductor Electron. (ICSE)*, Kuala Lumpur, Malaysia, Aug. 2014, pp. 44–47.
- [15] S. Maheshwari, J. Patel, S. K. Nirmalkar, and A. Gupta, "Logical effort based power-delay-product optimization," in *Proc. Int. Conf. Adv. Comput., Commun. Inform. (ICACCI)*, New Delhi, India, Aug. 2014, pp. 565–569.
- [16] R. Uma and P. Dhavachelvan, "Performance evaluation of full adders in ASIC using logical effort calculation," in *Proc. Int. Conf. Recent Trends Inf. Technol. (ICRTIT)*, Chennai, India, Jul. 2013, pp. 612–618.
- [17] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolić, *Digital Integrated Circuits: A Design Perspective*, 2nd ed. Upper Saddle River, NJ, USA: Pearson, 2003.

**Hareesh-Reddy Basireddy** received the B.E.
degree in electronics and communication engineering from Jawaharlal Nehru Technological University, Anantapur, India, in 2012, the M.S. degree in VLSI design from the Visvesvaraya National Institute of Technology, Nagpur, India, in 2015, and the M.S. degree in electrical engineering from Texas Tech University, Lubbock, TX, USA, in 2018. His current research interests include low-power and energy-efficient digital very large scale integration design and optimization.

**Karthikeya Challa** received the B.E. degree in electronics and communication engineering from Jawaharlal Nehru Technological University, Hyderabad, India, in 2013 and the M.S. degree in electrical engineering from Texas Tech University, Lubbock, TX, USA, in 2015. He is currently an Electronics Design and Validation Engineer at SL America Corporation, Auburn Hills, MI, USA. His current research interests include low-power and high-performance application-specific integrated circuit design.

**Tooraj Nikoubin** (M'14–SM'15) received the B.Sc. and M.Sc. degrees in electronic engineering from the Electrical, Computer Engineering Department, K. N. Toosi University of Technology, Tehran, Iran, in 1991 and 1994, respectively, and the Ph.D. degree in computer engineering from Shahid Beheshti University, Tehran, in 2009. In 2012, he joined the Electrical and Computer Engineering Department, Texas Tech University, Lubbock, TX, USA. His current research interests include nanoelectronics and energy-efficient computing, very large scale integration circuit and system design and optimization, computer architecture, and wearable electronics for health and assistance.