# MATLAB INTERFACE FOR POWER AND TIMING ANALYSIS OF DIGITAL CIRCUITS

Gaurav Verma<sup>1</sup>, Sourav Abhishek<sup>2</sup>, Abhishek Chauhan<sup>3</sup>, Anushka Singh<sup>4</sup>, Manya Mehta<sup>5</sup>

Department of Electronics & Communication Engineering, Jaypee Institute of Information Technology, A-10, Sector-62, Noida (U.P.), India.

*Abstract*: Advances in CMOS technology have made it possible to put millions of transistors on a single silicon chip. This has drastically increased device performance and operating speed. On the other hand, putting more transistors on a chip increases power consumption, so the designer faces a trade-off between performance and power. For reconfigurable hardware such as FPGAs the situation is particularly severe and demands attention. This paper presents optimization techniques that are applied to FPGAs at different levels of abstraction. Benchmark circuits (ALU, Register, Counter and RAM) are used for experimental measurements to validate the results. After simulation and power analysis of the benchmark circuits at different frequencies, a power aware utility software is developed that optimizes power while keeping performance in consideration at a given frequency for the selected FPGA. The circuits have been implemented in VHDL, and simulation is carried out using Xilinx ISE 14.1 targeting Virtex-4, Virtex-5 and Artix-7 FPGAs.

Keywords: XPower, STA, VHDL, FPGA, I/O Standards.

## 1. Introduction

Since the mid-1980s, when FPGAs were first introduced, their popularity has grown very quickly, and they now account for more than half of the 3 billion dollar programmable logic industry. FPGAs are programmable logic devices that can implement digital circuits with millions of gates and can operate at speeds in the hundreds of megahertz. Custom ASICs are generally regarded as their primary competitor, but ASICs need weeks or months for fabrication, whereas FPGAs can be programmed in seconds and reprogrammed any number of times. This is a key advantage of FPGAs over ASICs because it reduces time to market, a crucial requirement in the electronics industry for the development of new products. However, FPGAs consume a large amount of power, so reducing power dissipation while maintaining performance has become a challenge for designers. Programmability brings significant hardware overhead in the form of a large number of interconnects and programmable switches, and the generic logic structures in FPGAs consume more power than the dedicated circuitry used in ASICs. Power has therefore been called a limiting factor in the ability of FPGAs to replace ASICs. For FPGAs to remain competitive with Application Specific Integrated Circuits, this problem must be handled with power optimization techniques that maintain performance. This motivates the present power and timing analysis, which applies power minimization techniques together with timing constraints. The analysis is performed on benchmark circuits by applying power reduction techniques at different operating frequencies from a processor point of view. A Power Aware Utility software has been developed, and a GUI is provided for a user-friendly environment.

## 2. Background and Related Work

Power consumption in CMOS circuits can be classified as either dynamic or static. Static power, also called leakage power, is dissipated when a logic circuit is in a quiescent state. The main leakage mechanisms in a MOS transistor are sub-threshold leakage, gate oxide leakage, and junction leakage. Dynamic power, on the other hand, is due to the logic transitions that occur on the signals of a logic circuit. Such transitions occur as a normal part of useful computation, and dynamic power scales in proportion to the rate of computation. Dynamic power is consumed through two mechanisms: short-circuit current and the charging and discharging of capacitance. We concentrate on dynamic power here since it accounts for the major part of the total consumed power.
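The capacitive part of dynamic power is commonly modelled as P = α·C·V²·f, where α is the switching activity, C the switched capacitance, V the supply voltage and f the clock frequency. The following is a minimal sketch of that standard model; the function name and the sample values are illustrative assumptions and are not measurements from this work.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Switching power in watts using the common model alpha * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Illustrative values only (not from the paper):
# 10 pF of switched capacitance, 10 % activity, 1.2 V supply.
p_72 = dynamic_power(0.1, 10e-12, 1.2, 72e6)
p_100 = dynamic_power(0.1, 10e-12, 1.2, 100e6)
print(f"{p_72:.3e} W at 72 MHz, {p_100:.3e} W at 100 MHz")
```

The model makes the two levers visible: power grows linearly with frequency and activity, and quadratically with supply voltage, which is why low-voltage I/O standards and reduced switching are both attractive.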

## 3. Literature Survey

Lamoureux *et al.* [1] give an overview of low power techniques from the FPGA point of view. They describe the FPGA architecture and the basics of power dissipation in it, and cover system level and device level design techniques that mainly target commercial applications. Anderson [2] presented a number of power optimization and prediction techniques for FPGAs covering dynamic as well as leakage power. Two CAD techniques are proposed there that reduce leakage power with no effect on area, speed, or fabrication cost. Al-Khaleel *et al.* [3] focused on an efficient binary coded decimal digit adder. The authors proposed two designs using VHDL and Xilinx ISE 10.1 targeting the Xilinx Virtex-5 XC5VLX30-3 FPGA. The first design minimizes the delay of the adder and follows a CAD flow, i.e. generation of a truth table and Boolean expressions that are then expressed in VHDL code and simulated. The second design addresses area efficiency using Look Up Tables. Huda *et al.* [4] discussed clock gating architectures and how they can be used for power reduction with the help of a flexible placement algorithm that operates with various gating granularities. Pandey *et al.* [9] give three power reduction methods: clock gating, clock enable and blocking inputs. These reduce power consumption by turning off part of a system, switching between active and standby modes, or disabling the input path of the circuit. Clock gating is applied to reduce the dynamic power consumption of the target design, which is a 90 nm Spartan-3. Pandey *et al.* [10] dealt with mapping, which means optimizing one line of code. For low power design, a low fan-out Clock Enable is necessary, and there are two strategies for this. The first is to use synthesis attributes to control the use of control signals at the signal or module level. The second is coding (behavioral HDL or dataflow HDL) for low fan-out Clock Enable. Madhok *et al.* [12] used capacitance scaling as a power reduction technique with different I/O standards such as HSTL\_I, LVTTL, LVCMOS33 and SSTL15 for designing a bio-medical wrist watch on a 28 nm FPGA that supports the Internet of Things. Aggarwal *et al.* [13] designed a green and energy efficient ECG machine on FPGA using different logic families. Gupta *et al.* [14] presented a counter design based on voltage scaling to improve energy efficiency. Verma *et al.* [15] used a thermal aware approach in RAM design and tested thermal stability at different ambient temperatures; they also checked the compatibility of the device with wireless networks by carrying out the experiment at different processor frequencies. Mishra *et al.* [16] addressed the power efficiency of a BCD adder at the architectural level through techniques such as pipelining and parallelism. Verma *et al.* [17] presented low power techniques that can be applied to different target platforms at different levels of the design hierarchy. Verma *et al.* [18] performed power analysis by scaling parameters such as voltage, frequency, load capacitance and airflow for power optimization.

Gyancity Journal of Engineering and Technology, Vol.3, No.1, pp. 8-16, January 2017 ISSN: 2456-0065 DOI: 10.21058/gjet.2017.31002

## 4. Low Power Techniques Used

#### A. Clock Gating

Dynamic power can be reduced by clock gating, which temporarily turns off inactive parts of the design or puts unused modules in standby mode. It is a simple technique that relies on a *clock enable* signal. Clock gating [4] can be used to control switching activity at the function unit level by inhibiting input updates to function units whose outputs are not needed for a given operation.



Fig. 1 (a) Without Clock Gating, (b) With Clock Gating
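The effect of inhibiting input updates can be illustrated with a small behavioural model. The sketch below is not the VHDL used in this work; it is a hypothetical Python model in which a register either loads every cycle (no gating) or only in cycles where an enable flag is set, and it counts the resulting output bit toggles as a rough proxy for switching activity. The data words and enable pattern are made up for illustration.

```python
def register_toggles(data, enables, gated):
    """Count output bit toggles of a register fed one word per clock cycle.

    Without gating the register loads on every cycle; with gating it loads
    only in cycles whose enable flag is set and otherwise holds its value.
    """
    q = 0          # register output, assumed to start at zero
    toggles = 0
    for word, en in zip(data, enables):
        if en or not gated:
            toggles += bin(q ^ word).count("1")
            q = word
    return toggles


# Hypothetical stimulus: the result is only needed in the first and last cycle.
data = [0b1010, 0b0101, 0b1111, 0b0000]
enables = [1, 0, 0, 1]

print("toggles without gating:", register_toggles(data, enables, gated=False))
print("toggles with gating:   ", register_toggles(data, enables, gated=True))
```

In this toy stimulus the gated register skips the two cycles whose results are not needed, so it toggles fewer output bits; the actual saving in hardware depends on the design and on how the enable is implemented.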

#### B. Selection of I/O Standards

The Virtex<sup>™</sup> FPGA series provides a wide variety of selectable I/O standards [5][6][7][8] through a highly configurable, high performance I/O resource. The selection of the I/O standard has a great impact on power dissipation. The selection is made in the User Constraints File (UCF), which provides a complete and realistic set of constraints for the utilisation and performance of both I/O logic and core logic. Virtex devices support a number of I/O standards such as LVTTL, LVCMOS, HSTL and PCI; three of them are used for analysis here: LVCMOS, HSTL and SSTL.

#### 1. LVCMOS (Low Voltage Complementary Metal Oxide Semiconductor)

It is a widely used switching standard implemented in CMOS transistors and defined by JEDEC (JESD 8-5). The LVCMOS standards supported by Virtex-4 FPGAs are LVCMOS33, LVCMOS25, LVCMOS18 and LVCMOS15.

#### 2. HSTL (High Speed Transceiver Logic)

The HSTL standard is a general purpose high-speed bus standard sponsored by IBM (EIA/JESD8-6). The following classes are defined for HSTL:

- 1. Class I (unterminated, or symmetrically parallel terminated)
- 2. Class II (series terminated)
- 3. Class III (asymmetrically parallel terminated)
- 4. Class IV (asymmetrically double parallel terminated)

#### 3. SSTL (Stub Series Terminated Logic)

Stub Series Terminated Logic (SSTL) is a group of electrical standards used to drive transmission lines. They are commonly used with DRAM based DDR memory ICs and memory modules. Four voltage levels are defined for SSTL:

- 1. SSTL\_3, 3.3V, defined in EIA/JESD8-8 1996.
- 2. SSTL\_2, 2.5 V, defined in EIA/JESD8-9B 2002.
- 3. SSTL\_18, 1.8 V, defined in EIA/JESD8-15A.
- 4. SSTL\_15, 1.5 V.

The Digitally Controlled Impedance (DCI) variants of the I/O standards are also used. DCI adjusts the output impedance or the input termination to accurately match the characteristic impedance of the transmission line. It does so by making the impedance of the input/output equal to an external reference resistance. In this way, changes in input/output impedance due to process variations are compensated, and the impedance is also adjusted to compensate for variations in temperature and supply voltage.

#### C. Logic level power Minimization through coding

At the logic level, power can be reduced by reducing the number of transitions, i.e. the switching activity, at the I/O interface of the processor. One approach is to encode the data suitably before sending it over the I/O interface and to use a decoder at the receiving end to recover the original data. The input is applied with different coding styles, and the power is then calculated from the generated SAIF (Switching Activity Interchange Format) file, which XPower Analyzer uses to produce the power report. The coding styles used are binary coding, Gray coding, SILENT coding and bus inverse coding.

#### 1. Binary Coding

Binary Coding is a coding scheme for representing a number to the base 2. Here, each place of a number corresponds to a power of 2. It uses only the digits 1 and 0.

#### 2. Gray Coding

Gray coding produces a code word sequence in which adjacent code words differ by only 1 bit, i.e. with a Hamming distance of 1. When counting through a binary sequence, the average number of bit transitions per increment approaches 2 for large word widths n. In contrast, a Gray code sequence always has exactly 1 transition per increment.
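The difference in transition counts can be checked numerically. The sketch below uses the standard binary-reflected Gray code (`n ^ (n >> 1)`); the helper names and the 8-bit width are illustrative choices and are not taken from the benchmark circuits.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def avg_transitions(codes):
    """Average Hamming distance between consecutive code words."""
    pairs = list(zip(codes, codes[1:]))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


bits = 8  # illustrative word width
binary_seq = list(range(2 ** bits))
gray_seq = [to_gray(v) for v in binary_seq]

print("binary counter:", avg_transitions(binary_seq))
print("Gray counter:  ", avg_transitions(gray_seq))
```

For a counting sequence the binary average is just under 2 while the Gray average is exactly 1. Note that this advantage holds for sequential data such as counter outputs or addresses; for uncorrelated data the benefit is not guaranteed.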

#### 3. SILENT Coding

This is a serialized low energy transmission technique that minimizes the transmission energy on the serial wire [11].

b<sup>(t)</sup> [n-1: 0] represents n-bit data word from a sender at time t.

B<sup>(t)</sup>[n-1:0] represents n-bit encoded data word at time t. The encoder works as follows:

$$B^{(t)}[i] = b^{(t)}[i] \oplus b^{(t-1)}[i] \quad \text{for } i = 0, 1, \ldots, n-1$$

By serializing these encoded words, we can reduce the number of transitions of the serial wire and the wire looks silent.
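The encoder rule above can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the function names are hypothetical, the link is assumed to start from an all-zero previous word, bits are assumed to be sent LSB first, and the sample words are made up to represent slowly changing data.

```python
def silent_encode(words, n):
    """XOR each n-bit word with the previous one: B(t) = b(t) xor b(t-1)."""
    mask = (1 << n) - 1
    prev = 0  # assumption: the previous word is all zeros at start-up
    encoded = []
    for w in words:
        w &= mask
        encoded.append(w ^ prev)
        prev = w
    return encoded


def silent_decode(encoded, n):
    """Recover the original words: b(t) = B(t) xor b(t-1)."""
    mask = (1 << n) - 1
    prev = 0
    decoded = []
    for e in encoded:
        w = (e ^ prev) & mask
        decoded.append(w)
        prev = w
    return decoded


def serial_transitions(words, n):
    """Count level changes on a serial wire sending each word LSB first."""
    bits = [(w >> i) & 1 for w in words for i in range(n)]
    return sum(a != b for a, b in zip(bits, bits[1:]))


words = [0b10101010, 0b10101011, 0b10101010, 0b10101110]  # hypothetical data
encoded = silent_encode(words, 8)
print("raw transitions:    ", serial_transitions(words, 8))
print("encoded transitions:", serial_transitions(encoded, 8))
```

When successive words are similar, the encoded words are mostly zeros, so the serialized stream toggles far less often. For data with little correlation between successive words the encoding gives no such guarantee.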

#### 4. Bus Inverse Coding

Bus Inverse Coding is a coding scheme that requires only 1 redundant bit, i.e. m = n+1, for the transmission of information words. When the Hamming distance between Data(t) and Bus(t-1) is greater than n/2, the encoder sends the complement of Data(t) over the bus as B(t). When the Hamming distance between Data(t) and Bus(t-1) is less than or equal to n/2, Data(t) itself is sent as B(t). The redundant bit P is added to indicate whether or not B(t) is an inverted version of Data(t). With this encoding, the number of transitions is reduced by 10-20% for data buses, which helps in reducing power dissipation.
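The decision rule described above can be sketched as follows. The function names, the all-zero initial bus state and the sample words are assumptions made for illustration; the Hamming distance is computed between Data(t) and Bus(t-1) exactly as stated in the text.

```python
def bus_invert_encode(words, n):
    """Return a list of (B(t), P) pairs for a stream of n-bit data words."""
    mask = (1 << n) - 1
    bus = 0  # assumption: the bus lines start at all zeros
    out = []
    for data in words:
        data &= mask
        if bin(data ^ bus).count("1") > n / 2:
            bus, p = (~data) & mask, 1   # send the complement, flag it with P = 1
        else:
            bus, p = data, 0             # send the data unchanged
        out.append((bus, p))
    return out


def bus_invert_decode(encoded, n):
    """Undo the inversion wherever the redundant bit P is set."""
    mask = (1 << n) - 1
    return [((~b) & mask) if p else b for b, p in encoded]


words = [0x00, 0xFF, 0x0F, 0xF0]  # hypothetical 8-bit data stream
encoded = bus_invert_encode(words, 8)
print(encoded)
print(bus_invert_decode(encoded, 8) == words)
```

Because a word is inverted whenever more than half of the lines would change, at most n/2 of the n data lines toggle in any cycle, at the cost of one extra line for P.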

## 5. Power Aware Utility

A Graphical User Interface that incorporates a power aware utility algorithm has been developed using MATLAB R2012b. The algorithm works on the data obtained after synthesizing the benchmark circuits for different combinations of configurations. Around 500 simulation runs have been performed and the observed data has been stored in look up tables. The algorithm calculates the optimized power in watts for the selected digital circuit at a particular frequency for a specific FPGA, following the flowchart shown in Fig. 2. The advantage of this software is that it takes very little time to perform power optimization across all the power reduction techniques while also taking timing constraints into account. A technique is considered for optimization only if its timing constraints are met; the minimum power value is then calculated and displayed along with the percentage reduction in power and the technique that achieved the minimum.




Fig.2 Flowchart for power aware algorithm
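The selection step of the algorithm can be sketched as a filter followed by a minimum search. The utility itself was built in MATLAB; the Python sketch below is only an illustration of the logic. The function name, the data layout, the baseline used for the percentage, and all numeric values are hypothetical and do not come from the look up tables of this work.

```python
def optimize(entries, baseline_w):
    """Pick the lowest-power technique among those that meet timing.

    entries: list of (technique, power_w, timing_met) tuples for one
             FPGA / circuit / frequency combination.
    baseline_w: power without any technique applied (assumed reference
                for the percentage reduction).
    Returns (technique, power_w, reduction_percent), or None when no
    technique meets the timing constraints.
    """
    feasible = [(t, p) for t, p, timing_met in entries if timing_met]
    if not feasible:
        return None
    technique, power = min(feasible, key=lambda tp: tp[1])
    reduction = 100.0 * (baseline_w - power) / baseline_w
    return technique, power, reduction


# Hypothetical look up table rows (not measured data).
rows = [
    ("Clock Gating", 0.60, True),
    ("LVCMOS15", 0.50, True),
    ("Gray Encoding", 0.45, False),  # lowest power, but timing is not met
]
print(optimize(rows, baseline_w=1.0))
print(optimize([("Clock Gating", 0.60, False)], baseline_w=1.0))
```

Filtering on timing before taking the minimum is what keeps performance in consideration: a technique with lower power is rejected if it fails timing, and a combination where nothing meets timing yields no result.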

After clicking the Optimize button, the results are displayed as shown in Fig. 3.

| Field | Value shown in the GUI |
|---|---|
| FPGA (input) | Virtex-4 |
| Circuit (input) | Register |
| Frequency (input) | 1.2 GHz |
| Optimized power in watts (output) | 0.513 |
| Power Reduction (%) (output) | 62.5274 |
| Technique Used (output) | LVCMOS15 |

Fig. 3 Graphical User Interface

## 6. Results & Discussion

Different results were observed for the three FPGA devices with the techniques used at different frequencies. For Virtex-4, the optimized power for the ALU circuit is given by Gray encoding at all four frequencies. For the Register circuit, minimum power is obtained with the LVCMOS15 standard at three frequencies, i.e. 72 MHz, 1.2 GHz and 1.3 GHz, while at 100 MHz the optimum is given by clock gating. For the Counter and RAM circuits, the best result is given by clock gating at 72 MHz and 100 MHz. These two circuits cannot be used at 1.2 GHz and 1.3 GHz, since the timing constraints are not met for any technique and thus no optimized power has been calculated.

| Circuit  | Power (W) at F1 = 72 MHz | Power (W) at F2 = 100 MHz | Power (W) at F3 = 1.2 GHz | Power (W) at F4 = 1.3 GHz |
|----------|--------------------------|---------------------------|---------------------------|---------------------------|
| ALU      | 0.406                    | 0.407                     | 0.424                     | 0.426                     |
| Register | 0.447                    | 0.449                     | 0.513                     | 0.519                     |
| Counter  | 0.403                    | 0.404                     | -                         | -                         |
| RAM      | 0.400                    | 0.400                     | -                         | -                         |

Table 1: Virtex-4 Results for Optimized Power

For Virtex-5, all four circuits show positive timing scores at 1.2 GHz and 1.3 GHz, i.e. the timing constraints are not met and the circuits cannot be used at these frequencies. At the remaining two frequencies, 72 MHz and 100 MHz, three circuits (ALU, Register and RAM) give optimized power with the Gray encoding scheme, while the Counter circuit gives minimum power with the LVDCI\_15 standard.

| Circuit  | Power (W) at F1 = 72 MHz | Power (W) at F2 = 100 MHz | Power (W) at F3 = 1.2 GHz | Power (W) at F4 = 1.3 GHz |
|----------|--------------------------|---------------------------|---------------------------|---------------------------|
| ALU      | 0.324                    | 0.325                     | -                         | -                         |
| Register | 0.324                    | 0.324                     | -                         | -                         |
| Counter  | 0.505                    | 0.508                     | -                         | -                         |
| RAM      | 0.329                    | 0.332                     | -                         | -                         |

Table 2: Virtex-5 Results for Optimized Power

| Circuit  | Power (W) at F1 = 72 MHz | Power (W) at F2 = 100 MHz | Power (W) at F3 = 1.2 GHz | Power (W) at F4 = 1.3 GHz |
|----------|--------------------------|---------------------------|---------------------------|---------------------------|
| ALU      | 0.058                    | 0.058                     | -                         | -                         |
| Register | 0.054                    | 0.055                     | -                         | -                         |
| Counter  | 0.056                    | 0.057                     | -                         | -                         |
| RAM      | 0.054                    | 0.055                     | -                         | -                         |

Table 3: Artix-7 Results for Optimized Power

For Artix-7, all four circuits show positive timing scores at 1.2 GHz and 1.3 GHz and thus cannot be used at these frequencies. The ALU circuit gives optimized power at 72 MHz with the LVCMOS15 standard and at 100 MHz with the Gray encoding scheme. The Register circuit gives its best power results with LVCMOS15 at 72 MHz and clock gating at 100 MHz, while the Counter and RAM circuits are optimized by LVCMOS15 at both 72 MHz and 100 MHz.

## 7. Conclusion

From the results, we conclude that Artix-7 gives the lowest optimized power for all four digital circuits among the three FPGA devices at 72 MHz and 100 MHz (0.054 W to 0.058 W, compared with 0.324 W to 0.508 W for Virtex-5 and 0.400 W to 0.449 W for Virtex-4). However, it cannot be used at 1.2 GHz and 1.3 GHz, as the timing constraints are not met at these two frequencies.

## 8. Future Scope

This work leaves wide scope for refinement. Further power analysis can be done by considering other benchmark circuits. Besides the timing score, other performance parameters can also be taken into account. The software can thus be generalized by including different FPGA devices together with additional power and timing techniques.

## References

- [1] Julien Lamoureux and Wayne Luk, "An Overview of Low-Power Techniques for Field-Programmable Gate Arrays", Imperial College London In: *NASA/ESA Conference on Adaptive Hardware and Systems*, IEEE, pp. 338–345, 2008.
- [2] Jason Helge Anderson, "Power Optimization and Prediction Techniques for FPGAs," PhD thesis, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada, 2005.
- [3] O.D. Al-Khaleel, N.H. Tulie, and K.M. Mhaidat, "FPGA Implementation of Binary Coded Decimal Digit Adders and Multipliers", 8th International Symposium on Mechatronics and its Applications (ISMA), 2012, pp. 1-4.
- [4] Safeen Huda, Muntasir Mallick, Jason H. Anderson, "Clock Gating Architectures for FPGA Power Reduction", Department of Electronics and Communication Engineering, University of Toronto, Toronto, Ontario, Canada.
- [5] Xilinx, Using the Virtex select I/O resource, XAPP133 (v2.7) June 9, 2005.
- [6] Xilinx, Virtex-6 FPGA Select I/O Resources, User Guide, UG361 (v1.4) June 21, 2013.
- [7] Xilinx, Virtex-5 Family Overview, DS100 (v5.0) February 6, 2009.
- [8] Xilinx, XA Artix-7 FPGAs Overview, DS197 (v1.1) October 10, 2014.
- [9] Bishwajeet Pandey, Pandey B., Yadav J., Rajoria N., Pattanaik, M., "Clock Gating Based Energy Efficient ALU Design and Implementation on FPGA", IEEE International Conference on Energy Efficient Technologies for Sustainability (ICEETs), pp. 93-97, 2013.
- [10] Bishwajeet Pandey, Manisha Pattanaik, " Low Power VLSI Circuit Design with Efficient HDL Coding", 2013 International Conference on Communication Systems and Network Technologies.
- [11] Kangmin Lee, SeJoong Lee, and HoiJun Yoo, "SILENT Serialized Low Energy Transmission Coding for On- Chip Interconnection Networks".
- [12] Shivani Madhok, Gaurav Verma, Ankur Bhardwaj, Himanshu Verma, Ipsita Singh, Sushant Shekhar, "Capacitance Scaling With Different IO Standard Based Energy Efficient Bio-Medical Wrist Watch Design on 28nm FPGA", International Journal of Bio-Science and Bio-Technology, Vol. 7, No. 4, August 2015.
- [13] S. Aggarwal, G. Verma, R. Kumar, A. Kaur, B. Pandey, S. Singh and T. Kaur, "Green ECG Machine Design Using Different Logic Families " in Proceedings of IEEE International Conference on Communication Systems and Network Technologies (CSNT-2015), pp. 830-833, April 4-6, 2015.

- [14] T. Gupta, G. Verma, A. Kaur, B. Pandey, A. Singh and T. Kaur," Energy Efficient Counter Design Using Voltage Scaling On FPGA" in Proceedings of IEEE International Conference on Communication Systems and Network Technologies (CSNT-2015), pp. 816-819, April 4-6, 2015.
- [15] G. Verma, A. Moudgil, K. Garg and B. Pandey," Thermal and Power Aware Internet of Things Enable RAM Design on FPGA " in Proceedings of IEEE International Conference on "Computing for Sustainable Global Development", pp. 1537-1540, 11th – 13th March, 2015.
- [16] S. Mishra, G. Verma, "Low Power and Area Efficient Implementation of BCD Adder on FPGA" in Proceedings of IEEE International Conference on Signal Processing and Communication (ICSC-2013), pp. 461-465, December 12-14, 2013.
- [17] G. Verma, M. Kumar, V.Khare "Low Power Techniques for Digital System Design", Indian Journal of Science & Technology, Vol. 8, Issue 17, IPL063, August 2015.
- [18] G. Verma, S. Mishra, S. Aggarwal, S. Singh, S. Shekhar, S. K. Virdi "Power Consumption Analysis of BCD Adder using XPower Analyzer on VIRTEX FPGA", Indian Journal of Science & Technology, Vol. 8, Issue 18, IPL160, August 2015.
- [19] D. Sharma, A. Bhardwaj, H. Prasad, J. Kandpal, A. Saxena, K. S. Kant, G. Verma, "Design of Low Power and Secure Implementation of SBox and Inverse-SBox for AES", International Journal of Security and Its Applications, vol 10, No. 7, pp. 11-24, August 2016.
- [20] G. Verma, S. Maheswari, S. K. Virdi, N. Baishander, I. Singh and B. Pandey, "Low Power Squarer Design Using Ekadhikena Purvena on 28nm FPGA" International Journal of Control and Automation, vol 9, No.5, pp. 281-288, May 2016.
- [21] G. Verma, S. Shekhar, K. S. Kant, V. Verma, H. Verma and B. Pandey, "SSTL IO Standard Based Low Power Arithmetic Design Using Calana Kalanabhyam On FPGA" International Journal of Control and Automation, vol 9, No.4, pp. 271-278, April 2016.
- [22] G. Verma, V. Verma, D. Sharma, A. Kumar, H. Verma and K. Kalia, "Design Goal Based Implementation of Energy Efficient Greek Unicode Reader for Natural Language Processing" International Journal of Smart Home, vol 10, No.3, pp. 181-190, March 2016.