# Energy-Efficient and Runtime-Based Approximate Computing Techniques for Image Processing Applications: An Integrated Approach Covering Circuit to Algorithmic Level

Huang Junqi, supervised by Dr. T. Nandha Kumar and Prof. Haider Abbas Almurib



## Introduction

Approximate computing is widely used in error-resilient designs to improve energy efficiency: circuit complexity is reduced, and the circuit is allowed to produce results with an acceptable error (approximation).

Existing approximate computing techniques have been implemented at the algorithmic level, the logic level, or the circuit level, with no provision for changing the approximation on the fly, that is, at runtime.

Unlike existing methods, the work in this poster presents a novel energy-efficient, area-efficient, latency-efficient, and technology-independent integrated approach that implements approximate computing from the circuit level to the algorithmic level and allows the approximation of a given circuit to be changed at runtime without any extra hardware.

When FUS and VOS are applied to ZLCADCT, the approximate full adder sustains a significantly higher frequency (around 19.2 GHz) and a lower supply voltage (around 0.77 V) than the exact full adder (around 15.4 GHz and 0.83 V) without a significant decrease in PSNR.

An increase in NAB (the number of FUS/VOS approximate adder cells applied to a ripple carry adder in the DCT) yields a significant increase in the number of completed DCT operations (NCDO) under FUS and a significant decrease in energy dissipation under VOS. ZLCADCT also significantly raises NCDO under FUS and reduces energy dissipation under VOS when compared with ADCT.





The two new techniques presented are frequency upscaling (FUS) and voltage overscaling (VOS). The approximation (error) of a given circuit is changed at runtime by controlling its operating frequency and supply voltage, with no need to modify the circuit or add circuitry.

These two techniques, developed at the logic/circuit level of abstraction, are integrated into a newly proposed algorithmic-level approximate computing technique known as zigzag low-complexity approximate DCT (ZLCADCT).

### Method

FUS: the frequency of the input values applied to an exact and an approximate (AMA1) full adder cell is increased (upscaled) beyond the cell's maximum operating frequency, which generates errors in the addition while increasing computational throughput.
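The effect of FUS on a ripple carry adder can be illustrated with a purely behavioral sketch. It is not the circuit-level simulation used in this work: the parameter `max_carry_stages` is a hypothetical stand-in for the clock period, and the model simply assumes that a carry which would need to ripple through more stages than the clock period allows is lost.

```python
def ripple_carry_add(a, b, width=8, max_carry_stages=None):
    """Add two unsigned integers bit by bit, as a ripple carry adder would.

    max_carry_stages is a hypothetical knob (not from the poster): a carry
    that would have to travel through more than this many consecutive
    stages is dropped, mimicking a clock that is too fast for the chain.
    """
    if max_carry_stages is None:
        max_carry_stages = width          # exact operation: no carry is lost
    carry, run, result = 0, 0, 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        result |= (ai ^ bi ^ carry) << i  # sum bit of this stage
        if ai & bi:                       # stage generates a fresh carry
            carry, run = 1, 1
        elif (ai ^ bi) and carry:         # stage propagates the incoming carry
            run += 1
            if run > max_carry_stages:    # carry did not arrive in time
                carry, run = 0, 0
        else:                             # carry is absorbed or absent
            carry, run = 0, 0
    return result | (carry << width)


exact = ripple_carry_add(0b01111111, 1)                       # full carry chain
fast = ripple_carry_add(0b01111111, 1, max_carry_stages=3)    # chain cut short
print(exact, fast)
```

Operands with short carry chains are unaffected in this model, which is why the error depends on the input values as well as on the frequency.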



VOS: the supply voltage of the exact and approximate adder cells is scaled below the nominal voltage so that the output delay exceeds the worst-case delay, which generates errors in the addition results while reducing energy dissipation.
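The trade-off behind VOS can be sketched with two textbook first-order models that are not taken from this work: dynamic switching energy proportional to the square of the supply voltage, and gate delay following the alpha-power law. The threshold voltage `v_th`, the exponent `alpha`, and the nominal voltage of 1.0 V below are illustrative assumptions, not measured values.

```python
def relative_dynamic_energy(v, v_nominal):
    """Switching energy at supply v relative to nominal (E proportional to C * V^2)."""
    return (v / v_nominal) ** 2


def relative_delay(v, v_nominal, v_th=0.3, alpha=1.3):
    """Gate delay at supply v relative to nominal, using the alpha-power law.

    v_th and alpha are assumed illustrative device parameters.
    """
    if v <= v_th or v_nominal <= v_th:
        raise ValueError("supply voltage must exceed the threshold voltage")

    def delay(x):
        return x / (x - v_th) ** alpha

    return delay(v) / delay(v_nominal)


# Lowering the supply saves energy quadratically but lengthens the delay;
# once the delay exceeds the clock period, addition errors appear.
for v in (1.0, 0.9, 0.8):
    print(v, relative_dynamic_energy(v, 1.0), relative_delay(v, 1.0))
```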

Figure 2: PSNR and output images for ZLCADCT using VOS (*vo1>vo2>vo3*)

Table 1: NCDO for 1 DCT using FUS

| DCT     | Adder cell       | NAB | Frequency f1 | Frequency f2 | Frequency f3 |
|---------|------------------|-----|--------------|--------------|--------------|
| ADCT    | Exact full adder | 2   | 2.48         | 2.50         | 2.89         |
| ADCT    | Exact full adder | 4   | 2.48         | 2.63         | 3.75         |
| ADCT    | Exact full adder | 6   | 2.48         | 3.01         | 8.33         |
| ADCT    | AMA1             | 2   | 2.77         | 2.86         | 3.02         |
| ADCT    | AMA1             | 4   | 3.10         | 3.42         | 4.15         |
| ADCT    | AMA1             | 6   | 3.36         | 4.36         | 9.97         |
| ZLCADCT | Exact full adder | 2   | 3.44         | 3.58         | 3.94         |
| ZLCADCT | Exact full adder | 4   | 3.44         | 3.88         | 5.05         |
| ZLCADCT | Exact full adder | 6   | 3.44         | 4.52         | 10.88        |
| ZLCADCT | AMA1             | 2   | 3.75         | 3.86         | 4.10         |
| ZLCADCT | AMA1             | 4   | 4.14         | 4.55         | 5.54         |
| ZLCADCT | AMA1             | 6   | 4.50         | 5.66         | 12.91        |

Table 2: Energy dissipation for DCT using VOS (total energy dissipation in nJ)

| DCT     | Adder cell       | NAB | Supply voltage *vo1* | Supply voltage *vo2* | Supply voltage *vo3* |
|---------|------------------|-----|----------------------|----------------------|----------------------|
| ADCT    | Exact full adder | 2   | 1.24                 | 1.15                 | 1.11                 |
| ADCT    | Exact full adder | 4   | 1.26                 | 1.09                 | 0.95                 |
| ADCT    | Exact full adder | 6   | 1.31                 | 1.01                 | 0.82                 |
| ADCT    | AMA1             | 2   | 1.07                 | 1.01                 | 1.01                 |
| ADCT    | AMA1             | 4   | 0.87                 | 0.81                 | 0.77                 |
| ADCT    | AMA1             | 6   | 0.69                 | 0.65                 | 0.56                 |
| ZLCADCT | Exact full adder | 2   | 0.88                 | 0.81                 | 0.78                 |
| ZLCADCT | Exact full adder | 4   | 0.88                 | 0.75                 | 0.67                 |
| ZLCADCT | Exact full adder | 6   | 0.89                 | 0.69                 | 0.58                 |
| ZLCADCT | AMA1             | 2   | 0.79                 | 0.75                 | 0.74                 |
| ZLCADCT | AMA1             | 4   | 0.67                 | 0.61                 | 0.58                 |
| ZLCADCT | AMA1             | 6   | 0.54                 | 0.50                 | 0.43                 |
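As a reading aid for Tables 1 and 2, the short calculation below compares the two extreme configurations at NAB=6: ADCT with exact full adders against ZLCADCT with AMA1 cells. The four numbers are copied from the tables (NCDO at f3, energy in nJ at *vo3*); the variable names are only for illustration.

```python
# Values copied from Table 1 (NCDO at f3) and Table 2 (energy in nJ at vo3), NAB=6.
ncdo_adct_exact = 8.33
ncdo_zlcadct_ama1 = 12.91
energy_adct_exact = 0.82
energy_zlcadct_ama1 = 0.43

# Ratio of completed DCT operations under FUS.
ncdo_ratio = ncdo_zlcadct_ama1 / ncdo_adct_exact

# Fractional energy saving under VOS.
energy_saving = 1 - energy_zlcadct_ama1 / energy_adct_exact

print(f"NCDO ratio: {ncdo_ratio:.2f}x, energy saving: {energy_saving:.1%}")
```

For this pair of configurations the ratio is about 1.55 and the saving is just under half of the energy.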

### Conclusion

FUS and VOS change the approximation of an adder cell at runtime without extra hardware, while increasing the processing speed (FUS) and decreasing the energy dissipation (VOS). ZLCADCT also reduces the number of additions required by ADCT. Both FUS and VOS can be integrated into ZLCADCT to realize optimized approximation.

ZLCADCT: a deterministic technique that configures the size of the transform matrix (T) exactly according to the number of coefficients retained in the zigzag scanning process. This is achieved by establishing the relationship between the number of retained coefficients and the number of rows of the T matrix.

FUS and VOS at the circuit level are applied to ZLCADCT at the algorithmic level to realize optimized approximation as a complete system.
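One plausible reading of this relationship can be sketched as follows; it is an assumption for illustration, not the mapping published by the authors. For a 2-D transform of the form Y = T X Tᵀ, the coefficient at position (u, v) depends only on rows u and v of T, so retaining the first k coefficients in zigzag order requires rows 0 up to the largest index that appears among them. The function names and the 8x8 block size are hypothetical choices.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col labels each anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diag = [(r, s - r) for r in rows]
        # JPEG-style scan: even diagonals run from bottom-left to top-right.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def rows_of_t_needed(k, n=8):
    """Rows of T needed to compute the first k zigzag coefficients (assumed model)."""
    kept = zigzag_order(n)[:k]
    return 1 + max(max(r, c) for r, c in kept)


for k in (1, 3, 6, 10, 64):
    print(k, rows_of_t_needed(k))
```

Under this assumption the number of rows grows roughly with the square root of the number of retained coefficients, so a small retained set needs only a few rows of T and correspondingly fewer additions.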

### Results

With FUS, the frequency of the approximate adder cell can be increased by about 1.4 times (from 11.49 GHz to 16.6 GHz) for the minimum approximation (2 errors); the maximum approximation (7 errors) is reached by increasing the operating frequency by about 2.5 times (to 29 GHz).

Applying FUS to an exact adder cell as well shows that the approximate adder cell sustains a higher frequency (around 1.3 times) for the same approximation and dissipates 50% less energy than the exact adder cell.

When VOS is applied to both exact and approximate adder cells, the approximate adder cell dissipates 30% less energy than the exact adder cell at the maximum approximation.

With VOS, the approximation of the approximate adder cell can be varied from its minimum to its maximum at runtime, saving between 31.1% and 87% of the energy dissipation when compared with the exact adder cell.

In ZLCADCT, the approximate full adder sustains a higher input frequency under FUS and a lower supply voltage under VOS than the exact full adder, without a significant decrease in PSNR.
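Image quality throughout these results is reported as PSNR. For reference, a minimal sketch of the standard definition is given below, assuming 8-bit images flattened into equal-length lists of pixel values; the function name and the list-based interface are illustrative choices, not part of this work.

```python
import math


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf                      # identical images
    return 10 * math.log10(peak ** 2 / mse)


original = [52, 55, 61, 66, 70, 61, 64, 73]
approximate = [52, 54, 61, 68, 70, 60, 64, 73]   # a few pixels slightly off
print(psnr(original, approximate))
```

A higher PSNR means the approximate output is closer to the exact one, so "no significant decrease in PSNR" means the added errors stay small relative to the pixel range.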

### Related publications

1. H. Junqi, T. N. Kumar, H. Abbas, and F. Lombardi, "A Deterministic Low-Complexity Approximate (Multiplier-Less) Technique for DCT Computation," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 8, pp. 3001-3014, 2019.
2. H. Junqi, T. N. Kumar, H. Abbas, and F. Lombardi, "Simulation-based evaluation of frequency upscaled operation of exact/approximate ripple carry adders," in 2017 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), Cambridge, UK, 2017, pp. 1-6.
3. H. Junqi, T. N. Kumar, H. Abbas, and F. Lombardi, "Approximate Computing using Frequency Upscaling," IET Circuits, Devices & Systems, vol. 13, no. 7, pp. 1018-1026, 2019.
4. H. Junqi, T. N. Kumar, and H. Abbas, "Simulation-Based Evaluation of Approximate Adders for Image Processing Using Voltage Overscaling Method," in 2020 IEEE 5th International Conference on Signal and Image Processing (ICSIP), Nanjing, China, 2020, pp. 1-1.