
























# **Role of Software**

- Provides large flexibility
- Metric in SW: functionality, modularity and reusability
- SW can never improve the energy efficiency, it can just enable it
- Reality: SW often disables energy efficiency
- E.g. MPEG decoding: HW ⇔ SW ⇔ HW
  - SW implementation on DSP processor: 25W
  - Dedicated HW implementation: 0.2-0.5W
- Trade-off Flexibility / Energy






















# **Further Lessons for Energy**

- In reality, only a few discrete voltage levels are possible
- Any voltage change implies overhead (DC/DC converter, PLL)
  - Latency: on the order of thousands of cycles
  - Energy overhead
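
The overhead argument above can be turned into a rough break-even estimate: over a fixed interval, dropping to a lower voltage only pays off if the power saved outweighs the energy spent on the transitions. The sketch below is a simplified model, not part of the lecture; the function name `break_even_time_s` and all numeric values are hypothetical.

```python
def break_even_time_s(p_high_w, p_low_w, transition_energy_j):
    """Minimum time in the low-voltage state for the switch to pay off.

    The saving (p_high - p_low) * t must exceed the energy spent on the
    voltage transitions (down and back up).
    """
    if p_low_w >= p_high_w:
        raise ValueError("low-voltage state must draw less power")
    return transition_energy_j / (p_high_w - p_low_w)


# Hypothetical values for illustration only, not taken from the lecture.
P_HIGH_W = 1.0          # power at the higher voltage
P_LOW_W = 0.4           # power at the lower voltage
E_TRANSITION_J = 30e-6  # energy for switching down and back up

t_min = break_even_time_s(P_HIGH_W, P_LOW_W, E_TRANSITION_J)
print(f"switching pays off after {t_min * 1e6:.0f} us at the lower voltage")
```

The transition latency adds a second constraint: an interval shorter than the switching time itself cannot be exploited at all.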

# **Multi-Core architectures**

- Processor core energy (performance) is often not dominant
  - E.g. Intel 48-core computer
    - Maximum speed: cores @ 1GHz, NoC @ 2GHz ⇒ 125W @ 1.14V @ 50°C: 69% cores, 30% NoC and DRAM interface
    - Low-power mode: cores @ 125MHz, NoC @ 255MHz ⇒ 25W @ 0.7V @ 50°C: 21% cores, 70% NoC and DRAM interface
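
To make the shift in the power budget concrete, the percentages can be converted into absolute watts. The sketch below uses only the totals and shares quoted above; the dictionary layout and the function name `absolute_power` are illustrative choices.

```python
# Operating points as quoted on the slide: total power and the shares of
# the cores vs. the NoC plus DRAM interface.
OPERATING_POINTS = {
    "maximum speed": {"total_w": 125.0, "cores": 0.69, "noc_dram": 0.30},
    "low power": {"total_w": 25.0, "cores": 0.21, "noc_dram": 0.70},
}


def absolute_power(point):
    """Convert the percentage shares of one operating point into watts."""
    total = point["total_w"]
    return {
        "cores_w": total * point["cores"],
        "noc_dram_w": total * point["noc_dram"],
    }


for name, point in OPERATING_POINTS.items():
    watts = absolute_power(point)
    print(f"{name}: cores {watts['cores_w']:.2f} W, "
          f"NoC + DRAM interface {watts['noc_dram_w']:.2f} W")
```

Core power falls from about 86 W to about 5 W, while the NoC and DRAM interface only fall from 37.5 W to 17.5 W, so the interconnect and memory interface dominate in the low-power mode. The quoted shares do not sum to 100%; the remainder is not itemised on the slide.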

© N. Wehn





*Figure: timing diagrams with the signals CLK, CMD, ADDR, DQS, DATA and wordline.*

- WRITE timing with burst length = 4: ACT, 15ns, WR, 15ns; four data words
- READ timing with CL = 3 and burst length = 4: ACT, 15ns, RD, 15ns, PRE; RAS to CAS = 3·CLK, CAS latency = 3·CLK





















# **Observations**

### **Communication vs. Computation**

- $E_{\text{compute}} \sim 2\,\text{nJ/operation}$
- $E_{\text{send}} \sim 230\,\text{nJ/useful bit}$
- $\Rightarrow E_{\text{send}}(127 \text{ bytes}) \sim E_{\mu C}(100{,}000 \text{ cycles})$
- $E_{\text{send}}(1 \text{ bit}) \sim 100 \ldots 4000 \times E_{\text{compute}}(1 \text{ instruction})$
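
The first two figures are enough to check the order of magnitude of the third. The sketch below assumes that every bit of the 127-byte frame counts as a useful bit and that one operation takes roughly one microcontroller cycle; both are simplifying assumptions, and the function names are illustrative.

```python
E_COMPUTE_NJ = 2.0         # energy per operation, from the slide
E_SEND_NJ_PER_BIT = 230.0  # energy per useful bit sent, from the slide


def send_energy_nj(num_bytes):
    """Energy to send num_bytes, assuming every bit counts as a useful bit."""
    return num_bytes * 8 * E_SEND_NJ_PER_BIT


def equivalent_operations(energy_nj):
    """Number of operations that cost the same energy."""
    return energy_nj / E_COMPUTE_NJ


frame_nj = send_energy_nj(127)
print(f"E_send(127 bytes) = {frame_nj / 1000:.1f} uJ")
print(f"equivalent operations: {equivalent_operations(frame_nj):,.0f}")
print(f"one bit vs. one operation: {E_SEND_NJ_PER_BIT / E_COMPUTE_NJ:.0f}x")
```

Under these assumptions a 127-byte frame costs about 234 µJ, the same as roughly 117,000 operations, which agrees with the 100,000-cycle figure. The per-bit ratio of 115 sits at the lower end of the quoted 100 to 4000 range.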

### **Computation vs. Flash Storage**

- $E_{\text{flash write}}(127 \text{ bytes}) \sim E_{\mu C}(300{,}000 \text{ cycles})$

### **Communication**

- $P_{\text{receive}} \sim P_{\text{transmit}}$

- ⇒ Large energy for ACK-based protocols
  - E.g. frame length of 60 bytes
  - Energy\_RX\_ACK/total\_energy: 80% (10ms), 30% (0.5ms)
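
Why ACKs are expensive follows directly from equal receive and transmit power: energy is then proportional to the time the radio is on, so a long listening window dominates a short frame. The sketch below is a simplified model under stated assumptions. It reads the times in parentheses as the window spent listening for the ACK, which is an interpretation, and it uses a hypothetical data rate of 250 kbit/s that does not appear on the slide. It is not expected to reproduce the quoted percentages exactly.

```python
def ack_energy_fraction(frame_bytes, rx_window_s, data_rate_bps):
    """Share of per-frame radio energy spent listening for the ACK.

    Assumes equal receive and transmit power, so energy is proportional
    to the time the radio is on in each mode.
    """
    tx_time_s = frame_bytes * 8 / data_rate_bps
    return rx_window_s / (rx_window_s + tx_time_s)


DATA_RATE_BPS = 250_000  # hypothetical radio data rate, not from the slide

for window_ms in (10.0, 0.5):
    share = ack_energy_fraction(60, window_ms / 1000, DATA_RATE_BPS)
    print(f"ACK window {window_ms} ms: {share:.0%} of energy spent receiving")
```

The model shows the same trend as the slide: shortening the receive window sharply reduces the share of energy spent on the ACK.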




| Method                                                                                        | # of sent<br>Messages | Ø Frames /<br>succ. Message<br>(#ARQ) | Energy / succ.<br>Message [µJ] |
|-----------------------------------------------------------------------------------------------|-----------------------|---------------------------------------|-------------------------------|
| <b>Only ARQ</b><br>Battery fully depleted after 48 hours<br>Extrapolated to a runtime of 120h | 431,906               | 2.34                                  | 346,049                       |
| ARQ + Rep 1/3                                                                                 | 431,737               | 1.18                                  | 16,842                        |
| ARQ + Turbo-Code                                                                              | 431,728               | 1.12                                  | 11,798                        |
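
The table can be condensed into two derived quantities per method: the energy ratio relative to ARQ alone, and the frame error rate implied by the average number of frames per successful message. The sketch below copies the values from the table. The relation 1 / (1 - p) for the expected number of frames assumes independent frame errors and unlimited retransmissions, which is a modelling assumption and not a statement from the lecture.

```python
# Values copied from the table: average frames per successful message
# and energy per successful message (unit as given in the table).
RESULTS = {
    "Only ARQ": {"frames": 2.34, "energy": 346_049},
    "ARQ + Rep 1/3": {"frames": 1.18, "energy": 16_842},
    "ARQ + Turbo-Code": {"frames": 1.12, "energy": 11_798},
}


def implied_frame_error_rate(avg_frames):
    """Frame error rate p with 1 / (1 - p) = avg_frames.

    Assumes independent frame errors and unlimited retransmissions.
    """
    return 1.0 - 1.0 / avg_frames


baseline = RESULTS["Only ARQ"]["energy"]
for method, row in RESULTS.items():
    p = implied_frame_error_rate(row["frames"])
    ratio = baseline / row["energy"]
    print(f"{method}: implied frame error rate {p:.1%}, "
          f"energy ratio vs. only ARQ {ratio:.1f}x")
```

With these numbers, adding the repetition code lowers the energy per successful message by a factor of about 20.5, and the turbo code by a factor of about 29.3.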










