Improving Energy Efficiency of Application-Specific Instruction-Set Processors
|Publisher||Tampere University of Technology|
|Status||Published - 24 November 2017|
|Name||Tampere University of Technology. Publication|
This Thesis aimed to explore the energy efficiency overheads of Application-Specific Instruction-Set Processors (ASIPs), a class of embedded processors aiming to compete with ASICs. While an ASIC can be designed to provide precisely the performance and energy efficiency required by a specific application without unnecessary overheads, the cost of design and verification, as well as the inability to upgrade or modify the design, favour more flexible programmable solutions. ASIP designs can match the computing performance of an ASIC for specific applications. What is left, therefore, is achieving energy efficiency of a similar order of magnitude.
In the past, one area of ASIP design that has been identified as a major consumer of energy is the storage of temporal values produced during computation: the Register File (RF), together with the associated interconnection network that transports those values between registers and computational Function Units (FUs). In this Thesis, the energy efficiency of the RF and interconnection network is studied using the Transport Triggered Architecture (TTA) template. Specifically, compiler optimisations aimed at reducing the traffic of temporal values between the RF and FUs are presented. Bypassing a temporal value from the output of the FU that produces it directly to the input ports of the FUs that require it to continue the computation saves multiple RF reads. In addition, if all uses of such a temporal value can be bypassed, the RF write can be eliminated as well. Such optimisations allow the RF to be simplified, via a reduction in the number of registers or in the number of RF read and write ports, and improve energy efficiency. In cases where the limited number of simultaneous RF reads or writes causes a performance bottleneck, such optimisations also shorten execution times, allowing execution at lower clock frequencies and yielding additional energy savings.
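The effect of bypassing on RF traffic can be illustrated with a small counting model. The sketch below is not the compiler optimisation developed in the Thesis; it is a simplified illustration that assumes a straight-line block in which every value has a unique name and every operand produced earlier in the block can be forwarded FU-to-FU. The names `Op` and `rf_traffic` are hypothetical. On a real TTA, a value is only available for bypassing while it remains in the producing FU's output register, so the savings here are an upper bound.

```python
from collections import namedtuple

# Hypothetical operation record: one result name and a tuple of operand names.
Op = namedtuple("Op", ["dest", "srcs"])


def rf_traffic(ops, live_out=(), bypass=False):
    """Count (RF reads, RF writes) for a straight-line block of operations.

    Assumptions of this sketch: each value is defined once, and with
    bypass=True every operand produced earlier in the block is forwarded
    directly from the producing FU instead of being read from the RF.
    """
    live_out = set(live_out)
    produced = set()
    reads = 0
    writes = 0
    for op in ops:
        for src in op.srcs:
            if bypass and src in produced:
                continue  # operand forwarded FU-to-FU, no RF read
            reads += 1
        produced.add(op.dest)
        # If every use is bypassed and the value is not needed after the
        # block, the RF write can be dropped as well.
        if bypass and op.dest not in live_out:
            continue
        writes += 1
    return reads, writes


# t1 = a + b; t2 = t1 * c; r = t1 - t2, with only r needed afterwards.
block = [
    Op("t1", ("a", "b")),
    Op("t2", ("t1", "c")),
    Op("r", ("t1", "t2")),
]
baseline = rf_traffic(block, live_out={"r"})
bypassed = rf_traffic(block, live_out={"r"}, bypass=True)
print(baseline, bypassed)  # (6, 3) (3, 1)
```

In this toy block, bypassing halves the RF reads and removes two of the three RF writes, which is the kind of reduction that could permit fewer registers or fewer RF ports.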
Another area of ASIP design consuming a significant amount of energy is the instruction memory subsystem, an artefact of the programmability of the embedded processor. As this subsystem is not present in an ASIC, the energy consumed for storing an application program and reading it from the instruction memories to control processor execution is an overhead that needs to be minimised. In this Thesis, one particular tool for improving the energy efficiency of the instruction memory subsystem, the instruction buffer, is examined. While not trivially obvious, the presence of buffers for storing loop bodies, or parts of them, results in a reduced number of reads from the instruction memories. As a result, the memories can be put into a lower power state, leading to lower overall energy consumption, provided that the buffer itself is implemented energy-efficiently. Specifically, an energy-efficient implementation of the instruction buffer is presented in this Thesis, together with analysis tools to identify candidate loops and assess their suitability for storing in the instruction buffer.
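To make the reduction in instruction memory reads concrete, the following sketch counts fetches for a single loop with and without a buffer. It is an illustrative model under stated assumptions, not the buffer design or analysis tooling from the Thesis: the function name `imem_reads` and its parameters are hypothetical, the loop has no internal branches, the buffer is filled during the first iteration, and when the body is larger than the buffer only the first `buffer_capacity` instructions are kept.

```python
def imem_reads(body_len, iterations, buffer_capacity=0):
    """Count instruction memory reads for one loop (simplified model).

    Assumptions of this sketch: the first iteration is fetched entirely
    from instruction memory while the buffer fills; in later iterations
    only the instructions that did not fit in the buffer are fetched.
    """
    if iterations <= 0 or body_len <= 0:
        return 0
    buffered = min(body_len, max(buffer_capacity, 0))
    unbuffered = body_len - buffered
    return body_len + (iterations - 1) * unbuffered


# An 8-instruction loop body executed 100 times.
print(imem_reads(8, 100))        # 800: no buffer, every fetch hits memory
print(imem_reads(8, 100, 16))    # 8: whole body fits in a 16-entry buffer
print(imem_reads(24, 100, 16))   # 816: only part of a 24-instruction body fits
```

A count like this only captures memory reads; whether energy is actually saved also depends on the cost of reading the buffer itself, which is why the buffer implementation matters.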
The studies presented in this Thesis show that the energy overheads associated with the use of embedded processors, in comparison to ad-hoc ASIC solutions, are manageable when carefully considered during the design of an embedded system for a particular application, or application domain. Finally, the methods presented in this Thesis do not restrict the reprogrammability of the embedded system.