Tampere University of Technology

TUTCRIS Research Portal

Reducing the overheads of hardware acceleration through datapath integration

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


Original language: English
Title of host publication: Multimedia on Mobile Devices 2008, 28-29 January, 2008, San Jose, California, USA. Proceedings of SPIE-IS&T Electronic Imaging
Editors: R. Greutzbur, J. Takala
Pages: pp. 68210R-1-10
Number of pages: 10
Publication status: Published - 2008
Publication type: A4 Article in a conference publication
Duration: 1 Jan 1900 → …


Period: 1/01/00 → …


Hardware accelerators are used to speed up the execution of specific tasks such as video coding. Often the purpose of hardware acceleration is to allow a cheaper or, for example, more energy-efficient processor to execute the majority of the application in software. However, hardware acceleration introduces new overheads, mainly from transferring data to and from the accelerator and from signaling the completion of the accelerator computation to the processor. We find the traditional mechanisms suboptimal for fine-grain hardware acceleration, especially when energy efficiency is important. This paper explores a technique unique to Transport Triggered Architectures for interfacing with hardware accelerators. The proposed technique places hardware accelerators in the processor data path, making them visible to the programmer as regular function units. This reduces communication costs, because data can be transferred directly to the accelerator from other processor data path components, and synchronization can be done by polling a simple ready flag in the accelerator function unit. Additionally, this setup enables the compiler's instruction scheduler to schedule the hardware accelerator like any other operation, thus partially hiding its latency behind other program operations. The paper presents a case study with an audio decoder application in which fine-grain and coarse-grain hardware accelerators are integrated into the processor data path as function units. The case study is used to examine several synchronization, communication, and latency-hiding techniques enabled by this kind of setup.
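
The ready-flag synchronization and latency hiding described in the abstract can be illustrated with a toy cycle-count model. This is only a sketch under stated assumptions, not the paper's implementation: the class `AcceleratorFU`, the functions `run_blocking` and `run_overlapped`, the 8-cycle latency, the one-cycle cost per independent operation, and the doubling "computation" are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class AcceleratorFU:
    """Toy accelerator modeled as a function unit (hypothetical, not from the paper)."""
    latency: int          # cycles from trigger to result (assumed value)
    _remaining: int = 0   # cycles left until the result is ready
    _result: int = 0

    def trigger(self, operand: int) -> None:
        # Moving an operand to the trigger port starts the computation.
        self._remaining = self.latency
        self._result = operand * 2  # stand-in for the real accelerated task

    def tick(self) -> None:
        # Advance the accelerator by one processor cycle.
        if self._remaining > 0:
            self._remaining -= 1

    @property
    def ready(self) -> bool:
        # The simple ready flag that the program polls.
        return self._remaining == 0

    def read(self) -> int:
        assert self.ready, "result read before the accelerator finished"
        return self._result


def run_blocking(fu: AcceleratorFU, operand: int, independent_ops: int):
    """Trigger, busy-poll the ready flag, and only then run the other operations."""
    cycles = 0
    fu.trigger(operand)
    while not fu.ready:
        fu.tick()
        cycles += 1
    cycles += independent_ops  # assumption: one cycle per independent operation
    return fu.read(), cycles


def run_overlapped(fu: AcceleratorFU, operand: int, independent_ops: int):
    """Trigger, run independent operations while the accelerator works, then poll."""
    cycles = 0
    fu.trigger(operand)
    for _ in range(independent_ops):  # work scheduled into the latency window
        fu.tick()
        cycles += 1
    while not fu.ready:               # poll only for whatever latency remains
        fu.tick()
        cycles += 1
    return fu.read(), cycles


blocking = run_blocking(AcceleratorFU(latency=8), 21, independent_ops=5)
overlapped = run_overlapped(AcceleratorFU(latency=8), 21, independent_ops=5)
print(blocking, overlapped)
```

In this model the overlapped schedule costs roughly the larger of the accelerator latency and the independent work, while the blocking schedule costs their sum. The model ignores operand transport cost and scheduling constraints, so it shows only the direction of the effect, not its real magnitude.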
