TUTCRIS - Tampere University of Technology

Hardware Deceleration of Kvazaar HEVC Encoder

Research output: peer-reviewed

Details

Original language: English
Title: Embedded Computer Systems
Subtitle: Architectures, Modeling, and Simulation - 19th International Conference, SAMOS 2019, Proceedings
Publisher: Springer
Pages: 311-324
Number of pages: 14
ISBN (print): 9783030275617
DOI - permanent links
Status: Published - 4 October 2019
OKM publication type: A4 Article in conference proceedings
Event: International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation - Samos, Greece
Duration: 7 July 2019 to 11 July 2019

Publication series

Name: Lecture Notes in Computer Science
Volume: 11733
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation
Country: Greece
City: Samos
Period: 7/07/19 to 11/07/19

Abstract

High Efficiency Video Coding (HEVC) doubles the coding efficiency of the
prior Advanced Video Coding (AVC) standard, but tackling its huge complexity
calls for efficient HEVC codec implementations. The recent advances in
Graphics Processing Units (GPUs) have made programmable general-purpose
GPUs (GPGPUs) a popular option for accelerating various video coding tools.
Massively parallel GPU architectures are particularly well suited for the
hardware-oriented full search (FS) algorithm in HEVC integer motion
estimation (IME). This paper analyzes the feasibility of a GPU-accelerated FS
implementation in the practical Kvazaar open-source HEVC encoder. According
to our evaluations, implementing FS on an AMD Radeon RX 480 GPU makes
Kvazaar 12.5 times as fast as the respective anchor implemented entirely on
an Intel 8-core i7 processor. However, the obtained speed gain is lost when
fast IME algorithms are put into use in the anchor. For example, executing
the anchor with the hexagon-based search (HEXBS) algorithm is almost two
times as fast as our GPU-accelerated proposal, and the benefit of GPU
offloading is reduced to a slight coding gain of 1.2%. Our results show that
accelerating IME on a GPU speeds up non-practical encoders due to their
enormous inherent complexity, but the price paid with practical encoders
tends to be too high. Conditional processing schemes of fast IME algorithms
can be efficiently executed on processors without any substantial coding loss
over that of FS. Nevertheless, we still believe there might be room for
exploiting GPUs in IME acceleration, but GPU-parallelized fast algorithms are
needed to get value for the additional implementation cost and power budget.
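
The abstract does not give implementation details. As a rough illustration only (not code from Kvazaar or from the paper), full search in integer motion estimation can be sketched as an exhaustive scan of every integer displacement in a search window, scoring each candidate with the sum of absolute differences (SAD). The function names `sad` and `full_search`, the frame layout (lists of rows), and the synthetic test frames are all assumptions made for this sketch. Because each candidate is scored independently of the others, the scan maps naturally onto parallel hardware, whereas fast algorithms such as HEXBS choose their next candidates based on earlier results.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Score every integer displacement within +/- search_range and return
    (best_cost, (dx, dy)), or None if no candidate fits inside the frame.
    Each candidate is evaluated independently of all the others."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates whose block would fall outside the reference.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


# Hypothetical 16x16 frames: `cur` shows the content of `ref` displaced so
# that the block at (6, 6) in `cur` matches `ref` at offset (dx, dy) = (2, -1).
SIZE = 16
ref = [[(7 * x + 13 * y + x * y) % 251 for x in range(SIZE)] for y in range(SIZE)]
cur = [
    [ref[y - 1][x + 2] if y >= 1 and x + 2 < SIZE else 0 for x in range(SIZE)]
    for y in range(SIZE)
]

result = full_search(cur, ref, bx=6, by=6, n=4, search_range=3)
print(result)
```

With a search range of r, this sketch evaluates up to (2r + 1)^2 candidates per block, which is the cost that fast IME algorithms avoid by testing only a few conditionally chosen positions.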

Publication Forum level