Tampere University of Technology

TUTCRIS Research Portal

Hardware Deceleration of Kvazaar HEVC Encoder

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Details

Original language: English
Title of host publication: Embedded Computer Systems
Subtitle of host publication: Architectures, Modeling, and Simulation - 19th International Conference, SAMOS 2019, Proceedings
Publisher: Springer
Pages: 311-324
Number of pages: 14
ISBN (Print): 9783030275617
Publication status: Published - 4 Oct 2019
Publication type: A4 Article in a conference publication
Event: International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation - Samos, Greece
Duration: 7 Jul 2019 → 11 Jul 2019

Publication series

Name: Lecture Notes in Computer Science
Volume: 11733
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation
Country: Greece
City: Samos
Period: 7/07/19 → 11/07/19

Abstract

High Efficiency Video Coding (HEVC) doubles the coding efficiency of the prior Advanced Video Coding (AVC) standard, but tackling its huge complexity calls for efficient HEVC codec implementations. Recent advances in Graphics Processing Units (GPUs) have made programmable general-purpose GPUs (GPGPUs) a popular option for accelerating various video coding tools. Massively parallel GPU architectures are particularly well suited for the hardware-oriented full search (FS) algorithm in HEVC integer motion estimation (IME). This paper analyzes the feasibility of a GPU-accelerated FS implementation in the practical Kvazaar open-source HEVC encoder. According to our evaluations, implementing FS on an AMD Radeon RX 480 GPU makes Kvazaar 12.5 times as fast as the respective anchor implemented entirely on an Intel 8-core i7 processor. However, the obtained speed gain is lost when fast IME algorithms are put into use in the anchor. For example, executing the anchor with the hexagon-based search (HEXBS) algorithm is almost two times as fast as our GPU-accelerated proposal, and the benefit of GPU offloading is reduced to a slight coding gain of 1.2%. Our results show that accelerating IME on a GPU speeds up non-practical encoders due to their enormous inherent complexity, but the price paid with practical encoders tends to be too high. Conditional processing schemes of fast IME algorithms can be efficiently executed on processors without any substantial coding loss over that of FS. Nevertheless, we still believe there might be room for exploiting GPUs in IME acceleration, but GPU-parallelized fast algorithms are needed to get value for the additional implementation cost and power budget.
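
To illustrate why full search maps well onto massively parallel hardware, the following is a minimal sketch of FS integer motion estimation using the sum of absolute differences (SAD) as the matching cost. It is not taken from Kvazaar or from the paper: the function names `sad` and `full_search`, the frame layout (lists of rows of integer samples), and the choice of SAD as the cost are assumptions made for illustration only.

```python
def sad(cur, ref, bx, by, mx, my, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by the candidate vector (mx, my) in `ref`."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + my + y][bx + mx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Hypothetical full search: evaluate every integer candidate within
    +/- search_range and return (mx, my, cost, evaluated)."""
    height, width = len(ref), len(ref[0])
    best = None
    evaluated = 0
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block falls outside the frame.
            if bx + mx < 0 or by + my < 0:
                continue
            if bx + mx + n > width or by + my + n > height:
                continue
            # Each candidate's cost is independent of every other candidate,
            # which is what makes the loop body easy to run in parallel.
            cost = sad(cur, ref, bx, by, mx, my, n)
            evaluated += 1
            if best is None or cost < best[0]:
                best = (cost, mx, my)
    return best[1], best[2], best[0], evaluated
```

With a search range of R and no frame-boundary clipping, this sketch evaluates (2R + 1)^2 candidates per block, each with no dependency on the others. Fast algorithms such as HEXBS instead choose the next candidates based on the costs already computed, so they evaluate far fewer positions but follow a conditional, sequential pattern.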
