TUTCRIS - Tampereen teknillinen yliopisto


Binarized Convolutional Neural Networks for Efficient Inference on GPUs

Research output: peer-reviewed

Details

Original language: English
Title of host publication: 2018 26th European Signal Processing Conference (EUSIPCO)
Publisher: IEEE
Pages: 682-686
ISBN (electronic): 978-9-0827-9701-5
ISBN (print): 978-1-5386-3736-4
DOI - permanent links
Status: Published - September 2018
Ministry of Education (OKM) publication type: A4 Article in a conference publication
Event: EUROPEAN SIGNAL PROCESSING CONFERENCE
Duration: 1 January 1900 → …

Publication series

Name
ISSN (electronic): 2076-1465


Abstract

Convolutional neural networks have recently achieved significant breakthroughs in various image classification tasks. However, they are computationally expensive, which can make their feasible implementation on embedded and low-power devices difficult. In this paper, convolutional neural network binarization is implemented on GPU-based platforms for real-time inference on resource-constrained devices. In binarized networks, all weights and intermediate computations between layers are quantized to +1 and -1, allowing multiplications and additions to be replaced with bit-wise operations between 32-bit words. This representation completely eliminates the need for floating-point multiplications and additions and decreases both the computational load and the memory footprint compared to a full-precision network implemented in floating point, making it well suited for resource-constrained environments. We compare the performance of our implementation with an equivalent floating-point implementation on one desktop and two embedded GPU platforms. Our implementation achieves a maximum speedup of 7.4× with only a 4.4% loss in accuracy compared to a reference implementation.
