Tampere University of Technology

TUTCRIS Research Portal

Towards Algebraic Modeling of GPU Memory Access for Bank Conflict Mitigation

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Scientific, peer-review

Details

Original language: English
Title of host publication: 2019 IEEE International Workshop on Signal Processing Systems, SiPS 2019
Publisher: IEEE
Pages: 103-108
Number of pages: 6
ISBN (Electronic): 9781728119274
Publication status: Published - 1 Oct 2019
Publication type: A4 Article in a conference publication
Event: IEEE International Workshop on Signal Processing Systems - Nanjing, China
Duration: 20 Oct 2019 to 23 Oct 2019

Conference

Conference: IEEE International Workshop on Signal Processing Systems
Abbreviated title: SiPS
Country: China
City: Nanjing
Period: 20/10/19 to 23/10/19

Abstract

Graphics Processing Units (GPUs) are widely used in many fields of scientific computing, including signal processing. GPUs have a hierarchical memory structure whose layers are shared between GPU processing elements. Partly because of this complex memory hierarchy, GPU programming is non-trivial, and several aspects must be taken into account, one of them being memory access patterns. One of the fastest GPU memory layers, shared memory, is grouped into banks to enable fast, parallel access for processing elements. Unfortunately, multiple threads of a GPU program may access the same shared memory bank simultaneously, causing a bank conflict. When this happens, program execution slows down because the memory accesses have to be rescheduled to determine which one is executed first. Bank conflicts are not handled automatically by the compiler, so the programmer must detect and resolve them before program execution. In this paper, we present an algebraic approach to detecting bank conflicts and prove theoretical results that can be used to predict when bank conflicts occur and how to avoid them. Our experimental results illustrate the resulting savings in computation time.
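
To make the notion of a bank conflict concrete, the following is a minimal sketch and not the algebraic model from the paper. It assumes a shared memory with 32 banks of 4-byte words and 32 threads accessing memory together; these values are typical of current GPUs but are not stated in the abstract. The helper names `conflict_degree` and `strided_addresses` are hypothetical. The sketch counts how many distinct words fall into the same bank when thread i reads word i * stride.

```python
from collections import defaultdict
from math import gcd

NUM_BANKS = 32   # assumption: number of shared memory banks
WORD_BYTES = 4   # assumption: width of one bank entry in bytes


def conflict_degree(byte_addresses):
    """Return the largest number of distinct words that map to one bank.

    A result of 1 means conflict-free access. Threads that read the very
    same word are counted once, since such reads can usually be broadcast.
    """
    words_per_bank = defaultdict(set)
    for addr in byte_addresses:
        word = addr // WORD_BYTES
        words_per_bank[word % NUM_BANKS].add(word)
    return max(len(words) for words in words_per_bank.values())


def strided_addresses(stride_words, num_threads=32):
    """Byte addresses touched when thread i reads word i * stride_words."""
    return [i * stride_words * WORD_BYTES for i in range(num_threads)]


if __name__ == "__main__":
    for stride in (1, 2, 3, 32, 33):
        degree = conflict_degree(strided_addresses(stride))
        print(f"stride {stride:2d}: {degree}-way access, "
              f"gcd(stride, banks) = {gcd(stride, NUM_BANKS)}")
```

Under these assumptions the conflict degree of a strided access equals gcd(stride, 32): a stride of 32 words sends every thread to the same bank, while a stride of 33 is conflict-free. This is why padding a 32-word row by one extra word is a commonly used remedy.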

Keywords

  • block matching, Graphics processing units, memory hierarchy, OpenCL
