Tampere University of Technology

TUTCRIS Research Portal

Using OpenCL to Rapidly Prototype FPGA Designs

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Scientific, peer-review

Standard

Using OpenCL to Rapidly Prototype FPGA Designs. / Wang, Kui; Nurmi, Jari.

2016 IEEE Nordic Circuits and Systems Conference (NORCAS). IEEE, 2016.

Harvard

Wang, K & Nurmi, J 2016, Using OpenCL to Rapidly Prototype FPGA Designs. in 2016 IEEE Nordic Circuits and Systems Conference (NORCAS). IEEE, Nordic circuits and systems conference. https://doi.org/10.1109/NORCHIP.2016.7792907

APA

Wang, K., & Nurmi, J. (2016). Using OpenCL to Rapidly Prototype FPGA Designs. In 2016 IEEE Nordic Circuits and Systems Conference (NORCAS). IEEE. https://doi.org/10.1109/NORCHIP.2016.7792907

Vancouver

Wang K, Nurmi J. Using OpenCL to Rapidly Prototype FPGA Designs. In 2016 IEEE Nordic Circuits and Systems Conference (NORCAS). IEEE. 2016 https://doi.org/10.1109/NORCHIP.2016.7792907

Author

Wang, Kui ; Nurmi, Jari. / Using OpenCL to Rapidly Prototype FPGA Designs. 2016 IEEE Nordic Circuits and Systems Conference (NORCAS). IEEE, 2016.

BibTeX

@inproceedings{aafc58b92fdc473f8a9fd4975a28b39b,
title = "Using OpenCL to Rapidly Prototype FPGA Designs",
abstract = "Field Programmable Gate Arrays (FPGAs) have gained popularity because their reconfigurability can speed up development and verification at relatively low cost. However, the deep understanding of hardware logic programming they require has discouraged many software engineers. An interface between host devices and FPGAs that enables designing and programming FPGAs with a software programming standard while encapsulating hardware details is much desired. In this paper, we evaluate using Open Computing Language (OpenCL) to rapidly design FPGAs, considering both hardware logic utilization efficiency and computing performance. On a heterogeneous computer system consisting of ARM processors and an Altera FPGA, we execute an OpenCL host program on the ARM processors and an OpenCL kernel on the FPGA to compute a parametrizable two-dimensional Mandelbrot fractal. We explore three design aspects to improve the FPGA computation performance: adjusting the OpenCL work-group size, coalescing memory accesses, and replicating compute units. After optimizing the core algorithm, we reduced the logic utilization and Digital Signal Processing (DSP) blocks required for a single compute unit and increased the number of replicated compute units from four to six, thus delivering a 1.5X increase in the parallel computation capacity of the FPGA and improving the computing speed by 1.5X and memory bandwidth by 1.7X.",
author = "Kui Wang and Jari Nurmi",
year = "2016",
month = "11",
day = "1",
doi = "10.1109/NORCHIP.2016.7792907",
language = "English",
booktitle = "2016 IEEE Nordic Circuits and Systems Conference (NORCAS)",
publisher = "IEEE",

}

RIS (suitable for import to EndNote)

TY - GEN

T1 - Using OpenCL to Rapidly Prototype FPGA Designs

AU - Wang, Kui

AU - Nurmi, Jari

PY - 2016/11/1

Y1 - 2016/11/1

N2 - Field Programmable Gate Arrays (FPGAs) have gained popularity because their reconfigurability can speed up development and verification at relatively low cost. However, the deep understanding of hardware logic programming they require has discouraged many software engineers. An interface between host devices and FPGAs that enables designing and programming FPGAs with a software programming standard while encapsulating hardware details is much desired. In this paper, we evaluate using Open Computing Language (OpenCL) to rapidly design FPGAs, considering both hardware logic utilization efficiency and computing performance. On a heterogeneous computer system consisting of ARM processors and an Altera FPGA, we execute an OpenCL host program on the ARM processors and an OpenCL kernel on the FPGA to compute a parametrizable two-dimensional Mandelbrot fractal. We explore three design aspects to improve the FPGA computation performance: adjusting the OpenCL work-group size, coalescing memory accesses, and replicating compute units. After optimizing the core algorithm, we reduced the logic utilization and Digital Signal Processing (DSP) blocks required for a single compute unit and increased the number of replicated compute units from four to six, thus delivering a 1.5X increase in the parallel computation capacity of the FPGA and improving the computing speed by 1.5X and memory bandwidth by 1.7X.

U2 - 10.1109/NORCHIP.2016.7792907

DO - 10.1109/NORCHIP.2016.7792907

M3 - Conference contribution

BT - 2016 IEEE Nordic Circuits and Systems Conference (NORCAS)

PB - IEEE

ER -