Tampere University of Technology

TUTCRIS Research Portal

Evaluation of a Heterogeneous Multicore Architecture by Design and Test of an OFDM Receiver

Research output: Contribution to journal › Article › Scientific › peer-review

Standard

Evaluation of a Heterogeneous Multicore Architecture by Design and Test of an OFDM Receiver. / Nouri, Sajjad; Hussain, Waqar; Nurmi, Jari.

In: IEEE Transactions on Parallel and Distributed Systems, 01.11.2017, p. 3171.

Harvard

Nouri, S, Hussain, W & Nurmi, J 2017, 'Evaluation of a Heterogeneous Multicore Architecture by Design and Test of an OFDM Receiver', IEEE Transactions on Parallel and Distributed Systems, p. 3171. https://doi.org/10.1109/TPDS.2017.2706691

Vancouver

Nouri S, Hussain W, Nurmi J. Evaluation of a Heterogeneous Multicore Architecture by Design and Test of an OFDM Receiver. IEEE Transactions on Parallel and Distributed Systems. 2017 Nov 1;3171. https://doi.org/10.1109/TPDS.2017.2706691

Author

Nouri, Sajjad ; Hussain, Waqar ; Nurmi, Jari. / Evaluation of a Heterogeneous Multicore Architecture by Design and Test of an OFDM Receiver. In: IEEE Transactions on Parallel and Distributed Systems. 2017 ; p. 3171.

BibTeX

@article{59d49ba6e81440e6ad8c528aa3ae404a,
title = "Evaluation of a Heterogeneous Multicore Architecture by Design and Test of an OFDM Receiver",
abstract = "This paper presents an evaluation of a Heterogeneous Multicore Architecture (HMA) by implementing Orthogonal Frequency-Division Multiplexing (OFDM) receiver blocks as designs for the test of functionality. An OFDM receiver consists of computationally intensive and general-purpose processing tasks that can provide maximum coverage to test and evaluate a massively parallel as well as general-purpose platform like the HMA. The blocks of the receiver are primarily designed by crafting template-based Coarse-Grained Reconfigurable Array (CGRA) devices and then arranging them in a sequence over a Network-on-Chip (NoC) structure along with a few RISC cores for complete OFDM processing. OFDM blocks such as Fast Fourier Transform (FFT) and Time Synchronization are computationally intensive and require parallel processing. The OFDM receiver also contains tasks such as frequency offset estimation, which require the processing of Taylor series and CORDIC algorithms that are serial in nature. Such a combination of serial and parallel algorithms enables a thorough exploration and evaluation of almost all the design features of an HMA. The OFDM implementation has led us to scale CGRAs to different dimensions, to instantiate Processing Elements (PEs) as multiple arithmetic resources, and to establish almost all possible ways of PE interconnections. It further explores time-multiplexed patterns for data placement in the CGRA memories. Nevertheless, the data can also be exchanged among different nodes over the NoC structure simultaneously and independently by using direct memory access devices. In this experimental work, the performance of each CGRA, the collective performance of the whole platform, and the NoC traffic are recorded in terms of the number of clock cycles and several high-level performance metrics. Today’s HMAs are generally over- or under-resourced for the applications that they are designed for and are thus not an optimal choice for the end user. Apart from the interesting comparisons with the state of the art, our experimental setup has provided important insight and guidelines that designers can use to implement near-optimal solutions for their target applications.",
author = "Sajjad Nouri and Waqar Hussain and Jari Nurmi",
year = "2017",
month = "11",
day = "1",
doi = "10.1109/TPDS.2017.2706691",
language = "English",
pages = "3171",
journal = "IEEE Transactions on Parallel and Distributed Systems",
issn = "1045-9219",
publisher = "Institute of Electrical and Electronics Engineers",

}

RIS (suitable for import to EndNote)

TY - JOUR

T1 - Evaluation of a Heterogeneous Multicore Architecture by Design and Test of an OFDM Receiver

AU - Nouri, Sajjad

AU - Hussain, Waqar

AU - Nurmi, Jari

PY - 2017/11/1

Y1 - 2017/11/1

N2 - This paper presents an evaluation of a Heterogeneous Multicore Architecture (HMA) by implementing Orthogonal Frequency-Division Multiplexing (OFDM) receiver blocks as designs for the test of functionality. An OFDM receiver consists of computationally intensive and general-purpose processing tasks that can provide maximum coverage to test and evaluate a massively parallel as well as general-purpose platform like the HMA. The blocks of the receiver are primarily designed by crafting template-based Coarse-Grained Reconfigurable Array (CGRA) devices and then arranging them in a sequence over a Network-on-Chip (NoC) structure along with a few RISC cores for complete OFDM processing. OFDM blocks such as Fast Fourier Transform (FFT) and Time Synchronization are computationally intensive and require parallel processing. The OFDM receiver also contains tasks such as frequency offset estimation, which require the processing of Taylor series and CORDIC algorithms that are serial in nature. Such a combination of serial and parallel algorithms enables a thorough exploration and evaluation of almost all the design features of an HMA. The OFDM implementation has led us to scale CGRAs to different dimensions, to instantiate Processing Elements (PEs) as multiple arithmetic resources, and to establish almost all possible ways of PE interconnections. It further explores time-multiplexed patterns for data placement in the CGRA memories. Nevertheless, the data can also be exchanged among different nodes over the NoC structure simultaneously and independently by using direct memory access devices. In this experimental work, the performance of each CGRA, the collective performance of the whole platform, and the NoC traffic are recorded in terms of the number of clock cycles and several high-level performance metrics. Today’s HMAs are generally over- or under-resourced for the applications that they are designed for and are thus not an optimal choice for the end user. Apart from the interesting comparisons with the state of the art, our experimental setup has provided important insight and guidelines that designers can use to implement near-optimal solutions for their target applications.

AB - This paper presents an evaluation of a Heterogeneous Multicore Architecture (HMA) by implementing Orthogonal Frequency-Division Multiplexing (OFDM) receiver blocks as designs for the test of functionality. An OFDM receiver consists of computationally intensive and general-purpose processing tasks that can provide maximum coverage to test and evaluate a massively parallel as well as general-purpose platform like the HMA. The blocks of the receiver are primarily designed by crafting template-based Coarse-Grained Reconfigurable Array (CGRA) devices and then arranging them in a sequence over a Network-on-Chip (NoC) structure along with a few RISC cores for complete OFDM processing. OFDM blocks such as Fast Fourier Transform (FFT) and Time Synchronization are computationally intensive and require parallel processing. The OFDM receiver also contains tasks such as frequency offset estimation, which require the processing of Taylor series and CORDIC algorithms that are serial in nature. Such a combination of serial and parallel algorithms enables a thorough exploration and evaluation of almost all the design features of an HMA. The OFDM implementation has led us to scale CGRAs to different dimensions, to instantiate Processing Elements (PEs) as multiple arithmetic resources, and to establish almost all possible ways of PE interconnections. It further explores time-multiplexed patterns for data placement in the CGRA memories. Nevertheless, the data can also be exchanged among different nodes over the NoC structure simultaneously and independently by using direct memory access devices. In this experimental work, the performance of each CGRA, the collective performance of the whole platform, and the NoC traffic are recorded in terms of the number of clock cycles and several high-level performance metrics. Today’s HMAs are generally over- or under-resourced for the applications that they are designed for and are thus not an optimal choice for the end user. Apart from the interesting comparisons with the state of the art, our experimental setup has provided important insight and guidelines that designers can use to implement near-optimal solutions for their target applications.

U2 - 10.1109/TPDS.2017.2706691

DO - 10.1109/TPDS.2017.2706691

M3 - Article

SP - 3171

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

SN - 1045-9219

ER -