TUTCRIS - Tampere University of Technology

Application-Specific Parallel Structures for Discrete Cosine Transform and Variable Length Decoding

Research output

Details

Original language: English
Place of publication: Tampere
Publisher: Tampere University of Technology
Number of pages: 127
ISBN (electronic): 952-15-1405-1
ISBN (print): 952-15-1196-6
Status: Published - 18 June 2004
OKM publication type: G5 Doctoral dissertation (article)

Publication series

Name: Tampere University of Technology. Publication
Publisher: Tampere University of Technology
Volume: 481
ISSN (print): 1459-2045

Abstract

This thesis considers the design of application-specific parallel structures for digital signal processing. Because the subject is broad, the discussion is restricted to two case studies: the discrete cosine transform and variable length decoding.
New area-efficient parallel structures, which process data in sequential form at the data rate, are developed for the discrete cosine transform. The development of the structures begins with the derivation of novel regular fast algorithms. The algorithms lend themselves to vertical mapping, resulting in modular cascaded structures that can be freely pipelined because they are loop-free. To prove the feasibility and estimate the performance, the unified transform kernel for the discrete cosine transform and its inverse is implemented on a standard cell CMOS technology with data path synthesis. Finally, a comparison with a state-of-the-art design shows an estimated area up to 15% smaller than that of the reference design.
For variable length decoding, a novel multiple-symbol decoding scheme is proposed. The critical path of the resulting decoder is minimized by introducing a new multiplexed add unit. To prove the feasibility and determine the limiting factors of the scheme, the decoder has been implemented on an FPGA technology. When applied to MPEG-2 standard benchmark scenes, the decoder decodes on average 4.8 codewords per cycle, resulting in a throughput of 106 million symbols per second. Although a straightforward and fair comparison of variable length decoders is extremely difficult because of differing implementation approaches, the performance of the decoder can be considered promising: it achieves 16-100% better throughput at 2-3.6 times lower frequencies than the reference designs on the same FPGA technology.
In both case studies, the discrete cosine transform and variable length decoding, the modularity and achievable high-speed operation provide flexibility for design reuse in current and future applications.
