Tampere University of Technology

TUTCRIS Research Portal

Robust signal processing methods for genomic time series and protein accessibility data

Research output: Collection of articles › Doctoral Thesis

Details

Original language: English
Place of Publication: Tampere
Publisher: Tampere University of Technology
Number of pages: 84
ISBN (Electronic): 978-952-15-1911-6
ISBN (Print): 978-952-15-1872-0
State: Published - 28 Nov 2008
Publication type: G5 Doctoral dissertation (article)

Publication series

Name: Tampere University of Technology. Publication
Publisher: Tampere University of Technology
Volume: 692
ISSN (Print): 1459-2045

Abstract

The aim of systems biology is to study living beings at the system level. This means that instead of studying just single molecules, we also try to understand the dynamics of larger systems such as biochemical and gene regulatory networks. Moving to the genome- and proteome-wide level brings great opportunities but also challenges. The introduction of high-throughput measurement technologies for cellular-level studies during the last decade has made it necessary to use advanced signal processing methods in computational systems biology and bioinformatics. The gene activity and protein level measurement technologies available today produce huge amounts of data that cannot be processed manually. Thus, advanced computational methods for analysing the data and drawing conclusions are essential. The aim of this thesis is to introduce efficient signal processing methods that can be used in making relevant decisions based on systems biological measurement data. The thesis is divided into three logical parts. In the first part, gene expression microarray measurements are studied. These measurements provide one of the main types of data used in the analyses later in the thesis. A simulation model is then introduced for the generation of microarray data with realistic statistical and biological properties. Such data can be used, for example, as ground truth in simulation studies. In the second part, time series signals measured from genes with microarrays are studied. Periodicity detection in gene microarray data is especially difficult because of the short time series, the vast number of measured genes, and the unknown type of noise in the measurements. We introduce different robust methods for both uniformly and nonuniformly sampled time series. The introduced methods are shown to be insensitive to changes in the assumed statistical model for the data and thus improve robustness compared with classical methods.
Finally, in the third part we move from genomic data to the actual end products of genes: proteins. A method is presented that can discern locations in the protein sequence that are, on average, more prone to pathogenic mutations than other locations in the sequence. The data we use are measured from clinical patients and depict the hydropathy of different parts of the sequence. Changes in the hydropathy of a protein have been shown to relate to structural and functional changes and thus provide an interesting field of study.