Tampere University of Technology

TUTCRIS Research Portal

Feature Dimensionality Reduction with Graph Embedding and Generalized Hamming Distance

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Scientific, peer-review


Original language: English
Title of host publication: 2018 25th IEEE International Conference on Image Processing (ICIP)
Number of pages: 5
ISBN (Electronic): 978-1-4799-7061-2
ISBN (Print): 978-1-4799-7062-9
Publication status: Published - Oct 2018
Publication type: A4 Article in a conference publication
Event: IEEE International Conference on Image Processing
Duration: 1 Jan 1900 → …

Publication series

ISSN (Electronic): 2381-8549




Principal component analysis (PCA) and linear discriminant analysis (LDA) are the best-known methods for reducing the dimensionality of feature vectors. However, both methods face challenges when used on multilabel data, where each data point may be associated with multiple labels. PCA does not take advantage of label information, so performance is sacrificed. LDA can exploit class information for multiclass data but cannot be directly applied to multilabel problems. In this paper, we propose a novel dimensionality reduction method for multilabel data. We first introduce the generalized Hamming distance, which measures the distance between two data points in the label space. The proposed distance is then used in the graph embedding framework for feature dimensionality reduction. We verified the proposed method on three multilabel benchmark datasets and one large image dataset. The results show that the proposed feature dimensionality reduction method consistently outperforms PCA and other competing methods.
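The pipeline outlined in the abstract (a label-space distance feeding a graph embedding) can be illustrated with a minimal sketch. The abstract does not define the generalized Hamming distance or say which graph embedding variant is used, so the code below rests on stated assumptions: it substitutes the ordinary normalized Hamming distance between binary label vectors, and it uses a locality-preserving-projection style linear embedding as one possible instance of the graph embedding framework. The function names `hamming_label_distance` and `label_graph_embedding` and the `reg` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def hamming_label_distance(Y):
    """Pairwise normalized Hamming distance between binary label rows.

    ASSUMPTION: stand-in for the paper's generalized Hamming distance,
    whose definition is not given in the abstract.
    """
    Y = np.asarray(Y, dtype=float)
    # Count differing labels for every pair, then divide by the label count.
    return np.abs(Y[:, None, :] - Y[None, :, :]).sum(axis=2) / Y.shape[1]

def label_graph_embedding(X, Y, n_components=2, reg=1e-6):
    """Linear graph embedding (LPP-style) driven by label-space affinity.

    ASSUMPTION: one possible instance of the graph embedding framework,
    not necessarily the formulation used in the paper.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                    # center the features
    W = 1.0 - hamming_label_distance(Y)       # similar labels -> large weight
    np.fill_diagonal(W, 0.0)                  # no self-loops
    D = np.diag(W.sum(axis=1))                # degree matrix
    L = D - W                                 # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(X.shape[1])  # regularized for stability
    # Solve A p = lambda B p by whitening with the Cholesky factor of B.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    M = (M + M.T) / 2                         # enforce symmetry numerically
    _, V = np.linalg.eigh(M)                  # eigenvalues in ascending order
    P = C_inv.T @ V[:, :n_components]         # smallest eigenvalues first
    return X @ P, P

# Toy usage with random features and binary multilabel targets.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 4))
Y_demo = rng.integers(0, 2, size=(6, 3))
Z_demo, P_demo = label_graph_embedding(X_demo, Y_demo, n_components=2)
```

In this sketch, points whose label vectors differ in few positions receive large edge weights, so the projection tends to keep them close in the reduced space. Replacing `hamming_label_distance` with the paper's actual generalized distance would be the natural next step once its definition is at hand.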


  • feature extraction, graph theory, learning (artificial intelligence), pattern classification, principal component analysis, vectors, competing methods, feature dimensionality reduction method, multilabel benchmark datasets, graph embedding framework, label space, generalized hamming distance, multilabel problems, multiclass data, class information, label information, PCA, data point, multilabel data, feature vectors, linear discriminant analysis, Dimensionality reduction, Principal component analysis, Hamming distance, Mutual information, Measurement, Dogs, Linear programming, dimensionality reduction, graph embedding, multilabel
