Tampere University of Technology

TUTCRIS Research Portal

Generalized Multi-view Embedding for Visual Recognition and Cross-modal Retrieval

Research output: Contribution to journal › Article › Scientific › peer-review

Standard

Generalized Multi-view Embedding for Visual Recognition and Cross-modal Retrieval. / Cao, Guanqun; Iosifidis, Alexandros; Chen, Ke; Gabbouj, Moncef.

In: IEEE Transactions on Cybernetics, Vol. 48, No. 9, 09.2018, p. 2542-2555.


Harvard

Cao, G, Iosifidis, A, Chen, K & Gabbouj, M 2018, 'Generalized Multi-view Embedding for Visual Recognition and Cross-modal Retrieval', IEEE Transactions on Cybernetics, vol. 48, no. 9, pp. 2542-2555. https://doi.org/10.1109/TCYB.2017.2742705

APA

Cao, G., Iosifidis, A., Chen, K., & Gabbouj, M. (2018). Generalized Multi-view Embedding for Visual Recognition and Cross-modal Retrieval. IEEE Transactions on Cybernetics, 48(9), 2542-2555. https://doi.org/10.1109/TCYB.2017.2742705

Vancouver

Cao G, Iosifidis A, Chen K, Gabbouj M. Generalized Multi-view Embedding for Visual Recognition and Cross-modal Retrieval. IEEE Transactions on Cybernetics. 2018 Sep;48(9):2542-2555. https://doi.org/10.1109/TCYB.2017.2742705

Author

Cao, Guanqun ; Iosifidis, Alexandros ; Chen, Ke ; Gabbouj, Moncef. / Generalized Multi-view Embedding for Visual Recognition and Cross-modal Retrieval. In: IEEE Transactions on Cybernetics. 2018 ; Vol. 48, No. 9. pp. 2542-2555.

BibTeX

@article{73f3c8fabad94794aac716b54b8bb780,
title = "Generalized Multi-view Embedding for Visual Recognition and Cross-modal Retrieval",
abstract = "In this paper, the problem of multi-view embedding from different visual cues and modalities is considered. We propose a unified solution for subspace learning methods using the Rayleigh quotient, which is extensible for multiple views, supervised learning, and non-linear embeddings. Numerous methods including Canonical Correlation Analysis, Partial Least Squares regression and Linear Discriminant Analysis are studied using specific intrinsic and penalty graphs within the same framework. Non-linear extensions based on kernels and (deep) neural networks are derived, achieving better performance than the linear ones. Moreover, a novel Multi-view Modular Discriminant Analysis (MvMDA) is proposed by taking the view difference into consideration. We demonstrate the effectiveness of the proposed multi-view embedding methods on visual object recognition and cross-modal image retrieval, and obtain superior results in both applications compared to related methods.",
author = "Guanqun Cao and Alexandros Iosifidis and Ke Chen and Moncef Gabbouj",
year = "2018",
month = "9",
doi = "10.1109/TCYB.2017.2742705",
language = "English",
volume = "48",
pages = "2542--2555",
journal = "IEEE Transactions on Cybernetics",
issn = "2168-2267",
publisher = "IEEE Advancing Technology for Humanity",
number = "9",

}

RIS (suitable for import to EndNote)

TY - JOUR

T1 - Generalized Multi-view Embedding for Visual Recognition and Cross-modal Retrieval

AU - Cao, Guanqun

AU - Iosifidis, Alexandros

AU - Chen, Ke

AU - Gabbouj, Moncef

PY - 2018/9

Y1 - 2018/9

N2 - In this paper, the problem of multi-view embedding from different visual cues and modalities is considered. We propose a unified solution for subspace learning methods using the Rayleigh quotient, which is extensible for multiple views, supervised learning, and non-linear embeddings. Numerous methods including Canonical Correlation Analysis, Partial Least Squares regression and Linear Discriminant Analysis are studied using specific intrinsic and penalty graphs within the same framework. Non-linear extensions based on kernels and (deep) neural networks are derived, achieving better performance than the linear ones. Moreover, a novel Multi-view Modular Discriminant Analysis (MvMDA) is proposed by taking the view difference into consideration. We demonstrate the effectiveness of the proposed multi-view embedding methods on visual object recognition and cross-modal image retrieval, and obtain superior results in both applications compared to related methods.

AB - In this paper, the problem of multi-view embedding from different visual cues and modalities is considered. We propose a unified solution for subspace learning methods using the Rayleigh quotient, which is extensible for multiple views, supervised learning, and non-linear embeddings. Numerous methods including Canonical Correlation Analysis, Partial Least Squares regression and Linear Discriminant Analysis are studied using specific intrinsic and penalty graphs within the same framework. Non-linear extensions based on kernels and (deep) neural networks are derived, achieving better performance than the linear ones. Moreover, a novel Multi-view Modular Discriminant Analysis (MvMDA) is proposed by taking the view difference into consideration. We demonstrate the effectiveness of the proposed multi-view embedding methods on visual object recognition and cross-modal image retrieval, and obtain superior results in both applications compared to related methods.

U2 - 10.1109/TCYB.2017.2742705

DO - 10.1109/TCYB.2017.2742705

M3 - Article

VL - 48

SP - 2542

EP - 2555

JO - IEEE Transactions on Cybernetics

JF - IEEE Transactions on Cybernetics

SN - 2168-2267

IS - 9

ER -