TUTCRIS - Tampereen teknillinen yliopisto


Learning to Classify Fine-Grained Categories with Privileged Visual-Semantic Misalignment



JulkaisuIEEE Transactions on Big Data
DOI - pysyväislinkit
TilaJulkaistu - 2016
OKM-julkaisutyyppiA1 Alkuperäisartikkeli


Image categorisation is an active yet challenging research topic in computer vision, which is to classify the images according to their semantic content. Recently, fine-grained object categorisation has attracted wide attention and remains difficult due to feature inconsistency caused by smaller inter-class and larger intra-class variation as well as large varying poses. Most of the existing frameworks focused on exploiting a more discriminative imagery representation or developing a more robust classification framework to mitigate the suffering. The concern has recently been paid to discovering the dependency across fine-grained class labels based on Convolutional Neural Networks. Encouraged by the success of semantic label embedding to discover the fine-grained class labels’ correlation, this paper exploits the misalignment between visual feature space and semantic label embedding space and incorporates it as a privileged information into a cost-sensitive learning framework. Owing to capturing both the variation of imagery feature representation and also the label correlation in the semantic label embedding space, such a visual-semantic misalignment can be employed to reflect the importance of instances, which is more informative that conventional cost-sensitivities. Experiment results demonstrate the effectiveness of the proposed framework on public fine-grained benchmarks with achieving superior performance to state-of-the-arts.