Learning to Classify Fine-Grained Categories with Privileged Visual-Semantic Misalignment
Research output: Contribution to journal › Article › Scientific › peer-review
|Journal||IEEE Transactions on Big Data|
|Publication status||Published - 2016|
|Publication type||A1 Journal article-refereed|
Image categorisation is an active yet challenging research topic in computer vision, which is to classify the images according to their semantic content. Recently, fine-grained object categorisation has attracted wide attention and remains difficult due to feature inconsistency caused by smaller inter-class and larger intra-class variation as well as large varying poses. Most of the existing frameworks focused on exploiting a more discriminative imagery representation or developing a more robust classification framework to mitigate the suffering. The concern has recently been paid to discovering the dependency across fine-grained class labels based on Convolutional Neural Networks. Encouraged by the success of semantic label embedding to discover the fine-grained class labels’ correlation, this paper exploits the misalignment between visual feature space and semantic label embedding space and incorporates it as a privileged information into a cost-sensitive learning framework. Owing to capturing both the variation of imagery feature representation and also the label correlation in the semantic label embedding space, such a visual-semantic misalignment can be employed to reflect the importance of instances, which is more informative that conventional cost-sensitivities. Experiment results demonstrate the effectiveness of the proposed framework on public fine-grained benchmarks with achieving superior performance to state-of-the-arts.