A novel strategy to balance the results of cross-modal hashing

Zhong, Fangming; Chen, Zhikui; Min, Geyong; Xia, Feng

Title: A novel strategy to balance the results of cross-modal hashing
Creator: Zhong, Fangming; Chen, Zhikui; Min, Geyong; Xia, Feng
Date: 2020
Type: Text; Journal article
Identifier: http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/184698
Identifier: vital:16561
Identifier: https://doi.org/10.1016/j.patcog.2020.107523
Identifier: ISSN: 0031-3203
Abstract: Hashing methods for cross-modal retrieval have drawn increasing research interest and have been widely studied in recent years due to the explosive growth of multimedia big data. However, one significant phenomenon has been ignored: in most cases there is a large gap between the results of cross-modal hashing. For example, the results of Text-to-Image retrieval frequently outperform those of Image-to-Text retrieval by a large margin. In this paper, we propose a strategy named semantic augmentation to improve and balance the results of cross-modal hashing. An intermediate semantic space is constructed to re-align the feature representations that are embedded with weak semantic information. By using the intermediate semantic space, the semantic information of visual features can be further augmented before being sent to cross-modal hashing algorithms. Extensive experiments are carried out on four datasets with seven state-of-the-art cross-modal hashing methods. Compared with the results without semantic augmentation, the Image-to-Text results of these methods with semantic augmentation are improved considerably, which demonstrates the effectiveness of the proposed semantic augmentation strategy in bridging the gap between the results of cross-modal retrieval. Additional experiments are conducted on real-valued, semi-supervised, semi-paired, partial-paired, and unpaired cross-modal retrieval methods; the results further indicate the effectiveness of our strategy in improving the performance of cross-modal retrieval. © 2020 Elsevier Ltd
Publisher: Elsevier Ltd
Relation: Pattern Recognition Vol. 107, no. (2020), p.
Rights: All metadata describing materials held in, or linked to, the repository is freely available under a CC0 licence
Subject: 4603 Computer Vision and Multimedia Computation; 4605 Data Management and Data Science; 4611 Machine Learning; Cross-modal hashing; Cross-modal retrieval; Semantic augmentation; Semantic gap
Reviewed
Funder: This work was supported in part by the National Key Research and Development Program of China [grant number 2018YFC0831305].

