Learning from Real-World Data Challenges for Similarity Search
Author: Sarah Ibrahimi
Promotor(s): Prof.dr. M. Worring / Prof.dr. Z.J.M.H. Geradts
University: University of Amsterdam
Year: 2024
Link to repository: Link to thesis
Abstract
This thesis investigates how to perform robust and reliable similarity search in real-world settings with data complexities and variations. Models should adapt constantly to these circumstances and be capable of handling the nuances of real-world data. Throughout this thesis, we focus on the research question: How can we design similarity search methods that can adapt to the complexities and variations present in real-world data? We explore this question through multiple facets of real-world data. We start with similarity search in the image domain, tackling the real-world complexity of noisy labels, then investigate the real-world data variation of image data across domains. Building upon these concepts, we study a specific image similarity search setup that addresses both noisy labels and cross-domain data simultaneously. We then introduce another real-world data variation, the challenge of dealing with multiple modalities. We explore how manifold selection can enhance similarity search in multimodal data. Finally, we investigate how exploiting additional modalities can provide complementary information to improve similarity search performance.