Doodle to Search
Practical Zero-Shot Sketch-based Image Retrieval

Sounak Dey*, Pau Riba*, Anjan Dutta, Josep Lladós

Computer Vision Center, UAB

{sdey, priba, adutta, josep}@cvc.uab.cat

Yi-Zhe Song

SketchX, CVSSP, University of Surrey

y.song@surrey.ac.uk



Abstract

In this paper, we investigate the problem of zero-shot sketch-based image retrieval (ZS-SBIR), where human sketches are used as queries to retrieve photos from unseen categories. We advance prior art by proposing a novel ZS-SBIR scenario that represents a firm step forward in its practical application. The new setting recognizes two important yet often neglected challenges of practical ZS-SBIR: (i) the large domain gap between amateur sketches and photos, and (ii) the need to move towards large-scale retrieval. We first contribute to the community a novel ZS-SBIR dataset, QuickDraw-Extended, that consists of 330,000 sketches and 204,000 photos spanning 110 categories. Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, in contrast to the sketches in existing datasets, which can often be semi-photorealistic. We then formulate a ZS-SBIR framework that jointly models sketches and photos in a common embedding space. A novel strategy to mine the mutual information among domains is specifically engineered to alleviate the domain gap. External semantic knowledge is further embedded to aid semantic transfer. We show that, rather surprisingly, a reduced version of our model already achieves retrieval performance that significantly outperforms the state of the art on existing datasets. We further demonstrate the superior performance of our full model by comparing it with a number of alternatives on the newly proposed dataset.

Poster

Download Paper

BibTeX




    @InProceedings{Dey_2019_CVPR,
        author={Dey, Sounak and Riba, Pau and Dutta, Anjan and Lladós, Josep and Song, Yi-Zhe},
        title={Doodle To Search: Practical Zero-Shot Sketch-based Image Retrieval},
        booktitle={The IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)},
        month = {June},
        year={2019}
    }

QuickDraw-Extended Dataset

This dataset builds on the Google Quick, Draw! data, a huge collection of drawings (50 million) belonging to 345 categories, obtained from the Quick, Draw! game. We use a subset of these sketches to construct a novel dataset for large-scale ZS-SBIR containing 110 categories (80 for training and 30 for testing). As a retrieval gallery, we provide images extracted from Flickr tagged with the corresponding label. In total, the dataset consists of 330,000 sketches and 203,885 photos, moving towards large-scale retrieval. We expect this dataset to provide better insight into how ZS-SBIR methods perform in a realistic scenario.
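To make the zero-shot protocol concrete, the following minimal Python sketch partitions 110 category names into 80 seen (training) and 30 unseen (testing) categories that share no label. It is only an illustration under stated assumptions: the category names are hypothetical placeholders, and the seeded random partition stands in for the fixed split distributed with the dataset. The helpers `zero_shot_split` and `filter_by_category` are names introduced here, not part of the released code.

```python
import random

# Counts taken from the dataset description above.
NUM_CATEGORIES = 110
NUM_TRAIN = 80   # seen categories, used for training
NUM_TEST = 30    # unseen categories, used only at test time


def zero_shot_split(categories, num_train, seed=0):
    """Partition category names into disjoint seen/unseen sets.

    Assumption: a seeded random partition is used purely for illustration;
    the actual dataset ships with its own fixed split.
    """
    shuffled = sorted(categories)
    random.Random(seed).shuffle(shuffled)
    return set(shuffled[:num_train]), set(shuffled[num_train:])


def filter_by_category(samples, allowed):
    """Keep only the (path, label) pairs whose label is in `allowed`."""
    return [(path, label) for path, label in samples if label in allowed]


# Hypothetical placeholder names; the real labels come from Quick, Draw!.
categories = [f"category_{i:03d}" for i in range(NUM_CATEGORIES)]
train_cats, test_cats = zero_shot_split(categories, NUM_TRAIN)

# Hypothetical sample list: one sketch path per category.
samples = [(f"sketches/{c}/0.png", c) for c in categories]
train_samples = filter_by_category(samples, train_cats)
test_samples = filter_by_category(samples, test_cats)
```

The property that matters for ZS-SBIR is that the two category sets are disjoint: sketches and photos of the test categories are never seen during training, so retrieval on them measures generalization to unseen classes.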





Query Sketches


Photo Gallery

Slides

Video presentation

Code

Code to reproduce the experiments

Supplementary material

Paper supplementary

Creative Commons License

License

Non-commercial research purpose only.