Active Re-ranking For Web Image Search
A B S T R A C T (See below for source code)
Image search re ranking methods usually fail to capture the user’s intention when the query term is ambiguous. Therefore, re ranking with user interactions, or active re ranking, is highly demanded to effectively improve the search performance. The essential problem in active re ranking is how to target the user’s intention. To complete this goal, this paper presents a structural information based sample selection strategy to reduce the user’s labeling efforts.
Furthermore, to localize the user’s intention in the visual feature space, a novel local-global discriminative dimension reduction algorithm is proposed. In this algorithm, a sub manifold is learned by transferring the local geometry and the discriminative information from the labelled images to the whole (global) image database. Experiments on both synthetic data-sets and a real Web image search data-set demonstrate the effectiveness of the proposed active re ranking scheme, including both the structural information based active sample selection strategy and the local-global discriminative dimension reduction algorithm.
High-dimensional vector-based learning, such as face recognition and digit image classification, has benefited from dimensionality reduction techniques for the low computational complexity and the informative representation of the data structure in the low-dimensional space . According to methodologies of different dimensionality reduction algorithms used, they can be mainly classified into three categories.
They usually make some assumptions on the data point cloud to preserve the local structures. However, the extension to new test data is limited. These three types of methods also have relationships. As we know, linear method can be kernelized if it only involves inner-product. Some direct non-linear embedding methods are also linearized, then they can be kernelized again (He et al., 2005b,a). The main advantages of the linearlization are
For instance, the linear method LPP (He et al.,2005b) linearized the manifold-based method Laplacian Eigenmaps (Belkin and Niyogi, 2003). Thus, LPP also has the locality preserving property in the Euclidean space and achieves good results. In general, the above linearlization methods are all unsupervised, which means they cannot use any label information.
Recently, some supervised dimensionality reduction algorithms brought the idea from the mentioned unsupervised methods. They start from the local structure of data and preserve the geometric information provided by both the data point cloud and the label information (Cai et al., 2007; Chen et al., 2005; Nie et al., 2007; Sugiyama, 2006; Yan et al., 2007). Sometimes, we cannot expect that there are sufficient labeled data samples to realize dimensionality reduction for classification task, since labeling work is both time consuming and costly.
Conversely, we can obtain many unlabeled data in the real world. Thus, semi-supervised dimensionality reduction, using some labeled data and lots of unlabeled data to find a low-dimensional representation, is of great practical importance (Zhu, 2005). In this paper, we propose a novel semi-supervised manifold discriminant analysis-based dimensionality reduction algorithm. We start from local and embed the data in global coordinate, which can separate each sub-manifold constructed by one class.
Based on this idea, we define the within-manifold, between-manifold and total manifold scatter matrices. The former two are supervised and the last is unsupervised constructed by labeled and unlabeled data. Semi-supervised dimensionality reduction is realized by both maximizing the between-manifold scatter and minimizing the within manifold scatter, simultaneously preserving the local structure of a manifold and the intrinsic geometry of both labeled and unlabeled data.
It is interesting of our method from the following perspectives:
We have three types of images: labelled relevant, labeled irrelevant, and unlabelled. Therefore, we build 3 types of patches, which are:
NON FUNCTIONAL REQUIREMENTS
The major non-functional Requirements of the system are as follows
Label information Collection
To collect the labeling information from users efficiently, a new structural information based strategy is proposed to actively select the most informative query images. It is boring and unacceptable to keep asking a user to label a lot of images in the interaction stage. Thus, it is essential to get the necessary information by labeling as few images as possible.
Active learning is well-known for reducing the labeling efforts, by labeling most informative samples. Conventional active learning strategies can be divided into two categories: the error reduction strategy and the most uncertain (close-to-boundary) strategy. Both of them suffer from the small sample size problem, i.e., the unreliable estimation of the expected error risk and the uncertainty caused by the insufficient labelled samples.
Visual characteristic localization
To localize the visual characteristics of the user’s intention, we propose a novel local-global discriminative (LGD) dimension reduction algorithm. Basically, we assume that the query relevant images, which represent the user’s intention, are lying on a low-dimensional sub manifold of the original ambient (visual feature) space. LGD learns this sub manifold by transferring both the local geometry and the discriminative information from labelled images to unlabelled ones.
The learned sub manifold preserves both the local geometry of labelled relevant images and the discriminative information to separate relevant from irrelevant images. As a consequence, we can eliminate the well-known semantic gap between low-level visual features and high-level semantics to further enhance the re ranking performance on this sub manifold.
Re ranking implementation process
The proposed general framework for active re ranking in Web image search. Take the query term “panda” as an example. When “panda” is submitted to the Web image search engine, an initial text-based search result is returned to the user. This result is unsatisfactory because both person and animal images are retrieved as top results. This is caused by the ambiguity of the query term. Without the user interactions, it is impossible to eliminate this ambiguity.
In particular, which kind of images, animal panda or person whose name is Panda, are user’s intention? Therefore, traditional re ranking methods, which improve the initial search results by only utilizing the visual property of images, cannot achieve good performances.
Active Sample Selection
An SInfo active sample selection strategy is presented to learn the user’s intention efficiently which selects images by considering not only the ambiguity but also the representativeness in the whole image database. Ambiguity and representativeness are two important aspects in active sample selection. Labeling a sample which is more ambiguous will bring more information.
On the other side, the information provided by individual sample can be shared by its neighbors. Therefore, the more representative samples are preferred for labeling. In SInfo , the ambiguity of an image is measured by the entropy of the relevance probability distribution while the representativeness is measured by the density.
The below links contains table of contents, screen shots, UML diagrams and documentation of Active Re-ranking for Web Image Search.