VIPeR: Viewpoint Invariant Pedestrian Recognition

To evaluate the performance of an appearance model in a modern surveillance system, one requires a dataset that contains a significant amount of viewpoint and illumination variation. The VIPeR dataset contains 632 pedestrian image pairs taken from arbitrary viewpoints under varying illumination conditions. The data were collected in an academic setting over the course of several months. Each image is scaled to 128x48 pixels. The complete dataset can be downloaded here:

For detailed instructions on how to evaluate performance for recognition, reacquisition or tracking, please refer to the following paper:

D. Gray, S. Brennan, and H. Tao, "Evaluating Appearance Models for Recognition, Reacquisition, and Tracking," in Proc. IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS), 2007.
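
As a rough illustration of what such an evaluation can look like, the sketch below computes a cumulative matching characteristic (CMC) curve, a measure commonly reported for recognition on image-pair datasets of this kind. It is only a sketch under stated assumptions, not the protocol from the paper: the function name cmc_curve, the square distance matrix, and the convention that probe i matches gallery image i are all hypothetical choices made for this example, and the toy distances are invented rather than taken from VIPeR. The paper above remains the authoritative description of the evaluation procedure.

```python
from typing import List, Sequence


def cmc_curve(distances: Sequence[Sequence[float]]) -> List[float]:
    """Cumulative matching characteristic for a square distance matrix.

    distances[i][j] is the distance between probe i and gallery image j.
    Assumption (hypothetical convention): the true match for probe i is
    gallery image i.
    Entry k-1 of the result is the fraction of probes whose true match
    is ranked within the top k gallery images.
    """
    n = len(distances)
    hits_at_rank = [0] * n
    for i, row in enumerate(distances):
        # Zero-based rank of the true match: the number of gallery
        # images that are strictly closer to the probe than it is.
        rank = sum(1 for d in row if d < row[i])
        hits_at_rank[rank] += 1

    # Accumulate the per-rank counts into a cumulative fraction.
    curve = []
    total = 0
    for hits in hits_at_rank:
        total += hits
        curve.append(total / n)
    return curve


# Toy distances for three hypothetical pedestrians (not real VIPeR data).
toy = [
    [0.1, 0.5, 0.9],  # probe 0: true match ranked first
    [0.2, 0.4, 0.3],  # probe 1: true match ranked third
    [0.7, 0.6, 0.8],  # probe 2: true match ranked third
]
print(cmc_curve(toy))
```

With real data, the distance matrix would come from whatever appearance model is being evaluated, with one camera view of each pair used as the probe set and the other as the gallery.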

For comparisons to our latest work on the subject, see:


D. Gray and H. Tao, "Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features," in Proc. European Conference on Computer Vision (ECCV), 2008.


If you use our dataset, find it useful, or have results to share, please let us know. We will try to keep a list of the latest results here.