Data and Benchmarks from My Projects

The IndicTTI Benchmark for measuring bias in generative text-to-image (TTI) models across Indic languages, presented in our ECCV’24 paper, Navigating Text-to-Image Generative Bias across Indic Languages.
A collection of 100 trained generative models used in the project Reverse Engineering of Generative Models: Inferring Model Hyperparameters from Generated Images.
Proactive Image Manipulation Detection data and models.
For more information, please see our CVPR'22 paper.
IMGUR5K Handwriting Dataset
We provide a unique new data set for handwritten OCR and text image manipulation. Imgur5K offers around 135K annotated, handwritten English words from 5K images originally hosted publicly on Imgur.com. Please see our TPAMI'23 paper for more information.
The TextOCR data set
We offer ~1M high-quality word annotations on TextVQA images, allowing end-to-end reasoning on downstream tasks such as visual question answering or image captioning. Please see our CVPR'21 preprint for more information.
6DoF Pose annotations for our img2pose project
We provide 6DoF face pose annotations for faces in the WIDER FACE data set. These were used for training and evaluating img2pose, our method for direct 6DoF pose estimation without face detection or landmark localization. For more information, please see our CVPR'21 paper.
SKU-110K data set and benchmark
Dataset for our CVPR'19 paper, Precise Detection in Densely Packed Scenes. The benchmark measures object detection in scenes where images contain many objects, often appearing similar or even identical, positioned in close proximity. The 11,762 images in SKU-110K represent retail environments and average 147.4 bounding box-labeled objects (store shelf items) per image.
LFW3D
Frontal-facing, strongly aligned LFW images generated using our frontalization method.
Unfiltered Faces for Gender and Age Classification
Dataset of face images, labeled for age, gender, and identity, acquired by smartphones and other mobile devices and uploaded without manual filtering to online image repositories.
Dynamic Point-Cloud Data
Real scans and synthetic data for foreground/background motion segmentation.
Violent Flows Benchmark
Video benchmark for classification and detection of outbursts of violence in crowded scenes.
Action Similarity Labeling Benchmark (ASLAN)
Video benchmark for same/not-same classification of pairs of videos presenting human actions captured in the wild (YouTube).
YouTube Faces Benchmark
Video benchmark for same/not-same classification of pairs of videos presenting human faces. Modeled after the LFW benchmark.
LFW-a Data Set
Our own version of the LFW data set, aligned using commercial software.

CODE and DATA on this website are provided “as is”, without any guarantee made as to their suitability or fitness for any particular use. CODE may contain bugs, so its use is at your own risk. We take no responsibility for any damage of any sort that may unintentionally be caused through the use of any of these resources.