Code and Software Tools from My Projects
Code accompanying the IndicTTI (text-to-image) dataset and paper.
For more details, see our ECCV’24 paper Navigating Text-to-Image Generative Bias across Indic Languages.
Reverse Engineering of Generative Models: Inferring Model Hyperparameters from Generated Images
Estimating the hyperparameters of a generative model from a photo it generated. Official PyTorch implementation, data, and models used in our experiments. For more details, see the project page for our TPAMI’23 paper.
Proactive Image Manipulation Detection
Protect your media from manipulation. Official PyTorch implementation and data release. For more information, please see our CVPR'22 paper.
Out-of-Distribution (OOD) Detection
A single-layer network add-on that adds OOD detection capabilities. For more information, please see our NeurIPS'21 paper.
Multiplexer: end-to-end, multilingual OCR
Multilingual OCR pipeline, including detection, segmentation, language identification, and character recognition. For more information, please see our CVPR'21 paper.
HyperSeg - Official PyTorch Implementation
Code for our state-of-the-art, real-time semantic segmentation method, which uses a novel hyper-network approach. For more information, please see our CVPR'21 paper.
img2pose implementation and data
State-of-the-art, real-time face detection and 3D alignment by direct 6DoF face pose estimation. No more face detection or facial landmark localization models! We provide code, models, sample notebooks, pose annotations, and benchmark evaluation scripts. For more information, please see our CVPR'21 paper.
FSGAN - Official PyTorch Implementation
Code and models for our subject-agnostic face swapping and reenactment method. Please see the FSGAN project page for the paper and more details.
Network layer for 3D face view generation
Used for face-specific data augmentation: this layer renders novel face views online during training, with minimal additional compute cost and no storage cost for the augmented face images. The method is described in this paper.
Extreme 3D Face Reconstruction
Deep models and code for estimating detailed 3D face shapes, including facial expressions and viewpoint. This project extends the code used for our CNN3DMM project from our CVPR’17 paper. The method is described in this preprint. A Docker image is also available for easy installation of the model and code.
FaceExpressionNet (ExpNet)
Deep models and code for estimating the expression bases of a 3D face shape directly from image intensities, without the use of facial landmark detectors.
FacePoseNet
Deep, direct estimation of 6-degrees-of-freedom head pose for 2D and 3D face alignment. The Python code also includes fast rendering for new-view synthesis of face photos to three poses, including frontalization. The network was updated to ResNet-101 with considerable improvement in accuracy.
Temporal Tessellation
A unified approach for video analysis, shown to obtain state-of-the-art results in video captioning (LSMDC'16 benchmark), video summarization (SumMe and TVSum benchmarks), and temporal action detection (THUMOS'14 benchmark).
Face Segmentation and Face Swapping
Code and deep models for face segmentation and swapping of unconstrained images and arbitrarily selected image pairs.
Very Deep Network for Regressing 3D Morphable Face Models (3DMM)
A fast, robust and discriminative method for estimating 3DMM parameters.
CLATCH: A CUDA Port for the LATCH Binary Descriptor
LATCH local binary descriptors at breakneck extraction speeds.
Face Recognition With Augmented Data
Code, trained models, and data are in preparation. Please see the project page for updates.
Convolutional Neural Network for Facial Landmark Detection
Caffe models, code and example usage.
Code and trained Convolutional Neural Networks for emotion recognition from single face images.
GPU-Based Computation of 2D Least Median of Squares
Fast Least Median of Squares as a more robust substitute for 2D Least Squares, implemented on the GPU.
The LATCH Binary Descriptor
The Learned Arrangements of Three Patch Codes (LATCH) local binary descriptor, implemented as part of OpenCV 3.0.
Face Frontalization for Recognition
MATLAB code for synthesizing aggressively aligned, forward-facing new views of faces in unconstrained images.
Convolutional Neural Networks for Age and Gender Classification
Used with the Adience benchmark of unfiltered face images.
Scale Propagation
For scale-invariant dense correspondence estimation across images of different scenes (used with, e.g., SIFT-Flow).
In-Plane Alignment of Faces
A robust face alignment technique which explicitly considers the uncertainties of facial feature detectors.
Identification of Larval Feeding Strikes
A framework for automated detection of prey acquisition strikes from a long video of foraging larvae.
MATLAB 3D Model Renderer
MATLAB functions for rendering textured 3D models and using them to estimate the 6DoF pose of objects appearing in images. See inside for example usage in estimating head pose. Code for our face animation demo is also available.
Motion Interchange Patterns (MIP)
Code for computing the MIP video representation for action recognition.
Violent Flows (ViF) Descriptor
Code for extracting the ViF video representation for violent action detection in videos of crowded scenes.
Scale-Less SIFT (SLS) Descriptor
Extracts the SLS descriptor on a dense grid, in order to allow for dense correspondences between images with varying scales.
Matched Background Similarity (MBGS) and baseline methods
Sources for computing the similarities of faces appearing in videos for face video verification (set-to-set similarities). Includes both our own MBGS and the baseline methods tested.
The One-Shot Similarity (OSS) Kernel
MATLAB code for efficiently computing the OSS similarity kernel.
Patch LBP Code
MATLAB code for computing the Three-Patch LBP (TPLBP) and the Four-Patch LBP (FPLBP) descriptors for face and texture images.
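To illustrate the idea behind this descriptor family, here is a minimal Python sketch of a TPLBP-style code for a single pixel: each bit compares the distances of two ring patches (a fixed step apart) to the central patch. The parameter defaults (ring of S = 8 patches, patch size w, radius r, step alpha, threshold tau) are illustrative assumptions only; the released MATLAB code is the reference implementation.

```python
import numpy as np

def tplbp_code(img, y, x, r=2, S=8, w=3, alpha=2, tau=0.01):
    """Sketch of a Three-Patch-LBP-style code for the pixel at (y, x).

    Bit i is set when the ring patch i is farther from the central
    patch than the ring patch alpha steps ahead (by more than tau).
    All parameter defaults here are hypothetical, for illustration.
    """
    h = w // 2

    def patch(cy, cx):
        # w x w patch centered at (cy, cx); caller must stay off the border
        return img[cy - h:cy + h + 1, cx - h:cx + h + 1].astype(float)

    center = patch(y, x)
    # S patches evenly spaced on a ring of radius r around the pixel
    ring = [patch(int(round(y + r * np.sin(2 * np.pi * i / S))),
                  int(round(x + r * np.cos(2 * np.pi * i / S))))
            for i in range(S)]

    code = 0
    for i in range(S):
        d1 = np.sum((ring[i] - center) ** 2)
        d2 = np.sum((ring[(i + alpha) % S] - center) ** 2)
        code |= int(d1 - d2 > tau) << i  # one bit per patch pair
    return code
```

In practice such per-pixel codes are computed densely and pooled into regional histograms to form the final image descriptor.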
The CODE and DATA provided on this website are provided “as is”, without any guarantee as to their suitability or fitness for any particular use. The CODE may contain bugs, so use of these tools is at your own risk. We take no responsibility for any damage of any sort that may unintentionally be caused through the use of any of these resources.