Stars
This repo holds the implementation of PAVE: Patching and Adapting Video Large Language Models (CVPR 2025)
Code release for RICA^2: Rubric-Informed, Calibrated Assessment of Actions (ECCV 2024)
Code release for "Deep Learning to Quantify Care Manipulation Activities in Neonatal Intensive Care Units"
Dataset of measurements from a low-cost single-photon camera used in our CVPR 2024 paper "Towards 3D Vision with Low-Cost Single-Photon Cameras"
[ICML 2024 Oral] LSH-Based Efficient Point Transformer (HEPT)
Official implementation of CVPR 2024 paper: "FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition"
Code for our paper "Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers"
Code release for ActionFormer (ECCV 2022)
Data repo for mRI: Multi-modal 3D Human Pose Estimation Dataset using mmWave, RGB-D, and Inertial Sensors
[CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations"
Code for MRI data de-identification and packaging
Sample code for deploying Gadgetron image reconstruction in Azure
Gadgetron - Medical Image Reconstruction Framework
Hierarchical Image Pyramid Transformer - CVPR 2022 (Oral)
I use U-Net to reconstruct Fresnel diffraction images.
A latent text-to-image diffusion model
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks (ICCVW 2021)
The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.
[ICCV 2021] Official code for "Learning to Generate Scene Graph from Natural Language Supervision"
Code for "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks"