Stars
Codebase for "Jodi: Unification of Visual Generation and Understanding via Joint Modeling"
This repository is the official implementation of the paper "Understanding Few-Shot Learning: Measuring Task Relatedness and Adaptation Difficulty via Attributes" in Neural Information Processing S…
This repository contains the reference source code for the paper ["Scalable Modular Network: A Framework for Adaptive Learning via Agreement Routing"](https://openreview.net/forum?id=pEKJl5sflp) in…
Official code for our WACV 2024 paper "Interpretable Object Recognition by Semantic Prototype Analysis"
Code for the WACV 2023 paper "Semantic Guided Latent Parts Embedding for Few-Shot Learning"
Breaking Boundary Between Pre-training and Fine-tuning with Hybrid Prompting for Knowledge-Based VQA
Custom node and script for sending webcam to ComfyUI
Official implementation of the BMVC 2023 oral paper "Describe Your Facial Expressions by Linking Image Encoders and Large Language Models"
State-of-the-art 2D and 3D Face Analysis Project
[WACV'25 Oral] Precise Integral in NeRFs: Overcoming the Approximation Errors of Numerical Quadrature
Build image datasets based on torch and torchvision.
Implement visual tokenizers with PyTorch.
⚡ Batch Face Processing for Fast Modern Research, including face detection, face alignment, face reconstruction, head pose estimation, and face parsing
Resources about Sign Language Processing (e.g., Sign Language Recognition / Translation / Production)
[ICLR 2025] Codebase for "CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation"
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
Anonymous Github is a proxy server to support anonymous browsing of Github repositories for open-science code and data.
Codebase of ICCV 2023 paper "Hierarchical Contrastive Learning for Pattern-Generalizable Image Corruption Detection"
ICCV23 "Householder Projector for Unsupervised Latent Semantics Discovery"
BayLing (百聆) is a LLaMA-based English/Chinese large language model enhanced with language alignment. It shows superior English/Chinese capability and reaches 90% of ChatGPT's performance across multiple multilingual and general-task evaluations.
My best practices for training on large datasets with PyTorch.
[ICLR 2024] Official pytorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
[NeurIPS 2022] Official pytorch implementation of "Towards Diverse and Faithful One-shot Domain Adaption of Generative Adversarial Networks"
Implement Diffusion Models with PyTorch.