Stars
This repo hosts the code and models of "Masked Autoencoders that Listen".
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Single-Side Domain Generalization for Face Anti-Spoofing, CVPR2020
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
Automatically generate and overlay subtitles for any video.
✨✨Latest Advances on Multimodal Large Language Models
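Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments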
A curated list of Artificial Intelligence Top Tools
An easy/swift-to-adapt PyTorch-Lightning template. A simple, easy-to-use wrapper template: with minor changes to your original PyTorch code, it adapts to Lightning. You can translate your previous PyTorch code much more easily using this template, and keep your freedom to edit a…
Foundational Models for State-of-the-Art Speech and Text Translation
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
**Interspeech 2022** "SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks". Speech processing with the prompting paradigm
Faster Whisper transcription with CTranslate2
A dataset for Audio-Visual Sound Event Detection in Movies
Open-source and strong foundation image recognition models.
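Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center of the IDEA Research Institute, serving as infrastructure for Chinese AIGC and cognitive intelligence.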
LLaMA: Open and Efficient Foundation Language Models
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
🔊 Text-Prompted Generative Audio Model
PyTorch code and models for the DINOv2 self-supervised learning method.
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
🥣 Visual prompt editor for AIGC | OPS | Open Prompt Studio
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything