Peking University
Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
🔥🔥First-ever hour-scale video understanding models
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Tool for automating common video key-frame extraction, video compression and Image Auto-crop/Image-resize tasks
Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
[CVPR 2025 Highlight] Official implementation of "Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity"
✨✨Latest Advances on Multimodal Large Language Models
Example models using DeepSpeed
Official repository of the paper "Exploring What Why and How: A Multifaceted Benchmark for Causation Understanding of Video Anomaly"
[CVPR 2024] Official repository of the paper "Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly"
📈 Currently the largest industrial defect detection database and paper collection. Continuously summarizes important open-source datasets and key papers in the field of surface defect research.
Fast and memory-efficient exact attention
Enable macOS HiDPI and have a native setting.
Papers for Video Anomaly Detection, a collection of released code, and a performance comparison.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
This repository contains the NarrativeQA dataset. It includes the list of documents with Wikipedia summaries, links to full stories, and questions and answers.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
The newest solution for CS224n: Stanford NLP (assignment code implementations).
Chinese-language reinforcement learning tutorial (the Mushroom Book 🍄); read online at: https://datawhalechina.github.io/easy-rl/
[ICLR 2022] Official implementation of UniFormer
Official repository for the AAAI2025 paper (Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization through Spare-Cod…