Stars
A programmer's guide to cooking at home (Simplified Chinese only).
Generate high-definition short videos with one click using large AI models (LLMs).
Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. 🔔 Official updates only via twitter @Martin9…
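"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment.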
📹 A more flexible framework that can generate videos at any resolution and creates videos from images.
An open-source implementation for fine-tuning Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.
A PyTorch implementation of the YOLOV Series.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Connected components on discrete and continuous multilabel 3D & 2D images. Handles 26, 18, and 6 connected variants; periodic boundaries (4, 8, & 6)
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
🪄 Create rich visualizations with AI
Painter & SegGPT Series: Vision Foundation Models from BAAI
[CVPR22] Official Implementation of DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation
[ACM MM 2022] Official Rail-DB and Rail-Net
[AAAI 2025] DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation
NeuroNCAP benchmark for end-to-end autonomous driving
Contrastive unpaired image-to-image translation, with faster and lighter training than CycleGAN (ECCV 2020, in PyTorch)
[CVPR 2023] CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer
An intuitive approach for 3D Occupancy Detection
Unifying Voxel-based Representation with Transformer for 3D Object Detection (NeurIPS 2022)
[CVPR 2024] SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction
[T-PAMI] Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving
Code of "OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments".
[CVPR 2022] "MonoScene: Monocular 3D Semantic Scene Completion": 3D Semantic Occupancy Prediction from a single image
Chinese translation of Approaching (Almost) Any Machine Learning Problem. Online documentation: https://ytzfhqs.github.io/AAAMLP-CN/
Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection
Effortless data labeling with AI support from Segment Anything and other awesome models.