Stars
[NAACL 2024] Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models
LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive A…
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
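"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying domestic and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment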
Video-LLaVA fine-tune for CinePile evaluation
[AAAI 2025] Official implementation of "TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment"
LLaVA-UHD v2: an MLLM Integrating High-Resolution Semantic Pyramid via Hierarchical Window Transformer
❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119
CIS490: Project Management & Practice Final Project
mrseanryan / finetune_LLaVA
Forked from bdytx5/finetune_LLaVA
Fine tune LLaVA 1.5 - based on article by wandb
From scratch implementation of a vision language model in pure PyTorch
Instruction/chat prompts creation library for text generation LLMs. It supports local and Hugging Face models.
A 4-hour coding workshop to understand how LLMs are implemented and used
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Python code for part 2 of the book Causal Inference: What If, by Miguel Hernán and James Robins
[Arxiv-2024] MotionLLM: Understanding Human Behaviors from Human Motions and Videos
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
A collection of Jupyter notebook examples for using GeoAI
[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".
Source code for a Bilibili video course
This repository contains the Hugging Face Agents Course.
Embark on the "Reinforcement Learning from Human Feedback" course and align Large Language Models (LLMs) with human values.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.