fangfang11-plog (LihuangFang) / Starred · GitHub

Train transformer language models with reinforcement learning.

Python 14,147 1,956 Updated Jun 12, 2025

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Python 1,208 62 Updated Jul 17, 2024

[CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding

Python 107 1 Updated May 22, 2025

Collection of papers and resources on Multimodal Reasoning, including Vision-Language Models, Multimodal Chain-of-Thought, Visual Inference, and others.

5 2 Updated Jan 14, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,551 538 Updated May 3, 2024

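A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; a single 24G GPU is enough to produce a chat-llama2 with basic Chinese question-answering ability.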

Python 2,805 334 Updated May 21, 2024

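Welcome to LLM-Dojo, an open-source place for learning about large models. It uses concise, easy-to-read code to build a model training framework (supporting mainstream models such as Qwen, Llama, GLM, and others), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩‍🎓👨‍🎓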

Python 769 66 Updated Jun 4, 2025

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers.

HTML 7,962 865 Updated Apr 30, 2025

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

2,383 109 Updated Jun 4, 2025

Mental health large model (LLM x Mental Health), Pre & Post-training & Dataset & Evaluation & Deploy & RAG, with InternLM / Qwen / Baichuan / DeepSeek / Mixtral / LLama / GLM series models

Python 1,473 186 Updated May 18, 2025

Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA.

27 Updated Aug 2, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 52,152 6,300 Updated Jun 12, 2025

LLM&VLM Tutorial

Python 1,828 1,571 Updated May 5, 2025

Ridge SfM: Structure from Motion via robust pairwise matching under depth uncertainty

Python 116 15 Updated Aug 3, 2021

"Dive into LLMs" (动手学大模型) series of hands-on programming tutorials

Jupyter Notebook 5,603 515 Updated Jun 12, 2025

[ICASSP 2025] Diffusion Features to Bridge Domain Gap for Semantic Segmentation

Python 14 Updated Nov 21, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 1,177 80 Updated Jan 23, 2025

[AAAI 2025] DepthFM: Fast Monocular Depth Estimation with Flow Matching

Jupyter Notebook 582 40 Updated May 6, 2025

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Python 1,219 73 Updated Jan 22, 2025

[NeurIPS 2024] Geometry-Aware Large Reconstruction Model for Efficient and High-Quality 3D Generation

Python 169 8 Updated Sep 30, 2024

[ECCV2024] Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding

Python 119 6 Updated Jul 2, 2024

[CVPR-2024] Pytorch implementation of "Misalignment-Robust Frequency Distribution Loss for Image Transformation"

Python 49 3 Updated Sep 26, 2024

[NeurIPS 2024] MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing

Python 117 2 Updated Nov 9, 2024
Python 378 25 Updated Jun 6, 2024

[CVPR2024] VideoBooth: Diffusion-based Video Generation with Image Prompts

Python 296 11 Updated Jun 9, 2024
Python 26 Updated Jul 5, 2024

[CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Python 1,569 156 Updated Mar 27, 2025

The official PyTorch implementation of the paper "Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation".

Python 262 50 Updated Jul 23, 2023

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 3,878 571 Updated Apr 24, 2024

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Python 567 43 Updated Jun 7, 2024