Stars
R1-onevision, a visual language model capable of deep CoT reasoning.
A brief and partial summary of RLHF algorithms.
VLM2-Bench [ACL 2025 Main]: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues
Paper List of Inference/Test Time Scaling/Computing
Building a comprehensive and handy list of papers for GUI agents
Exploring common-sense reasoning capabilities of text-to-image models through pronoun disambiguation
A list of Text-to-Video and Image-to-Video works
The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
A nanoGPT pipeline packed in a spreadsheet
A curated list of papers and resources based on "Large Language Models on Graphs: A Comprehensive Survey" (TKDE)
DSPy: The framework for programming—not prompting—language models
✨✨Latest Advances on Multimodal Large Language Models
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
A curated list of reinforcement learning with human feedback resources (continually updated)
Use Prompts and Chains to turn ChatGPT into an amazing productivity tool! Unlocking the power of LLMs.
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
A trend that started with "Chain of Thought Prompting Elicits Reasoning in Large Language Models".
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…