yuh-zha / Starred · GitHub

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

651 26 Updated Jul 4, 2025

Tool for n-gram overlap analysis between test and training sequences

Jupyter Notebook 9 Updated Oct 17, 2023

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.

Python 1,188 177 Updated Jul 7, 2025

Multimodal Large Language Models for Code Generation under Multimodal Scenarios

98 4 Updated Jul 8, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 53,852 6,591 Updated Jul 8, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,934 225 Updated Jul 7, 2025

This repository provides a reference for researchers in multimodality and a starting point for exploring RL-based reasoning MLLMs.

974 44 Updated Jun 18, 2025

Witness the aha moment of VLM with less than $3.

Python 3,826 290 Updated May 19, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,989 1,493 Updated Apr 24, 2025

Simple RL training for reasoning

Python 3,671 273 Updated Apr 10, 2025

Columbia Robot Studio Project Implementation: Code, CAD, 3mf, etc

Python 14 1 Updated Jun 19, 2025

Get your documents ready for gen AI

Python 33,929 2,258 Updated Jul 8, 2025

List of Computer Science courses with video lectures.

69,316 9,339 Updated Jul 2, 2025

A curated list of foundation models for vision and language tasks

1,046 52 Updated Jun 23, 2025

A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.

Python 558 50 Updated Jun 12, 2025

PyTorch implementation of Twelve Labs' Video Foundation Model evaluation framework & open embeddings

Python 27 Updated Aug 23, 2024

Optimized primitives for collective multi-GPU communication

C++ 3,844 955 Updated Jun 18, 2025

Examples and guides for using the OpenAI API

MDX 65,176 10,808 Updated Jul 8, 2025

A collection of papers from the continuing line of work that started with World Models.

175 5 Updated Jul 6, 2024

A curated list of awesome self-supervised learning methods in videos

143 5 Updated May 4, 2025

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 15,040 1,273 Updated May 23, 2024

Pandora: Towards General World Model with Natural Language Actions and Video States

Python 504 34 Updated Sep 23, 2024

Code for FLAVR: A fast and efficient frame interpolation technique.

Python 499 74 Updated May 7, 2024

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,956 114 Updated Jun 16, 2025

This is a list of awesome prototype-based papers for explainable artificial intelligence.

38 1 Updated Dec 12, 2022

Tracking and collecting papers/projects/others related to Segment Anything.

1,645 133 Updated Mar 13, 2025

Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.

Python 562 52 Updated May 30, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

2,482 113 Updated Jun 20, 2025