laserwave (zhikaizhang) / Starred · GitHub
  • horizon robotics
  • nanjing, china


Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2

Jupyter Notebook 2,246 237 Updated May 26, 2025

Official repo for CFG-Zero*

Python 575 20 Updated May 2, 2025

A family of versatile, state-of-the-art video tokenizers.

Python 392 19 Updated Apr 5, 2025

VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE

Python 327 7 Updated Jan 19, 2025

Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various r…

Python 280 15 Updated Mar 12, 2025

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python 495 35 Updated Feb 3, 2025

DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support

Python 593 54 Updated Mar 22, 2024

A curated list of Diffusion Model in RL resources (continually updated)

1,181 61 Updated Feb 15, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 11,958 1,382 Updated May 27, 2025

Code for Scaling Language-Free Visual Representation Learning (WebSSL)

245 2 Updated Apr 24, 2025

Awesome RL Reasoning Recipes ("Triple R")

618 33 Updated Jun 4, 2025

GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning

Python 138 5 Updated May 21, 2025

Dream 7B, a large diffusion language model

Python 724 30 Updated May 31, 2025

[NeurIPS 2024 Datasets and Benchmarks Track] Closed-Loop E2E-AD Benchmark Enhanced by World Model RL Expert

Python 1,425 83 Updated Feb 18, 2025

Understanding R1-Zero-Like Training: A Critical Perspective

Python 965 45 Updated May 24, 2025

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…

Python 13,661 915 Updated May 26, 2025

The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".

Python 173 20 Updated Mar 28, 2025

Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

Python 2,089 626 Updated Aug 9, 2023

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Async Agent RL)

Python 6,953 676 Updated Jun 4, 2025

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 2,913 187 Updated May 19, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,065 309 Updated May 11, 2025

Witness the aha moment of VLM with less than $3.

Python 3,721 286 Updated May 19, 2025

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

Python 485 22 Updated Jan 13, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 6,460 553 Updated Jun 4, 2025

Fully open reproduction of DeepSeek-R1

Python 24,662 2,281 Updated Jun 2, 2025

Official PyTorch implementation of "VidToMe: Video Token Merging for Zero-Shot Video Editing" (CVPR 2024)

Python 219 13 Updated Jan 22, 2025

Python code for ICLR 2022 spotlight paper EViT: Expediting Vision Transformers via Token Reorganizations

Python 185 22 Updated Sep 3, 2023

A method to increase the speed and lower the memory footprint of existing vision transformers.

Python 1,061 72 Updated Jun 17, 2024

Official repository of the xLSTM.

Python 1,882 144 Updated May 28, 2025