SeuTao (Tao Shen) / Starred · GitHub
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning

Python 40 2 Updated May 18, 2025

Autoregressive Image Generation with Randomized Parallel Decoding

Python 61 Updated Apr 1, 2025

[CVPR 2025 (Oral)] Open implementation of "RandAR"

Python 143 3 Updated Mar 20, 2025
Python 445 12 Updated May 18, 2025

Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources

Python 200 7 Updated May 17, 2025

PyTorch - FID calculation with proper image resizing and quantization steps [CVPR 2022]

Python 1,067 72 Updated Jul 23, 2024

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Python 586 35 Updated Oct 16, 2024
Python 178 10 Updated May 14, 2025

Official repo for: SuperEdit - Rectifying and Facilitating Supervision for Instruction-Based Image Editing

Python 121 8 Updated May 15, 2025

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 894 22 Updated May 15, 2025

Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training".

Python 119 7 Updated May 12, 2025

DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.

TypeScript 10,636 987 Updated May 18, 2025

An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 546 19 Updated May 18, 2025

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Python 86 1 Updated Apr 8, 2025

Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"

Jupyter Notebook 240 9 Updated Apr 30, 2025

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 2,588 193 Updated May 16, 2025

Official implementation of Character Region Awareness for Text Detection (CRAFT)

Python 3,245 923 Updated Jul 16, 2024

Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM.

Python 94 4 Updated May 12, 2025

[NeurIPS 2023 & TPAMI] T2I-CompBench (++) for Compositional Text-to-image Generation Evaluation

Python 255 10 Updated Apr 10, 2025

Official repository of T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Python 285 16 Updated May 12, 2025

ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning

32 Updated Apr 6, 2025

RepText: Rendering Visual Text via Replicating 🔥

74 5 Updated May 1, 2025

Minimalistic 4D-parallelism distributed training framework for educational purposes

Python 1,487 101 Updated Mar 7, 2025
21 1 Updated Apr 15, 2025

Official PyTorch implementation of FlowMo.

Jupyter Notebook 58 4 Updated Apr 7, 2025

A SOTA open-source image editing model, which aims to provide comparable performance against the closed-source models like GPT-4o and Gemini 2 Flash.

Python 1,245 56 Updated May 13, 2025

MAGI-1: Autoregressive Video Generation at Scale

Python 3,036 164 Updated May 14, 2025

A Video Tokenizer Evaluation Dataset

Python 115 8 Updated Jan 13, 2025

New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos

Jupyter Notebook 7,966 512 Updated Apr 29, 2025
Python 483 28 Updated Apr 29, 2025