lonngxiang (dragon10) / Starred · GitHub
  • China

Starred repositories

GPT-4o-level, real-time spoken dialogue system.

Python 338 23 Updated Jan 27, 2025

A collection of MCP servers.

58,650 4,522 Updated Jun 29, 2025

Official repository for "VideoPrism: A Foundational Visual Encoder for Video Understanding" (ICML 2024)

Python 172 13 Updated Jun 28, 2025

[ICLR 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,546 67 Updated Jun 30, 2025

Virtual Machine for the Web

JavaScript 14,314 2,571 Updated Jun 18, 2025

Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across various modality combinations.

Python 213 11 Updated Jun 17, 2025

A lightweight LMM-based Document Parsing Model

Python 3,257 212 Updated Jun 30, 2025

From Images to High-Fidelity 3D Assets with Production-Ready PBR Material

Python 1,435 121 Updated Jun 26, 2025

Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. 🔔 Official updates only via twitter @Martin9…

Python 19,639 1,917 Updated Jun 28, 2025

✨✨VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model

Python 602 49 Updated May 24, 2025
Python 417 40 Updated May 6, 2025

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,217 247 Updated Jun 12, 2025

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 3,887 260 Updated Jun 21, 2025

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.

Python 2,467 184 Updated Jun 19, 2025

🚀 [Large Models] Train a 26M-parameter vision multimodal VLM from scratch in just 1 hour!

Python 3,929 395 Updated Apr 27, 2025
TypeScript 221 24 Updated Jun 29, 2025

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos

Jupyter Notebook 217 8 Updated Jun 26, 2025

Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM.

Python 357 23 Updated Jun 27, 2025

SkyReels-A2: Compose anything in video diffusion transformers

Python 617 58 Updated Jun 3, 2025

(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

Python 680 31 Updated May 17, 2025

MCP: Build Rich-Context AI Apps with Anthropic

Python 39 7 Updated Jun 25, 2025

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Python 595 20 Updated Jun 26, 2025

Get started with building Fullstack Agents using Gemini 2.5 and LangGraph

Jupyter Notebook 14,857 2,401 Updated Jun 18, 2025

Just another reasonably minimal repo for class-conditional training of pixel-space diffusion transformers.

Python 106 12 Updated May 29, 2025

The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning

Python 284 15 Updated May 31, 2025

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 16,449 903 Updated Jun 25, 2025

Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.

Python 734 108 Updated Jun 26, 2025

🤖 A visualization Model Context Protocol server for generating 25+ visual charts using @antvis.

TypeScript 1,607 152 Updated Jun 30, 2025