lonngxiang

dragon10 lonngxiang

Life is tough, keep fighting

28 followers · 66 following

China

Starred repositories

OpenMOSS / SpeechGPT-2.0-preview

GPT-4o-level, real-time spoken dialogue system.

Python 338 23 Updated Jan 27, 2025

punkpeye / awesome-mcp-servers

A collection of MCP servers.

58,650 4,522 Updated Jun 29, 2025

google-deepmind / videoprism

Official repository for "VideoPrism: A Foundational Visual Encoder for Video Understanding" (ICML 2024)

Python 172 13 Updated Jun 28, 2025

showlab / Show-o

[ICLR 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,546 67 Updated Jun 30, 2025

leaningtech / webvm

Virtual Machine for the Web

JavaScript 14,314 2,571 Updated Jun 18, 2025

ictnlp / Stream-Omni

Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across various modality combinations.

Python 213 11 Updated Jun 17, 2025

Yuliang-Liu / MonkeyOCR

A lightweight LMM-based Document Parsing Model

Python 3,257 212 Updated Jun 30, 2025

Tencent-Hunyuan / Hunyuan3D-2.1

From Images to High-Fidelity 3D Assets with Production-Ready PBR Material

Python 1,435 121 Updated Jun 26, 2025

Fosowl / agenticSeek

Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. 🔔 Official updates only via twitter @Martin9…

Python 19,639 1,917 Updated Jun 28, 2025

VITA-MLLM / VITA-Audio

✨✨VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model

Python 602 49 Updated May 24, 2025

maitrix-org / Voila

Python 417 40 Updated May 6, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,217 247 Updated Jun 12, 2025

MoonshotAI / Kimi-Audio

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 3,887 260 Updated Jun 21, 2025

MiniMax-AI / MiniMax-M1

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.

Python 2,467 184 Updated Jun 19, 2025

jingyaogong / minimind-v

🚀 Train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour! 🌏

Python 3,929 395 Updated Apr 27, 2025

gkamradt / MultiTerminalCodeViz

TypeScript 221 24 Updated Jun 29, 2025

Perceive-Anything / PAM

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos

Jupyter Notebook 217 8 Updated Jun 26, 2025

inclusionAI / Ming

Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM.

Python 357 23 Updated Jun 27, 2025

SkyworkAI / SkyReels-A2

SkyReels-A2: Compose anything in video diffusion transformers

Python 617 58 Updated Jun 3, 2025

tianweiy / CausVid

(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

Python 680 31 Updated May 17, 2025

davila7 / mcp-courses

MCP: Build Rich-Context AI Apps with Anthropic

Python 39 7 Updated Jun 25, 2025

PKU-YuanGroup / UniWorld-V1

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Python 595 20 Updated Jun 26, 2025

google-gemini / gemini-fullstack-langgraph-quickstart

Get started with building Fullstack Agents using Gemini 2.5 and LangGraph

Jupyter Notebook 14,857 2,401 Updated Jun 18, 2025

sayakpaul / nanoDiT

Just another reasonably minimal repo for class-conditional training of pixel-space diffusion transformers.

Python 106 12 Updated May 29, 2025

MiniMax-AI / One-RL-to-See-Them-All

The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning

Python 284 15 Updated May 31, 2025

anthropics / claude-code

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 16,449 903 Updated Jun 25, 2025

Skywork-ai / Skywork-Super-Agents

Python 42 9 Updated May 21, 2025

MiniMax-AI / MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.

Python 734 108 Updated Jun 26, 2025

SkyworkAI / DeepResearchAgent

Fluent 971 157 Updated Jun 30, 2025

antvis / mcp-server-chart

🤖 A visualization Model Context Protocol server for generating 25+ visual charts using @antvis.

TypeScript 1,607 152 Updated Jun 30, 2025
