Stars
pip install nb_log — a collection of logging handlers that also automatically transforms every print in a project. Logs are colorized automatically, and clicking a console log line in PyCharm jumps precisely to the emitting file and line number. File logging is multiprocess-safe for rotation. Surpasses loguru across the 10 most important dimensions.
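A minimal usage sketch, assuming the `get_logger` entry point and `log_filename` parameter from the project's README rather than a verified API surface:

```python
# Minimal nb_log sketch; get_logger and log_filename are assumed from the
# project's README, not a verified API reference.
from nb_log import get_logger

# File handler is described as rotation-safe across multiple processes.
logger = get_logger('demo', log_filename='demo.log')

logger.debug('colorized console output; clickable line in PyCharm')
logger.info('also written to demo.log')
```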
A powerful tool for creating fine-tuning datasets for LLMs
Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training
Qihoo360 / 360-LLaMA-Factory: forked from hiyouga/LLaMA-Factory; adds Sequence Parallelism to LLaMA-Factory
Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Qwen3 is the large language model series developed by the Qwen team at Alibaba Cloud.
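A minimal inference sketch via the Hugging Face transformers API; the checkpoint name and generation settings are illustrative, and a transformers version with Qwen3 support is assumed:

```python
# Minimal Qwen3 inference sketch via Hugging Face transformers.
# The checkpoint name is illustrative; any published Qwen3 size works the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Briefly explain mixture-of-experts."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```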
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, and of generating speech in real time.
The official Python SDK for Model Context Protocol servers and clients
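A minimal server sketch following the FastMCP pattern shown in the SDK's README; the `add` tool and greeting resource are toy examples:

```python
# Minimal MCP server sketch using the SDK's FastMCP helper
# (pattern from the modelcontextprotocol/python-sdk README).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

@mcp.resource("greeting://{name}")
def get_greeting(name: str) -> str:
    """Return a personalized greeting."""
    return f"Hello, {name}!"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```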
Model Context Protocol Servers
A quick-start guide to Model Context Protocol (MCP) programming
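For the client side, a sketch following the stdio client pattern from the same SDK; `server.py` here is assumed to be the toy server sketched above:

```python
# Minimal MCP client sketch (stdio transport), following the pattern in the
# modelcontextprotocol/python-sdk README. "server.py" is the toy server above.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(command="python", args=["server.py"])

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("tools:", [t.name for t in tools.tools])
            result = await session.call_tool("add", arguments={"a": 1, "b": 2})
            print("add(1, 2) ->", result.content)

asyncio.run(main())
```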
An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.…
Analyze computation-communication overlap in V3/R1.
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
🧑‍🚀 A summary of the world's best LLM resources, covering video generation, agents, coding assistance, data processing, model training, model inference, o1 models, MCP, small language models, and vision-language models.
Efficient Mixture of Experts for LLM Paper List
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient MLA decoding kernels
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
A PyTorch native platform for training generative AI models
Minimalistic large language model 3D-parallelism training
MoBA: Mixture of Block Attention for Long-Context LLMs
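The core idea is that each query attends only to a top-k subset of key/value blocks, with blocks scored by their mean-pooled keys against the query. A conceptual single-head sketch of that pattern, not the repository's optimized implementation (causal masking and batching omitted):

```python
import torch
import torch.nn.functional as F

def moba_style_attention(q, k, v, block_size=64, top_k=2):
    """Conceptual sketch of MoBA-style block-sparse attention: each query
    attends only to its top_k key/value blocks, where a block is scored by
    the dot product of the query with the block's mean-pooled key.
    Illustrative only; causal masking and batching are omitted."""
    seq, dim = k.shape
    n_blocks = seq // block_size                    # assumes seq % block_size == 0
    kb = k.view(n_blocks, block_size, dim)
    vb = v.view(n_blocks, block_size, dim)

    centroids = kb.mean(dim=1)                      # (n_blocks, dim) mean key per block
    gate = q @ centroids.T                          # (n_q, n_blocks) block relevance
    idx = gate.topk(top_k, dim=-1).indices          # (n_q, top_k) chosen blocks

    out = torch.empty_like(q)
    for i in range(q.shape[0]):                     # per-query gather, clarity over speed
        sel_k = kb[idx[i]].reshape(-1, dim)         # (top_k * block_size, dim)
        sel_v = vb[idx[i]].reshape(-1, dim)
        attn = F.softmax(q[i] @ sel_k.T / dim ** 0.5, dim=-1)
        out[i] = attn @ sel_v
    return out

# Toy shapes: 256 tokens, 32-dim single head.
q = torch.randn(256, 32); k = torch.randn(256, 32); v = torch.randn(256, 32)
print(moba_style_attention(q, k, v).shape)          # torch.Size([256, 32])
```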
Building DeepSeek R1 from Scratch