changingivan / Starred · GitHub

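pip install nb_log. Provides a variety of log handlers and automatically transforms the output of any print call in a project. Logs are automatically colorized, and clicking a log line in the console jumps to the exact file and line number in PyCharm. File log rotation is multiprocess-safe. Surpasses loguru across the board in the 10 most important aspects.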

Python 416 77 Updated May 19, 2025

A powerful tool for creating fine-tuning datasets for LLMs

JavaScript 6,606 689 Updated May 18, 2025

Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training

Python 208 9 Updated Aug 19, 2024
Python 696 47 Updated Apr 15, 2025

Adds sequence parallelism to LLaMA-Factory

Python 485 31 Updated May 13, 2025

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 569 35 Updated May 14, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,773 276 Updated May 15, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 21,321 1,405 Updated May 16, 2025

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 2,966 226 Updated May 19, 2025

The official Python SDK for Model Context Protocol servers and clients

Python 12,444 1,440 Updated May 19, 2025

Model Context Protocol Servers

JavaScript 47,237 5,329 Updated May 18, 2025

Model Context Protocol (MCP) programming quick start

1,757 105 Updated Apr 23, 2025

A concise list of MCP servers

665 43 Updated Mar 30, 2025

An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.…

Python 683 87 Updated Mar 13, 2025

NCCL Tests

Cuda 1,106 278 Updated May 7, 2025

Analyze computation-communication overlap in V3/R1.

1,034 142 Updated Mar 21, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,772 295 Updated Mar 10, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,377 170 Updated May 15, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,356 597 Updated May 19, 2025

🧑‍🚀 Summary of the world's best LLM resources (video generation, agents, coding assistance, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).

5,235 518 Updated May 19, 2025

Efficient Mixture of Experts for LLM Paper List

Python 66 3 Updated Dec 15, 2024

DeepEP: an efficient expert-parallel communication library

Cuda 7,668 769 Updated May 19, 2025

CUDA Templates for Linear Algebra Subroutines

C++ 7,549 1,235 Updated May 15, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,551 834 Updated Apr 29, 2025

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 12,122 1,229 Updated May 19, 2025

A PyTorch native platform for training generative AI models

Python 3,817 369 Updated May 19, 2025

Minimalistic large language model 3D-parallelism training

Python 1,875 191 Updated May 17, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,774 105 Updated Apr 3, 2025

mscp: transfer files over multiple SSH (SFTP) connections

C 171 16 Updated Apr 16, 2025

Building DeepSeek R1 from Scratch

Jupyter Notebook 598 94 Updated Mar 21, 2025