wdlctc (Cheng Luo) / Starred · GitHub

A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

Python 355 18 Updated May 19, 2025

minimal GRPO implementation from scratch

Python 90 11 Updated Mar 14, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,777 1,486 Updated Apr 24, 2025

A Lossless Compression Library for AI pipelines

Python 247 29 Updated Apr 27, 2025

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

1,483 48 Updated May 15, 2025

Python 20 2 Updated Feb 22, 2025

A Python library transfers PyTorch tensors between CPU and NVMe

C++ 115 25 Updated Nov 27, 2024

Mini versions of GPT2, LLama3, .. for pre-training

Python 2 Updated Jun 19, 2024

Everything about the SmolLM2 and SmolVLM family of models

Python 2,387 137 Updated Mar 31, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 49,225 5,993 Updated May 19, 2025

Python 46 6 Updated May 16, 2025

[ICLR 2025] DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Python 459 29 Updated Feb 10, 2025

[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623

Python 84 5 Updated Sep 26, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,922 197 Updated May 19, 2025

Python 50 3 Updated Oct 29, 2024

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook 753 67 Updated Mar 14, 2025

Linear Attention Sequence Parallelism (LASP)

Python 82 3 Updated Jun 4, 2024

VideoSys: An easy and efficient system for video generation

Python 1,963 129 Updated Mar 9, 2025

Development repository for the Triton language and compiler

MLIR 15,612 1,984 Updated May 20, 2025

Latency and Memory Analysis of Transformer Models for Training and Inference

Python 414 46 Updated Apr 19, 2025

RTP: Rethinking Tensor Parallelism with Memory Deduplication

Python 11 Updated Dec 15, 2023

Selfplay In MultiPlayer Environments

Python 321 105 Updated Jun 12, 2024

Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code

Python 592 196 Updated May 19, 2025

PyTorch layer-by-layer model profiler

Python 607 45 Updated May 23, 2021

A validation and profiling tool for AI infrastructure

Python 309 67 Updated May 19, 2025

An experimental parallel training platform

54 15 Updated Mar 25, 2024

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Python 14,483 2,254 Updated Apr 22, 2025

Ternary Gradients to Reduce Communication in Distributed Deep Learning (TensorFlow)

Python 182 48 Updated Nov 19, 2018