ftgreat (ldwang) / Starred · GitHub

CUDA on non-NVIDIA GPUs

Rust 11,915 754 Updated Jul 4, 2025

🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

Python 712 33 Updated Mar 19, 2025

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.

Python 1,371 172 Updated Jul 3, 2025

A scalable, end-to-end training pipeline for general-purpose agents

Python 173 23 Updated Jul 4, 2025

Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper

Python 666 34 Updated Jun 11, 2025

A course for learning CUDA from scratch

C++ 10 3 Updated Nov 3, 2024

Flash Dynamic Mask Attention

C++ 11 2 Updated Jul 5, 2025

PDF textbooks for all primary, middle, and high school grades, and for university.

Roff 43,717 9,764 Updated May 18, 2025

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 8,051 590 Updated Jan 3, 2025
Python 476 40 Updated Mar 13, 2025

An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.…

Python 711 91 Updated Mar 13, 2025

🍒 Cherry Studio is a desktop client that supports multiple LLM providers.

TypeScript 29,546 2,579 Updated Jul 6, 2025

Scaling RL on advanced reasoning models

Python 364 16 Updated Jul 2, 2025

adds Sequence Parallelism into LLaMA-Factory

Python 525 35 Updated Jul 1, 2025

slime is an LLM post-training framework aimed at scaling RL.

Python 538 33 Updated Jul 6, 2025

✔ (Completed) The most comprehensive deep learning notes [Tudui, PyTorch] [Mu Li, Dive into Deep Learning] [Andrew Ng, Deep Learning]

Jupyter Notebook 11,520 1,387 Updated Jun 23, 2025

NanoGPT (124M) in 3 minutes

Python 2,756 349 Updated Jun 20, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 42,578 7,120 Updated Dec 9, 2024

TransMLA: Multi-Head Latent Attention Is All You Need

Python 323 22 Updated Jul 4, 2025

A PyTorch tutorial on Conditional Flow Matching [Lipman22] using the MNIST dataset.

Jupyter Notebook 12 1 Updated Dec 29, 2024

Nano vLLM

Python 4,900 572 Updated Jun 27, 2025

Serverless LLM Serving for Everyone.

Python 498 49 Updated Jul 3, 2025

PyTorch code and models for VJEPA2 self-supervised learning from video.

Python 1,774 141 Updated Jul 2, 2025

Fast Semantic Text Deduplication & Filtering

Python 756 44 Updated May 27, 2025

Model Context Protocol Servers

TypeScript 57,920 6,697 Updated Jul 4, 2025

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

Python 13,214 1,421 Updated Jul 6, 2025

ACL 2025: Synthetic data generation pipelines for text-rich images.

Python 86 15 Updated Mar 1, 2025
Python 138 15 Updated Jul 21, 2024

Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs

Python 178 21 Updated Jun 21, 2025

An intelligent dataset construction tool for large language models (LLMs)

TypeScript 31 1 Updated Jun 27, 2025