xeon27 (Omkar Dige) / Starred · GitHub

GitHub's official MCP Server

Go 17,013 1,288 Updated Jul 8, 2025
Python 1,379 201 Updated Jun 26, 2025

Cadence is a distributed, scalable, durable, and highly available orchestration engine to execute asynchronous long-running business logic in a scalable and resilient way.

Go 8,759 845 Updated Jul 8, 2025

Here, I implement every component of a typical LLM architecture from scratch: from data preparation to multi-head self-attention modules to instruction fine-tuning of open-source models!

Python 6 Updated Apr 8, 2025
Python 20 2 Updated Apr 17, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,956 1,562 Updated Jul 8, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 39,249 4,456 Updated Jul 8, 2025

This is a book on the Phi family of SLMs for getting started with Phi models. Phi is a family of open-source AI models developed by Microsoft. Phi models are the most capable and cost-effective small langua…

Jupyter Notebook 3,395 431 Updated Jun 27, 2025

LLM inference in C/C++

C++ 82,733 12,294 Updated Jul 8, 2025

Official implementation (PyTorch) of "Inversion-based Latent Bayesian Optimization", NeurIPS 2024

Python 9 Updated Nov 15, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 53,843 6,589 Updated Jul 8, 2025

This repository houses supplementary code and Jupyter notebooks that accompany the AI Pocket Reference project.

Cuda 3 Updated Mar 31, 2025

An ML Systems Onboarding list

829 30 Updated Jan 24, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 3,321 368 Updated Jul 8, 2025

Tile primitives for speedy kernels

Cuda 2,503 160 Updated Jul 7, 2025

CUDA Kernel Benchmarking Library

Cuda 678 79 Updated Jul 8, 2025

CUDA Core Compute Libraries

C++ 1,736 234 Updated Jul 8, 2025

Making large AI models cheaper, faster and more accessible

Python 41,013 4,521 Updated Jul 4, 2025

NumPy & SciPy for GPU

Python 10,325 924 Updated Jul 7, 2025

cuGraph - RAPIDS Graph Analytics Library

Cuda 1,992 331 Updated Jul 8, 2025

cuML - RAPIDS Machine Learning Library

C++ 4,812 577 Updated Jul 8, 2025

cuDF - GPU DataFrame Library

C++ 9,028 956 Updated Jul 8, 2025

LLM training in simple, raw C/CUDA

Cuda 27,094 3,115 Updated Jun 26, 2025

A streamlined reference manual for AI practitioners, students, and developers to quickly look up core concepts and implementations.

Handlebars 25 7 Updated Jun 23, 2025

LongBench v2 and LongBench (ACL '25 & '24)

Python 924 93 Updated Jan 15, 2025

Fast and memory-efficient exact attention

Python 18,245 1,789 Updated Jul 6, 2025

🤗 smolagents: a barebones library for agents that think in code.

Python 21,028 1,833 Updated Jul 8, 2025

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents

Python 2,557 78 Updated Jul 8, 2025