Tabrizian (Iman Tabrizian) / Starred · GitHub
  • NVIDIA
  • Toronto, Canada

Organizations

@NVIDIA @nuxt-community @kubeflow @triton-inference-server


A holistic way of understanding how Llama and its components run in practice, with code and detailed documentation.

Go 304 15 Updated Aug 20, 2024

A Datacenter Scale Distributed Inference Serving Framework

Rust 3,976 352 Updated May 11, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,729 1,483 Updated Apr 24, 2025

KAIST CS420: Compiler Design

500 33 Updated Apr 3, 2025

Generative AI extensions for onnxruntime

C++ 708 182 Updated May 11, 2025

An autoregressive character-level language model for making more things

Python 3,058 784 Updated Jun 4, 2024

Neural Networks: Zero to Hero

Jupyter Notebook 13,709 1,905 Updated Aug 18, 2024

Package management made easy

Rust 4,383 279 Updated May 11, 2025

DSPy: The framework for programming—not prompting—language models

Python 24,143 1,858 Updated May 9, 2025

llama3.np is a pure NumPy implementation for Llama 3 model.

Python 980 80 Updated Apr 27, 2025

A VSCode extension to generate development environments using micromamba and conda-forge package repository

TypeScript 94 14 Updated Feb 11, 2025

CUDA checkpoint and restore utility

C 333 16 Updated Jan 27, 2025

A book about compiling Racket and Python to x86-64 assembly

TeX 1,412 150 Updated May 6, 2025

A Python framework for accelerated simulation, data generation and spatial computing.

Python 5,049 303 Updated May 11, 2025

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,822 107 Updated Jan 21, 2024

Development repository for the Triton language and compiler

MLIR 15,517 1,969 Updated May 12, 2025

Extending JAX with custom C++ and CUDA code

Python 395 23 Updated Aug 18, 2024

Enabling CPython multi-core parallelism via subinterpreters.

246 6 Updated Aug 19, 2022

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 9,626 912 Updated Jul 1, 2024

The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.

11,862 2,149 Updated Jan 8, 2025

Fast and memory-efficient exact attention

Python 17,300 1,675 Updated May 8, 2025

MLX: An array framework for Apple silicon

C++ 20,543 1,199 Updated May 11, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,458 1,419 Updated May 12, 2025

PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.

Python 790 53 Updated Feb 12, 2025

Utilities for using Python's PEP 554 subinterpreters

Python 121 8 Updated Nov 16, 2024

torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.

C++ 180 37 Updated Dec 13, 2024

High accuracy RAG for answering questions from scientific documents with citations

Python 7,301 715 Updated May 6, 2025

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

Jupyter Notebook 11,817 1,726 Updated Aug 8, 2024

Some notes on things I find interesting and important.

JavaScript 1,996 178 Updated May 11, 2025

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

C++ 1,885 186 Updated May 10, 2025