NVIDIA
- Toronto, Canada

Stars
A holistic way of understanding how Llama and its components run in practice, with code and detailed documentation.
A Datacenter Scale Distributed Inference Serving Framework
Minimal reproduction of DeepSeek R1-Zero
Generative AI extensions for onnxruntime
An autoregressive character-level language model for making more things
Neural Networks: Zero to Hero
DSPy: The framework for programming—not prompting—language models
llama3.np is a pure NumPy implementation of the Llama 3 model.
A VSCode extension to generate development environments using micromamba and the conda-forge package repository
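The core of any pure-NumPy Llama implementation is scaled dot-product attention with a causal mask. A minimal single-head sketch (shapes and function names here are illustrative assumptions, not llama3.np's actual code):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=-1, keepdims=True)

def causal_attention(q, k, v):
    """q, k, v: (seq_len, head_dim) arrays for a single head."""
    seq_len, head_dim = q.shape
    scores = q @ k.T / np.sqrt(head_dim)
    # Causal mask: position i may only attend to positions <= i.
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    return softmax(scores) @ v

rng = np.random.default_rng(0)
q = rng.standard_normal((5, 16))
k = rng.standard_normal((5, 16))
v = rng.standard_normal((5, 16))
out = causal_attention(q, k, v)
assert out.shape == (5, 16)
# The first position attends only to itself, so it reproduces v[0].
assert np.allclose(out[0], v[0])
```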
A book about compiling Racket and Python to x86-64 assembly
A Python framework for accelerated simulation, data generation and spatial computing.
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
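The idea that makes serving thousands of adapters feasible is that LoRA keeps the base weight W frozen and adds a cheap low-rank correction B @ A per adapter. A minimal NumPy sketch of that decomposition (shapes and names are illustrative assumptions, not S-LoRA's API):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 64, 8

W = rng.standard_normal((d_out, d_in))        # frozen base weight, shared by all adapters
A = rng.standard_normal((rank, d_in)) * 0.01  # adapter down-projection
B = np.zeros((d_out, rank))                   # adapter up-projection (zero-initialized)

x = rng.standard_normal(d_in)

# The expensive W @ x is computed once; each adapter only adds a
# low-rank correction B @ (A @ x), which is O(rank * d) extra work.
base = W @ x
y = base + B @ (A @ x)

# With B initialized to zero, the adapted output equals the base output.
assert np.allclose(y, base)
```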
Development repository for the Triton language and compiler
Enabling CPython multi-core parallelism via subinterpreters.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
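The BPE training loop itself fits in a few lines: repeatedly count adjacent token pairs over the byte stream and merge the most frequent pair into a new token. A minimal sketch of that loop (function names here are illustrative, not the repo's exact API):

```python
from collections import Counter

def get_pair_counts(ids):
    """Count occurrences of each adjacent token pair."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn `num_merges` merge rules over the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step                 # byte values occupy 0..255
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges
```

Encoding new text then replays the learned merges in order; decoding inverts the merge table back down to bytes.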
The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.
Fast and memory-efficient exact attention
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs.
PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
Utilities for using Python's PEP 554 subinterpreters
torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.
High accuracy RAG for answering questions from scientific documents with citations
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
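The essence of such a scalar autograd engine is a `Value` node that records its inputs and a local backward rule, then applies the chain rule in reverse topological order. A minimal sketch in that spirit (illustrative, not the repo's exact code):

```python
class Value:
    """A scalar that tracks the computation graph for backpropagation."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad   # d(a+b)/da = 1
            other.grad += out.grad  # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

a = Value(2.0)
b = Value(-3.0)
c = a * b + a          # c = -6 + 2 = -4
c.backward()
assert c.data == -4.0
assert a.grad == -2.0  # dc/da = b + 1
assert b.grad == 2.0   # dc/db = a
```

Gradients accumulate with `+=` so that a node used more than once (like `a` above) sums contributions from every path, matching PyTorch's semantics.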
Some notes on things I find interesting and important.
`std::execution`, the proposed C++ framework for asynchronous and parallel programming.