ved1beta (वेदांत) · GitHub
🍊 santra

Sponsors

@aman-17

ved1beta/README.md


Things I Do :)

  • Triton: writing custom Triton kernels for better performance, and working on some larger kernel projects
  • CUDA: studying CUDA architecture for a deeper understanding of kernels and Triton
  • Deep Learning: computer vision, NLP, and more :)

Technical Skills 🛠️

  • Languages: Python, CUDA, C++
  • Frameworks & Libraries: PyTorch, Pandas, Matplotlib, Triton, mpi4py
  • Tools & Platforms: GitHub, Docker, Vercel, Neovim, VS Code, Jupyter Notebook, AWS
  • Machine Learning: statistical analysis and predictive modeling (regression, decision trees, random forests) plus boosted and gradient-based methods (CatBoost, SGD), with a strong focus on optimization and accuracy.

Key Projects 📚

CUDA

  • GPU Sanghathan: small-scale distributed training of sequential deep learning models, built on NumPy and MPI.
  • Cuda writer: CUDA kernels written from scratch, from vector addition (vec_add) up to Flash Attention, plus model implementations from scratch.
  • Flash attention: an implementation of Flash Attention in Triton.
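
The GPU Sanghathan project above trains data-parallel replicas over NumPy and MPI. A minimal sketch of the core step — averaging gradients across workers, the job MPI's allreduce performs — can be shown in plain NumPy (the function name and worker setup here are illustrative, not the project's actual code):

```python
import numpy as np

def allreduce_mean(grads):
    """Average one gradient array across workers.

    grads: list of np.ndarray (one per worker, all the same shape).
    Returns the element-wise mean — the result every worker would
    receive from an MPI_Allreduce followed by a divide-by-world-size.
    """
    stacked = np.stack(grads)      # shape: (n_workers, *grad_shape)
    return stacked.mean(axis=0)    # reduce across the worker axis

# Each simulated worker computes a gradient on its own data shard,
# then all replicas apply the same averaged update.
worker_grads = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
avg = allreduce_mean(worker_grads)   # every replica gets [2.0, 3.0]
```

With real MPI this single call would be replaced by `mpi4py`'s `comm.Allreduce` so no worker has to gather all gradients in one place.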

Machine learning

  • Paligemma-Google: implemented Google's PaliGemma vision-language model from scratch, following the paper

  • Transformer: implemented the Transformer language model from scratch, following Google's paper

  • Mixture of Experts: a Mixture of Experts (MoE) model with a focus on efficient routing and expert utilization

  • Triton/CUDA kernels in my free time :)
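
The "efficient routing" piece of the MoE project above boils down to picking a few experts per token and weighting their outputs. A hedged NumPy sketch of top-2 routing (illustrative only — the project's actual router may differ):

```python
import numpy as np

def top2_route(logits):
    """Top-2 MoE routing sketch.

    logits: (tokens, n_experts) router scores.
    Returns (expert_ids, gates), both shaped (tokens, 2): the two
    chosen experts per token and their softmax-normalized mixing
    weights, so the selected experts' outputs can be combined.
    """
    expert_ids = np.argsort(logits, axis=-1)[:, -2:][:, ::-1]   # best expert first
    gate_logits = np.take_along_axis(logits, expert_ids, axis=-1)
    gates = np.exp(gate_logits - gate_logits.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)                  # sum to 1 per token
    return expert_ids, gates

router_logits = np.array([[0.1, 2.0, 0.3, 1.0]])  # one token, four experts
expert_ids, gates = top2_route(router_logits)      # experts 1 and 3 are chosen
```

Only the two selected experts run for each token, which is what keeps MoE compute sub-linear in the total number of experts.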

Connect with Me 📬

  • 🐦 Twitter
  • 📫 Email
  • 🔗 LinkedIn

I'm looking forward to collaborating on projects at the intersection of technology and social good. Let's connect! 🌍

Pinned

  1. intel/neural-compressor (Public)

    SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

    Python 2.4k 265

  2. vllm-project/llm-compressor (Public)

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python 1.3k 123

  3. bitsandbytes-foundation/bitsandbytes (Public)

    Accessible large language models via k-bit quantization for PyTorch.

    Python 7k 691

  4. GPU-sanghathan (Public)

    Small-scale distributed training of sequential deep learning models, built on NumPy and MPI.

    Python 3

  5. Paligemma (Public)

    Vision-language model

    Python 2

  6. Cuda_writer (Public)

    Distributed training

    Jupyter Notebook 1
