MartinForReal's list / ml/data · GitHub

Stars

ml/data

41 repositories

A curated list of references for MLOps

13,227 1,956 Updated Nov 21, 2024

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

Python 6,098 688 Updated Jul 10, 2025

lakeFS - Data version control for your data lake | Git for data

Go 4,770 381 Updated Jul 15, 2025

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala 8,148 1,873 Updated Jul 14, 2025

Data-Centric Pipelines and Data Versioning

Go 6,242 568 Updated Feb 3, 2025

Examples of using Neptune to keep track of your experiments (maintenance only).

Jupyter Notebook 26 13 Updated Mar 30, 2022

A low-latency prediction-serving system

C++ 1,416 281 Updated Apr 26, 2021

Lingvo

Python 2,845 450 Updated Jun 18, 2025

🦉 Data Versioning and ML Experiments

Python 14,661 1,234 Updated Jul 15, 2025

A Kubernetes-based framework for hassle-free handling of datasets

Go 522 70 Updated Jun 19, 2025

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, xDC replica…

Go 25,193 2,435 Updated Jul 15, 2025

JuiceFS is a distributed POSIX file system built on top of Redis and S3.

Go 11,914 1,054 Updated Jul 15, 2025
C 25 14 Updated May 19, 2021

For recording and retrieving metadata associated with ML developer and data scientist workflows.

C++ 651 164 Updated Apr 3, 2025

PyTorch extensions for high performance and large scale training.

Python 3,337 289 Updated Apr 26, 2025

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Python 14,229 1,825 Updated Jul 3, 2024

Alluxio, data orchestration for analytics and machine learning in the cloud

Java 7,029 2,947 Updated Apr 29, 2025

Spark RAPIDS plugin - accelerate Apache Spark with GPUs

Scala 910 256 Updated Jul 15, 2025

Open source platform for the machine learning lifecycle

Python 21,235 4,681 Updated Jul 15, 2025

Distributed ML Training and Fine-Tuning on Kubernetes

Python 1,847 790 Updated Jul 14, 2025

Resource scheduling and cluster management for AI

JavaScript 2,666 547 Updated Jun 6, 2024

NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions

Python 27 14 Updated Jul 15, 2025

A latent text-to-image diffusion model

Jupyter Notebook 71,138 10,485 Updated Jun 18, 2024

The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems.

Python 2,667 521 Updated Jul 11, 2025

Efficient vision foundation models for high-resolution generation and perception.

Python 2,981 229 Updated Apr 24, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,149 263 Updated Jul 10, 2025

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,407 275 Updated Jun 19, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 23,049 2,546 Updated Aug 12, 2024
Python 1,894 299 Updated Apr 19, 2024

Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models

Python 2,777 374 Updated Jan 7, 2025