Starred repositories
Fully open reproduction of DeepSeek-R1
🦜🔗 Build context-aware reasoning applications
Proxy solution to run elegant Web UIs or interact with LLMs natively inside Databricks notebooks.
Software design principles for machine learning applications
Baseline for Databricks Labs projects written in Python
200+ detailed flashcards useful for reviewing topics in machine learning, computer vision, and computer science.
🦖 Learn about LLMs, LLMOps, and vector DBs for free by designing, training, and deploying a real-time financial advisor LLM system ~ source code + video & reading materials
PySpark methods to enhance developer productivity 📣 👯 🎉
Explain complex systems using visuals and simple terms. Helps you prepare for system design interviews.
This repo provides a customizable stack for starting new ML projects on Databricks that follow production best-practices out of the box.
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…
Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collec…
NannyML: post-deployment data science in Python
Examples and guides for using the OpenAI API
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
Warp is a modern, Rust-based terminal with AI built in so you and your team can build great software, faster.
Source code for Twitter's Recommendation Algorithm
Source code for Twitter's Recommendation Algorithm
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc.), AS OF joins, downsampling, and interpolation
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform