Stars
In-depth tutorials on LLMs, RAG, and real-world AI agent applications.
Implementation of a Kafka Cleansing, Validation, Enrichment service based on KStreams.
Fabric toolbox is a repository of tools, accelerators, scripts, and samples to accelerate your success with Microsoft Fabric, brought to you by Fabric CAT.
Baseline for Databricks Labs projects written in Python
Official repository for the book Time Series Forecasting with Foundation Models
An SDK for working with LLMs and AI Agents from Apache Airflow, based on Pydantic AI
You are free to use this to generate data for testing. Attribution to in/JosueBogran for the dataset with a link to the generator is appreciated!
Fabric Python Notebooks examples
Code and Slides
Data for custom Contoso database generator V2
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
This is a repo with links to everything you'd ever want to learn about data engineering
⚡️Wren AI is your GenBI Agent that lets you query any database with natural language → get accurate SQL (Text-to-SQL), charts (Text-to-Charts) & AI-generated insights in seconds.
This repository contains the Hugging Face Agents Course.
A sophisticated exploration of dbt macro capabilities, pushing the boundaries of what's possible with dbt's macro system.
Minimal reproduction of DeepSeek R1-Zero
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
Local Environment to Practice Data Engineering
Source code and data for the paper "SALT: Sales Autocompletion Linked Business Tables Dataset"
Implementation of the deep learning models with training and evaluation pipelines described in the paper "PORTAL: Scalable Tabular Foundation Models via Content-Specific Tokenization"