etl
⚡ Fetching and realtime data exchange framework.
PyGWalker: Turn your dataframe into an interactive UI for visual analysis
⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 600+ plugins. Alternative to Airflow, n8n, Rundeck, VMware vRA, Zapier ...
Easily set up logical replication and switch over to a new database with minimal downtime
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
🦎 A multi-protocol edge & service proxy. Seamlessly interface web apps, IoT clients, & microservices to Apache Kafka® via declaratively defined, stateless APIs.
Upserts, Deletes And Incremental Processing on Big Data.
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
High-performance, low-footprint SQL database written in C++. Process millions of rows per second from Kafka/Pulsar, Iceberg, or ClickHouse, and seamlessly write results back. Supports powerful feat…
An open-source, low-code machine learning library in Python
chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
A portable accelerated data query and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
A flexible distributed key-value database that is optimized for caching and other realtime workloads.
ParadeDB is a modern Elasticsearch alternative built on Postgres. Built for real-time, update-heavy workloads.
A Python framework for defining and querying BI models in your data warehouse
Trench — Open-Source Analytics Infrastructure. A single production-ready Docker image built on ClickHouse, Kafka, and Node.js for tracking events, page views. Easily build product analytics dashboa…