Starred repositories
Lightweight, browser-based ROLAP pivot tables on top of DuckDB-WASM
PySpark methods to enhance developer productivity 📣 👯 🎉
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…
PySpark test helper methods with beautiful error messages
Interact with your SQL database, Natural Language to SQL using LLMs
Apache Wayang (incubating) is the first cross-platform data processing system.
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
Glamorous Toolkit is the Moldable Development Environment. It empowers you to make systems explainable through contextual micro tools.
A whitespace formatter for different query languages
Uses tokenized query returned by python-sqlparse and generates query metadata
Achieving confident refactoring through experimentation with Python 2.7 & 3.3+
SQL Lineage Analysis Tool powered by Python
Data Lineage Tracking And Visualization Solution
🐳 Tool to automate data quality checks on data pipelines
Compare tables within or across databases
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
PyGWalker: Turn your dataframe into an interactive UI for visual analysis
Relational database persistence for Pharo objects
The framework for developing sophisticated web applications in Smalltalk.
📖 An approachable introduction to Assembly.
Interactive tool to visualize table details & foreign key relationships in SQL databases in Glamorous Toolkit
Snowflake procs to build and roll over the date dimension in the data warehouse
Source files for A. Jesse Jiryu Davis's PyCon 2022 talk