Stars
Apache DataFusion Comet Spark Accelerator
A cross-platform way to express data transformations, relational algebra, and standardized record expressions and plans.
A composable and fully extensible C++ execution engine library for data management systems.
Upserts, Deletes And Incremental Processing on Big Data.
Janino is a super-small, super-fast Java™ compiler.
A blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark, Flink, and MapReduce.
Spark ClickHouse Connector built on the DataSourceV2 API
Flowchart for debugging Spark applications
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
A tool for monitoring and tuning Spark jobs for efficiency.
Firestorm is a Remote Shuffle Service that lets Apache Spark and Apache Hadoop MapReduce applications store shuffle data on remote servers.
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Now we have become very big, different from the original idea. A collection of premium software in various categories.
Apache Spark - A unified analytics engine for large-scale data processing
macOS command-line utility to limit max battery charge