-
celeborn Public
Forked from apache/celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
Java Apache License 2.0 Updated Apr 16, 2025 -
spark Public
Forked from apache/spark
Apache Spark - A unified analytics engine for large-scale data processing
-
hudi Public
Forked from apache/hudi
Upserts, Deletes And Incremental Processing on Big Data.
Java Apache License 2.0 Updated Sep 20, 2024 -
velox Public
Forked from facebookincubator/velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
C++ Apache License 2.0 Updated Apr 25, 2024 -
incubator-kyuubi Public
Forked from apache/kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
-
netty Public
Forked from netty/netty
Netty project - an event-driven asynchronous network application framework
Java Apache License 2.0 Updated Nov 17, 2023 -
compass Public
Forked from cubefs/compass
Compass is a task diagnosis platform for big data
Java Apache License 2.0 Updated Aug 24, 2023 -
spark-rapids Public
Forked from NVIDIA/spark-rapids
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
Scala Apache License 2.0 Updated Jan 9, 2023 -
Alink Public
Forked from alibaba/Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
Java Apache License 2.0 Updated May 10, 2022 -
incubator-seatunnel Public
Forked from apache/seatunnel
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
Java Apache License 2.0 Updated May 5, 2022 -
ClickHouse-Native-JDBC Public
Forked from housepower/ClickHouse-Native-JDBC
ClickHouse Native Protocol JDBC implementation
Java Apache License 2.0 Updated Mar 6, 2022 -
spark-clickhouse-connector Public
Forked from ClickHouse/spark-clickhouse-connector
Spark ClickHouse Connector built on DataSourceV2 API and gRPC protocol.
Scala Apache License 2.0 Updated Mar 6, 2022 -
spark-tfrecord Public
Forked from linkedin/spark-tfrecord
Read and write TensorFlow TFRecord data from Apache Spark.
-
Firestorm Public
Forked from Tencent/Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote servers
Java Other Updated Nov 2, 2021 -
hue Public
Forked from cloudera/hue
Open source SQL Query Assistant service for Databases/Warehouses
Python Apache License 2.0 Updated Oct 26, 2021 -
orc Public
Forked from apache/orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
HTML Apache License 2.0 Updated Sep 3, 2021 -
RemoteShuffleService Public
Forked from uber/RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
Java Other Updated Jul 27, 2021 -