Stars
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
A MULTI-GENERATOR ENSEMBLE FRAMEWORK FOR NATURAL LANGUAGE TO SQL
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
SuperSonic is the next-generation AI+BI platform that unifies Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms.
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
🦜🔗 Build context-aware reasoning applications
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Apache Spark to Apache Cassandra connector
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
The official home of the Presto distributed SQL query engine for big data
Apache Spark Connector for SQL Server and Azure SQL
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Upserts, Deletes And Incremental Processing on Big Data.
A project that provides a shell for talking to a Ratis server
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
A cross-platform way to express data transformation, relational algebra, standardized record expression and plans.
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
A composable and fully extensible C++ execution engine library for data management systems.
Set of Kubernetes solutions for reusing idle resources of nodes by running extra batch jobs
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shuffle data on remote servers
Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
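Latest 2023 roundup of technical interview questions from Alibaba, Tencent, Baidu, Meituan, Toutiao, and others, with answers and a compiled analysis from expert question setters.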
Apache Spark - A unified analytics engine for large-scale data processing