Stars
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
Service for automatically managing and cleaning up unreferenced data
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
A simple library for implementing common design patterns.
The Open Source Observability Distribution
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Roadmap to becoming a data engineer in 2021
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Move and resize windows on macOS with keyboard shortcuts and snap areas
Source code from Jake Wright's YouTube tutorials
Safety checks Python dependencies for known security vulnerabilities and suggests the proper remediations for vulnerabilities detected.
Example of Docker configuration for an entry at the BBVA Data & Analytics blog
Example Play Scala application showing a REST API
⭐️ Companies that don't have a broken hiring process
🙏 There are quite a few religions but none of them has a deity as cool as ours!
Curated list of project-based tutorials
Python script that converts XML to JSON or the other way around
Just a place to track issues and feature requests that I have for GitHub
🥑 ArangoDB is a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaSc…
An Open Source Machine Learning Framework for Everyone
Bridge between Slack and IRC channels allowing message filtering and logging while keeping communication public