Apache Spark docker image
Updated Apr 21, 2023 - Shell
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
A curated list of awesome Apache Spark packages and resources.
[PROJECT IS NO LONGER MAINTAINED] Wirbelsturm is a Vagrant- and Puppet-based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Ansible roles to install a Spark Standalone cluster (HDFS/Spark/Jupyter Notebook) or an Ambari-based Spark cluster
Easy CPU Profiling for Apache Spark applications
A .NET for Apache Spark docker image (3rdman/dotnet-spark)
Driver/Executor images for spark-operator
An image for running Jupyter notebooks and Apache Spark in the cloud on OpenShift
Production run of Apache Spark on Kubernetes
An implementation of Apache Spark (combined with PySpark and Jupyter Notebook) on top of a Hadoop cluster using Docker
Demo of running Apache Spark jobs using Tekton and s2i workflows
Sample Oozie workflow to test a Spark job. The workflow uses the Shell action to call a shell script, which in turn invokes the Spark Pi example job.
Sparkler Crawl Environment - a packaged, dockerized version of http://github.com/USCDataScience/sparkler.git
Create an n-node cluster and run a Spark job on Docker
Host files and procedure for running Fink on Kubernetes
This is the material for the 2019 Silicon Valley Code Camp Session "Realish Time Predictive Analytics with Spark Structured Streaming"
Apache Spark cluster with Docker Swarm, Prometheus, and cAdvisor
Created by Matei Zaharia
Released May 26, 2014