Run Spark on Ray
This example demonstrates how to run a simple data processing job with RayDP, a library for running Spark on Ray.
Install the Anyscale CLI
pip install -U anyscale
anyscale login
Submit the job.
Clone the example from GitHub.
git clone https://github.com/anyscale/examples.git
cd examples/spark_on_ray
Submit the job.
anyscale job submit -f job.yaml
Understanding the example
- This example is intentionally minimal and uses only basic Spark APIs. Reading from blob stores such as S3 requires additional configuration.
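
For orientation, a RayDP job of this kind might look roughly like the sketch below. It is not the script shipped in the repository: the app name, executor sizes, and the `s3a_spark_configs` helper are hypothetical and chosen only for illustration. The sketch assumes `ray`, `raydp`, and `pyspark` are installed in the job's environment. The S3 settings shown are the standard Spark and Hadoop S3A options, but the `hadoop-aws` version has to match the Hadoop build bundled with your PySpark, so it is left as a parameter.

```python
from typing import Dict, Optional


def s3a_spark_configs(hadoop_aws_version: str) -> Dict[str, str]:
    """Build extra Spark settings commonly needed to read s3a:// paths.

    Assumption: the caller supplies a hadoop-aws version that matches the
    Hadoop version bundled with the installed PySpark.
    """
    return {
        # Pull the S3A connector onto the Spark classpath at startup.
        "spark.jars.packages": f"org.apache.hadoop:hadoop-aws:{hadoop_aws_version}",
        # Route s3a:// URIs to the S3A filesystem implementation.
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
    }


def run_example(configs: Optional[Dict[str, str]] = None) -> int:
    """Start Spark on Ray through RayDP, run a tiny DataFrame job, shut down."""
    # Imported lazily so this module loads even where Ray or Spark is missing.
    import ray
    import raydp

    ray.init()
    spark = raydp.init_spark(
        app_name="spark_on_ray_sketch",  # hypothetical name
        num_executors=2,                 # illustrative sizes, not tuned
        executor_cores=1,
        executor_memory="1GB",
        configs=configs or {},
    )
    try:
        # Basic Spark API only: build a range DataFrame and count the even ids.
        df = spark.range(1000)
        return df.filter(df.id % 2 == 0).count()
    finally:
        # Release the Spark executors and the Ray connection even on failure.
        raydp.stop_spark()
        ray.shutdown()
```

Calling `run_example()` runs the basic in-memory job. Passing `run_example(s3a_spark_configs("<version>"))` would start Spark with the S3A connector available. Credentials are not configured here and would still need to come from the cluster's environment or further `spark.hadoop.fs.s3a.*` settings.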