.NET for Apache Spark™ Tutorial - Get started in 10 minutes

Install .NET for Apache Spark

Download the Microsoft.Spark.Worker release from the .NET for Apache Spark GitHub repository:

Download .NET for Apache Spark (v1.0.0)

Extract the Microsoft.Spark.Worker

  1. Locate the Microsoft.Spark.Worker.netcoreapp3.1.win-x64-1.0.0.zip file that you just downloaded.
  2. Right-click and select 7-Zip > Extract files.
  3. Enter C:\bin in the Extract to field.
  4. Uncheck the checkbox below the Extract to field.
  5. Select the OK button.

Install WinUtils

.NET for Apache Spark requires WinUtils to be installed alongside Apache Spark.

Download WinUtils

Once winutils.exe downloads, copy it into C:\bin\spark-3.0.1-bin-hadoop2.7\bin.

Set DOTNET_WORKER_DIR

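Run the following command to set the DOTNET_WORKER_DIR environment variable, which .NET apps use to locate the Microsoft.Spark.Worker binaries for .NET for Apache Spark. A value set with setx takes effect in newly opened command prompt windows, not in the current one.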

Command prompt
setx DOTNET_WORKER_DIR "C:\bin\Microsoft.Spark.Worker-1.0.0"

Finally, double-check that you can run spark-shell from your command line before you move to the next section. Press CTRL+D to quit Spark.

Download .NET for Apache Spark

.NET for Apache Spark is downloaded as a .tar.gz file.

Download .NET for Apache Spark (v1.0.0)

Extract the Microsoft.Spark.Worker:

In your terminal, move to the folder that contains the file you just downloaded, then run the following command:

Terminal
tar xvf Microsoft.Spark.Worker.netcoreapp3.1.linux-x64-1.0.0.tar.gz --directory ~/bin
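The tar command above expects the target folder to exist already, so on a fresh machine without ~/bin the extraction may fail. A minimal sketch of one way to guard against that is shown here; the function name extract_worker is hypothetical and not part of .NET for Apache Spark.

```shell
# Hypothetical helper: create the target folder if needed, then extract into it.
extract_worker() {
  archive="$1"   # path to the downloaded .tar.gz file
  target="$2"    # folder to extract into, for example "$HOME/bin"
  mkdir -p "$target" || return 1
  tar xvf "$archive" --directory "$target"
}

# Example call, assuming the archive is in the current folder:
# extract_worker Microsoft.Spark.Worker.netcoreapp3.1.linux-x64-1.0.0.tar.gz "$HOME/bin"
```

Using "$HOME/bin" rather than a quoted "~/bin" in the call matters, because the shell does not expand a tilde inside quotes.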

Set DOTNET_WORKER_DIR

Run the following command to set the DOTNET_WORKER_DIR environment variable, which .NET apps use to locate the Microsoft.Spark.Worker binaries for .NET for Apache Spark.

Terminal
export DOTNET_WORKER_DIR="$HOME/bin/Microsoft.Spark.Worker-1.0.0"
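Before moving on, it can help to confirm that DOTNET_WORKER_DIR points at a folder that actually exists. The following is a minimal sketch of such a check; check_worker_dir is a hypothetical helper name, and the check only tests for the folder, not for its contents.

```shell
# Hypothetical helper: report whether the worker folder is set and exists.
# With no argument, it inspects the DOTNET_WORKER_DIR environment variable.
check_worker_dir() {
  dir="${1:-${DOTNET_WORKER_DIR:-}}"
  if [ -z "$dir" ]; then
    echo "DOTNET_WORKER_DIR is not set"
    return 1
  fi
  if [ ! -d "$dir" ]; then
    echo "Not a directory: $dir"
    return 1
  fi
  echo "OK: $dir"
}

# Example call:
# check_worker_dir
```

A value that still contains a literal ~ (which happens when the tilde is written inside quotes) is reported as not a directory, because the shell does not expand it. Note also that export only affects the current shell session; adding the export line to a startup file such as ~/.bashrc (assuming bash is your shell) keeps it across sessions.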