Install SystemML


Apache Spark 2.x

Set SPARK_HOME to a location where Spark 2.x is installed.
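
A quick way to confirm the variable is set before going further is a small check like the one below. This is a minimal sketch, not part of SystemML: the helper name `check_spark_home` is hypothetical, and it assumes a usable Spark install contains a `bin/spark-submit` launcher.

```python
import os
from pathlib import Path


def check_spark_home(env=None):
    """Return the Spark directory named by SPARK_HOME, or raise if it looks wrong."""
    env = os.environ if env is None else env
    value = env.get("SPARK_HOME")
    if not value:
        raise RuntimeError("SPARK_HOME is not set")
    home = Path(value)
    # Assumption: a usable Spark install ships a bin/spark-submit launcher.
    if not (home / "bin" / "spark-submit").is_file():
        raise RuntimeError(f"no bin/spark-submit found under {home}")
    return home
```

Calling `check_spark_home()` with no arguments reads the current process environment; passing a dict makes the check easy to try out without changing your shell.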


Python

1) Install SystemML:
pip install systemml
2) For more information, see the SystemML project documentation.
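
After the pip step, the Python package can be smoke-tested with the same hello-world script used in the Scala instructions. The sketch below assumes `pyspark` is importable alongside `systemml`; the helper names `is_installed` and `hello_systemml` are illustrative only.

```python
import importlib.util


def is_installed(module_name):
    """True if the named top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


def hello_systemml():
    """Run a one-line DML script through the SystemML Python MLContext API."""
    # Imported lazily so this file still loads when the packages are missing.
    from pyspark.sql import SparkSession
    from systemml import MLContext, dml

    spark = SparkSession.builder.getOrCreate()
    ml = MLContext(spark)
    ml.execute(dml("print('hello world')"))
```

If `is_installed("pyspark") and is_installed("systemml")` is true, calling `hello_systemml()` should run the script on a local Spark session; otherwise, revisit the pip step first.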
Scala

1) Download Apache SystemML binary release (tgz or zip):
2) Extract binary release contents:
tar -xvzf systemml-1.0.0-bin.tgz
3) Go to project root directory:
cd systemml-1.0.0-bin
4) Start Spark Shell with SystemML jar file:
spark-shell --executor-memory 4G --driver-memory 4G --jars lib/systemml-1.0.0.jar
5) You're all set to run SystemML on Spark:
import org.apache.sysml.api.mlcontext._
import org.apache.sysml.api.mlcontext.ScriptFactory._
val ml = new MLContext(spark)
val helloScript = dml("print('hello world')")
ml.execute(helloScript)
6) For more information, see the SystemML project documentation.
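
The jar path passed to `--jars` depends on the release version, so it can help to locate it rather than type it. The following is a rough sketch under the assumption that the extracted release keeps the jar at `lib/systemml-<version>.jar`, as in the command in step 4; both helper names are hypothetical.

```python
from pathlib import Path


def find_systemml_jar(release_root):
    """Return the SystemML jar under <release_root>/lib, or None if absent."""
    # Assumption: the jar is named systemml-<version>.jar.
    matches = sorted(Path(release_root).glob("lib/systemml-*.jar"))
    return matches[0] if matches else None


def spark_shell_command(jar, memory="4G"):
    """Build the spark-shell argument list used in step 4."""
    return [
        "spark-shell",
        "--executor-memory", memory,
        "--driver-memory", memory,
        "--jars", str(jar),
    ]
```

The resulting list can be handed to `subprocess.run` from the release root directory, or simply printed to double-check the path before starting the shell by hand.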
Python (development build)

1) Install the Python development build of SystemML:
pip install
Scala (development build)

1) Download the binary development build of SystemML (tgz or zip):
2) Continue from step 2 of the Scala instructions above.

Configure Jupyter Notebook (Optional)

# Start Jupyter Notebook Server
PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark --master local[*] --conf "spark.driver.memory=12g" --conf spark.driver.maxResultSize=0 --conf spark.default.parallelism=100
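
The one-line command above packs two environment variables and three Spark settings together. The sketch below separates them so each piece is visible; it is an illustration only, and the function name `pyspark_notebook_launch` and its parameters are hypothetical.

```python
import os


def pyspark_notebook_launch(driver_memory="12g", parallelism=100, base_env=None):
    """Return (env, argv) equivalent to the one-line pyspark command above."""
    env = dict(os.environ if base_env is None else base_env)
    # Tell pyspark to start Jupyter as the driver front end instead of a plain REPL.
    env["PYSPARK_DRIVER_PYTHON"] = "jupyter"
    env["PYSPARK_DRIVER_PYTHON_OPTS"] = "notebook"
    argv = [
        "pyspark",
        "--master", "local[*]",  # run locally on all available cores
        "--conf", f"spark.driver.memory={driver_memory}",
        "--conf", "spark.driver.maxResultSize=0",  # 0 removes the result size limit
        "--conf", f"spark.default.parallelism={parallelism}",
    ]
    return env, argv
```

Passing `argv` as a list to `subprocess.run(argv, env=env)` avoids having the shell interpret `local[*]` as a glob pattern. The 12g driver memory is the value from the command above and may need lowering on smaller machines.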

1) Toree Kernel Setup (Required for Scala Kernel)

1.1) Toree Installation:
For detailed instructions, see the Apache Toree documentation.
pip install
1.2) Installation of Toree in Jupyter:
For detailed instructions, see the Apache Toree documentation.
jupyter toree install --replace --interpreters=Scala,PySpark --spark_opts="--master=local --jars <SystemML JAR File>" --spark_home=${SPARK_HOME}
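
Because the Toree install command nests one set of options inside a quoted `--spark_opts` value, it is easy to mistype when copied. One way to sidestep shell quoting is to assemble the arguments as a list, as in this sketch; the helper name `toree_install_args` is hypothetical, and the jar path and Spark home are placeholders you supply.

```python
def toree_install_args(systemml_jar, spark_home, master="local"):
    """Build the `jupyter toree install` argument list for a given jar path."""
    # The nested Spark options travel as a single argument, so no quoting is needed.
    spark_opts = f"--master={master} --jars {systemml_jar}"
    return [
        "jupyter", "toree", "install",
        "--replace",
        "--interpreters=Scala,PySpark",
        f"--spark_opts={spark_opts}",
        f"--spark_home={spark_home}",
    ]
```

The list can be passed directly to `subprocess.run`, with `spark_home` taken from the `SPARK_HOME` environment variable.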

2) Start Jupyter Notebook Server

jupyter notebook

This opens your default browser and shows the contents of the directory where the command was run. You can create your own notebook or download sample notebooks from the SystemML GitHub repository.
