Install SystemML

1. Pre-requisite

Apache Spark 2.x

Set SPARK_HOME to a location where Spark 2.x is installed.
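
Before continuing, it can help to confirm that SPARK_HOME is actually set and points at a Spark installation. The following is a minimal sketch, not part of SystemML: the helper name check_spark_home is hypothetical, and it assumes a standard Spark distribution layout in which bin/spark-submit exists under SPARK_HOME.

```python
import os
from pathlib import Path


def check_spark_home(env=None):
    """Return the Spark installation path if SPARK_HOME looks usable, else raise."""
    env = os.environ if env is None else env
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")
    root = Path(spark_home)
    # Assumption: a standard Spark distribution ships bin/spark-submit.
    if not (root / "bin" / "spark-submit").is_file():
        raise RuntimeError(f"No bin/spark-submit found under {root}")
    return root


if __name__ == "__main__":
    try:
        print("Spark found at", check_spark_home())
    except RuntimeError as err:
        print("Problem:", err)
```

If the check fails, set SPARK_HOME to the Spark 2.x installation directory and run it again.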

2. Setup

Python:
1) Install SystemML:
pip install systemml
2) For more information, please see the SystemML project documentation:
http://systemml.apache.org/docs/0.15.0/index.html
http://systemml.apache.org/docs/0.15.0/beginners-guide-python

Scala:
1) Download Apache SystemML binary release (tgz or zip):
http://www.apache.org/dyn/closer.lua/systemml/0.15.0/systemml-0.15.0-bin.tgz
2) Extract binary release contents:
tar -xvzf systemml-0.15.0-bin.tgz
3) Go to project root directory:
cd systemml-0.15.0-bin
4) Start Spark Shell with SystemML jar file:
spark-shell --executor-memory 4G --driver-memory 4G --jars lib/systemml-0.15.0.jar
5) You're all set to run SystemML on Spark:
import org.apache.sysml.api.mlcontext._
import org.apache.sysml.api.mlcontext.ScriptFactory._
val ml = new MLContext(spark)
val helloScript = dml("print('hello world')")
ml.execute(helloScript)
6) For more information, please see the SystemML project documentation:
http://systemml.apache.org/docs/0.15.0/index.html
http://systemml.apache.org/docs/0.15.0/spark-mlcontext-programming-guide

Python (development build):
1) Install Python development build of SystemML:
pip install https://sparktc.ibmcloud.com/repo/latest/systemml-1.0.0-SNAPSHOT-python.tar.gz

Scala (development build):
1) Download binary development build of SystemML (tgz or zip):
https://sparktc.ibmcloud.com/repo/latest/systemml-1.0.0-SNAPSHOT-bin.tgz
2) Follow the remaining steps (2 onward) of the Scala instructions above, using the development build file names.

3. Configure Jupyter Notebook (Optional)

# Start Jupyter Notebook Server
PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark --master local[*] --conf "spark.driver.memory=12g" --conf spark.driver.maxResultSize=0 --conf spark.default.parallelism=100

1) Toree Kernel Setup (Required for Scala Kernel)

1.1) Toree Installation:
For detailed instructions, visit https://github.com/apache/incubator-toree.
pip install https://dist.apache.org/repos/dist/dev/incubator/toree/0.2.0/snapshots/dev1/toree-pip/toree-0.2.0.dev1.tar.gz
1.2) Installation of Toree in Jupyter:
For detailed instructions, visit https://toree.apache.org/docs/current/user/installation.
jupyter toree install --replace --interpreters=Scala,PySpark --spark_opts="--master=local --jars <SystemML JAR File>" --spark_home=${SPARK_HOME}
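
The <SystemML JAR File> placeholder above stands for the path to the SystemML jar, for example lib/systemml-0.15.0.jar inside the extracted binary release. As a rough sketch, the path could be located as shown below. The helper name find_systemml_jar is hypothetical, and the code assumes the jar sits under lib/ with a name of the form systemml-<version>.jar, as in the 0.15.0 release.

```python
from pathlib import Path


def find_systemml_jar(release_dir):
    """Return the SystemML jar under <release_dir>/lib, or None if none is found."""
    lib_dir = Path(release_dir) / "lib"
    if not lib_dir.is_dir():
        return None
    # Assumption: the jar is named systemml-<version>.jar, e.g. systemml-0.15.0.jar.
    candidates = sorted(lib_dir.glob("systemml-*.jar"))
    return candidates[0].resolve() if candidates else None


if __name__ == "__main__":
    jar = find_systemml_jar("systemml-0.15.0-bin")
    print(jar if jar else "No SystemML jar found; check the release directory")
```

The printed absolute path can then be substituted for <SystemML JAR File> in the --jars option.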

2) Start Jupyter Notebook Server

jupyter notebook

This opens your default browser, showing the contents of the directory where the command was run. You can create your own notebook or download sample notebooks from the SystemML GitHub repository at https://github.com/apache/systemml/tree/master/samples/jupyter-notebooks.
