org.apache.sysml.api

Class MLMatrix

  • java.lang.Object
    • org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>
      • org.apache.sysml.api.MLMatrix
  • All Implemented Interfaces:
    Serializable

    Deprecated. 
    This will be removed in SystemML 1.0. Please migrate to MLContext.

    @Deprecated
    public class MLMatrix
    extends org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>
    Experimental API: might be discontinued in a future release.

    This class serves four purposes:
    1. It allows SystemML to fit nicely into an MLPipeline by reducing the number of reblocks.
    2. It allows users to easily read and write matrices without worrying too much about the format, metadata, and type of the underlying RDDs.
    3. It provides a mechanism to convert to and from MLlib's BlockMatrix format.
    4. It provides an off-the-shelf library for distributed blocked matrices and reduces the learning curve for using SystemML.

    However, it is important to know that it is easy to abuse this off-the-shelf library and treat it as a replacement for writing DML, which it is not. It does not provide any optimization between calls. A simple example of an optimization that is thereby skipped is: t(m) %*% m. Also, note that this library is not thread-safe. The operator precedence is not exactly the same as in DML (as precedence is enforced by the Scala compiler), so please use appropriate brackets to enforce precedence.

        import org.apache.sysml.api.{MLContext, MLMatrix}
        val ml = new MLContext(sc)
        val mat1 = ml.read(sparkSession, "V_small.csv", "csv")
        val mat2 = ml.read(sparkSession, "W_small.mtx", "binary")
        val result = mat1.transpose() %*% mat2
        result.write("Result_small.mtx", "text")
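    Since Scala, not DML, resolves operator precedence for MLMatrix expressions, explicit brackets are the safest way to state intent. The sketch below illustrates this with the same read/transpose/%*% calls shown in the example above; it assumes an existing SparkContext (sc), SparkSession (sparkSession), and input file, and is not runnable standalone.

        // Sketch only: uses the deprecated MLMatrix API from the example above.
        // Assumes sc (SparkContext) and sparkSession are already created and
        // that "V_small.csv" exists; these are illustrative placeholders.
        import org.apache.sysml.api.{MLContext, MLMatrix}

        val ml = new MLContext(sc)
        val m = ml.read(sparkSession, "V_small.csv", "csv")

        // Parenthesize explicitly: Scala's precedence rules for symbolic
        // operators like %*% may differ from DML's, so brackets make the
        // intended grouping unambiguous.
        val gram = (m.transpose()) %*% m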
    See Also:
    Serialized Form

Copyright © 2017 The Apache Software Foundation. All rights reserved.