org.apache.sysml.runtime.instructions.spark.utils

Class SparkUtils



  • public class SparkUtils
    extends Object
    • Field Detail

      • DEFAULT_TMP

        public static final org.apache.spark.storage.StorageLevel DEFAULT_TMP
    • Constructor Detail

      • SparkUtils

        public SparkUtils()
    • Method Detail

      • toIndexedMatrixBlock

        public static org.apache.sysml.runtime.matrix.mapred.IndexedMatrixValue toIndexedMatrixBlock(scala.Tuple2<MatrixIndexes,MatrixBlock> in)
      • toIndexedMatrixBlock

        public static org.apache.sysml.runtime.matrix.mapred.IndexedMatrixValue toIndexedMatrixBlock(MatrixIndexes ix,
                                                                                                     MatrixBlock mb)
      • fromIndexedMatrixBlock

        public static scala.Tuple2<MatrixIndexes,MatrixBlock> fromIndexedMatrixBlock(org.apache.sysml.runtime.matrix.mapred.IndexedMatrixValue in)
      • fromIndexedMatrixBlockToPair

        public static Pair<MatrixIndexes,MatrixBlock> fromIndexedMatrixBlockToPair(org.apache.sysml.runtime.matrix.mapred.IndexedMatrixValue in)
      • isHashPartitioned

        public static boolean isHashPartitioned(org.apache.spark.api.java.JavaPairRDD<?,?> in)
        Indicates if the input RDD is hash partitioned, i.e., it has a partitioner of type org.apache.spark.HashPartitioner.
        Parameters:
        in - input JavaPairRDD
        Returns:
        true if input is hash partitioned
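        The check described above amounts to asking whether a partitioner is present and whether it is of the hash-partitioner type. The following is a minimal, Spark-free sketch of that idea. SketchPartitioner, SketchHashPartitioner and HashPartitionCheckSketch are hypothetical stand-ins invented for illustration; they are not the Spark or SystemML classes. The non-negative modulo in getPartition reflects how hash partitioning is commonly implemented and is an assumption here, not taken from this page.

```java
import java.util.Optional;

// Hypothetical stand-in for a partitioner abstraction (not org.apache.spark.Partitioner).
interface SketchPartitioner {
    int numPartitions();
    int getPartition(Object key);
}

// Hypothetical hash partitioner: assigns a key by the non-negative modulo of its hash code.
final class SketchHashPartitioner implements SketchPartitioner {
    private final int numPartitions;

    SketchHashPartitioner(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    @Override
    public int numPartitions() {
        return numPartitions;
    }

    @Override
    public int getPartition(Object key) {
        // Null keys go to partition 0; floorMod keeps the result in [0, numPartitions).
        return (key == null) ? 0 : Math.floorMod(key.hashCode(), numPartitions);
    }
}

final class HashPartitionCheckSketch {
    // True only if a partitioner is present and it is a hash partitioner.
    static boolean isHashPartitioned(Optional<? extends SketchPartitioner> partitioner) {
        return partitioner.isPresent() && partitioner.get() instanceof SketchHashPartitioner;
    }
}
```

        In this sketch a data set without any partitioner is reported as not hash partitioned, which matches the "true if input is hash partitioned" contract above.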
      • getNumPreferredPartitions

        public static int getNumPreferredPartitions(MatrixCharacteristics mc,
                                                    org.apache.spark.api.java.JavaPairRDD<?,?> in)
      • copyBinaryBlockMatrix

        public static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> copyBinaryBlockMatrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> in)
        Creates a partitioning-preserving deep copy of the input matrix RDD, where the indexes and values are copied.
        Parameters:
        in - matrix as JavaPairRDD<MatrixIndexes,MatrixBlock>
        Returns:
        matrix as JavaPairRDD<MatrixIndexes,MatrixBlock>
      • copyBinaryBlockMatrix

        public static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> copyBinaryBlockMatrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> in,
                                                                                                             boolean deep)
        Creates a partitioning-preserving copy of the input matrix RDD. If a deep copy is requested, indexes and values are copied; otherwise, they are passed through unchanged.
        Parameters:
        in - matrix as JavaPairRDD<MatrixIndexes,MatrixBlock>
        deep - if true, perform deep copy
        Returns:
        matrix as JavaPairRDD<MatrixIndexes,MatrixBlock>
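        The difference between the deep and the pass-through variant can be illustrated without Spark. The sketch below uses a plain list in place of an RDD and a hypothetical mutable Block class in place of MatrixBlock; CopySketch and its members are illustrative names only and do not reflect how SparkUtils is implemented.

```java
import java.util.ArrayList;
import java.util.List;

final class CopySketch {
    // Hypothetical mutable stand-in for a matrix block.
    static final class Block {
        final double[] values;

        Block(double[] values) {
            this.values = values;
        }

        // Returns a block backed by its own copy of the value array.
        Block deepCopy() {
            return new Block(values.clone());
        }
    }

    // Copies the list element by element; with deep=false the same Block objects are reused.
    static List<Block> copy(List<Block> in, boolean deep) {
        List<Block> out = new ArrayList<>(in.size());
        for (Block b : in) {
            out.add(deep ? b.deepCopy() : b);
        }
        return out;
    }
}
```

        With deep set to false, a later in-place update of an input block is visible through the copy as well; with deep set to true, the copy is unaffected. That is the usual reason to request a deep copy before an operation that modifies blocks in place.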
      • getPrefixFromSparkDebugInfo

        public static String getPrefixFromSparkDebugInfo(String line)
      • getEmptyBlockRDD

        public static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> getEmptyBlockRDD(org.apache.spark.api.java.JavaSparkContext sc,
                                                                                                        MatrixCharacteristics mc)
        Creates an RDD of empty blocks according to the given matrix characteristics. To scale to large matrices, block ranges are parallelized and the empty blocks are generated in a distributed manner, taking preferred output partition sizes into account.
        Parameters:
        sc - Spark context
        mc - matrix characteristics
        Returns:
        pair RDD of empty matrix blocks
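        To make the idea of generating blocks from block ranges concrete, the sketch below expands a range of linear block ids into block descriptors on a single machine. It is a minimal illustration under stated assumptions: block indexes are taken to be 1-based, block ids are linearized in row-major order, and EmptyBlockSketch, BlockDesc, numBlocks, blockDim and describeRange are hypothetical names. The actual partitioning logic of getEmptyBlockRDD is not shown on this page and may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class EmptyBlockSketch {
    // Hypothetical descriptor of one empty block: 1-based block indexes and its dimensions.
    static final class BlockDesc {
        final long rowIndex;
        final long colIndex;
        final int rows;
        final int cols;

        BlockDesc(long rowIndex, long colIndex, int rows, int cols) {
            this.rowIndex = rowIndex;
            this.colIndex = colIndex;
            this.rows = rows;
            this.cols = cols;
        }
    }

    // Number of blocks needed to cover dim cells with blocks of size blockSize (ceiling division).
    static long numBlocks(long dim, int blockSize) {
        return (dim + blockSize - 1) / blockSize;
    }

    // Size of the block at 1-based position index; only the last block may be smaller.
    static int blockDim(long dim, int blockSize, long index) {
        return (int) Math.min(blockSize, dim - (index - 1) * blockSize);
    }

    // Expands the linear block ids in [from, to) into descriptors, assuming row-major order.
    static List<BlockDesc> describeRange(long rows, long cols, int brlen, int bclen,
                                         long from, long to) {
        long colBlocks = numBlocks(cols, bclen);
        List<BlockDesc> out = new ArrayList<>();
        for (long id = from; id < to; id++) {
            long r = id / colBlocks + 1;
            long c = id % colBlocks + 1;
            out.add(new BlockDesc(r, c, blockDim(rows, brlen, r), blockDim(cols, bclen, c)));
        }
        return out;
    }
}
```

        In a distributed setting, each task would receive only a small (from, to) pair and create its blocks locally, so no block data has to be shipped from the driver.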
      • computeMatrixCharacteristics

        public static MatrixCharacteristics computeMatrixCharacteristics(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixCell> input)
        Computes the dimensions and the number of non-zeros of a given RDD of binary cells.
        Parameters:
        input - matrix as JavaPairRDD<MatrixIndexes, MatrixCell>
        Returns:
        matrix characteristics
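        One way to think about this computation is as a reduction over cells with an associative merge step. The sketch below shows that idea on a plain list. It assumes that the dimensions are the largest 1-based row and column index seen and that a cell counts as non-zero when its value differs from 0; CharacteristicsSketch, Cell and Stats are hypothetical stand-ins, not the SystemML classes, and the real implementation may differ.

```java
import java.util.List;

final class CharacteristicsSketch {
    // Hypothetical stand-in for an indexed matrix cell with 1-based row and column indexes.
    static final class Cell {
        final long row;
        final long col;
        final double value;

        Cell(long row, long col, double value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    // Hypothetical result holder: dimensions and number of non-zero cells.
    static final class Stats {
        final long rows;
        final long cols;
        final long nonZeros;

        Stats(long rows, long cols, long nonZeros) {
            this.rows = rows;
            this.cols = cols;
            this.nonZeros = nonZeros;
        }

        // Associative and commutative merge, so partial results can be combined in any order.
        Stats merge(Stats other) {
            return new Stats(Math.max(rows, other.rows),
                             Math.max(cols, other.cols),
                             nonZeros + other.nonZeros);
        }
    }

    // Maps each cell to a one-cell Stats and folds them together with merge.
    static Stats compute(List<Cell> cells) {
        Stats acc = new Stats(0, 0, 0);
        for (Cell c : cells) {
            acc = acc.merge(new Stats(c.row, c.col, c.value != 0.0 ? 1L : 0L));
        }
        return acc;
    }
}
```

        Because merge is associative and commutative, the same map-then-merge pattern carries over to a distributed reduction across partitions.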

Copyright © 2017 The Apache Software Foundation. All rights reserved.