org.apache.sysml.runtime.matrix.data

Class LibMatrixMult



  • public class LibMatrixMult
    extends Object
    MB: Library for matrix multiplications including MM, MV, VV for all combinations of dense, sparse, and ultra-sparse representations, and for special operations such as transpose-self matrix multiplication. In general, all implementations internally use dense outputs for direct access, but change the final result to sparse if necessary. The only exceptions are ultra-sparse matrix mult, wsloss and wsigmoid.
    NOTES on BLAS:
    * Experiments in 04/2013 showed that even on dense-dense this implementation is 3x faster than f2j-BLAS-DGEMM, 2x faster than f2c-BLAS-DGEMM, and level (+10% after JIT) with a native C implementation.
    * Calling native BLAS would lose platform independence and would require JNI calls including data transfer. Furthermore, BLAS does not support sparse matrices (except Sparse BLAS, with dedicated function calls and matrix formats) and would be an external dependency.
    * Experiments in 02/2014 showed that on dense-dense this implementation now achieves almost 30% peak FP performance. Compared to Intel MKL 11.1 (dgemm, N=1000) it is just 3.2x (sparsity=1.0) and 1.9x (sparsity=0.5) slower, respectively.
    • Method Detail

      • matrixMult

        public static void matrixMult(MatrixBlock m1,
                                      MatrixBlock m2,
                                      MatrixBlock ret)
                               throws DMLRuntimeException
        Performs a matrix multiplication and stores the result in the output matrix. All variants use an IKJ access pattern and internally use a dense output. After the actual computation, the number of non-zeros (nnz) is recomputed and the result is checked for conversion between sparse and dense representation.
        Parameters:
        m1 - first matrix
        m2 - second matrix
        ret - result matrix
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
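To illustrate the IKJ access pattern and the nnz recomputation mentioned for matrixMult, here is a minimal sketch on plain row-major double arrays. It is not the library's implementation: the class IkjMatMultSketch and its methods are hypothetical names, and MatrixBlock is deliberately not used.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of LibMatrixMult.
final class IkjMatMultSketch {

    // Multiplies row-major a (m x n) by row-major b (n x p) into a new dense c (m x p).
    // Loop order is i, k, j: the innermost loop walks one row of b and one row of c
    // contiguously, which is the point of the IKJ pattern.
    public static double[] multiplyIkj(double[] a, double[] b, int m, int n, int p) {
        double[] c = new double[m * p];
        for (int i = 0; i < m; i++) {
            for (int k = 0; k < n; k++) {
                double aik = a[i * n + k];
                for (int j = 0; j < p; j++) {
                    c[i * p + j] += aik * b[k * p + j];
                }
            }
        }
        return c;
    }

    // Recomputes the number of non-zero cells of a dense result.
    public static long countNonZeros(double[] c) {
        long nnz = 0;
        for (double v : c) {
            if (v != 0) {
                nnz++;
            }
        }
        return nnz;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4}; // 2 x 2
        double[] b = {5, 6, 7, 8}; // 2 x 2
        double[] c = multiplyIkj(a, b, 2, 2, 2);
        System.out.println(Arrays.toString(c) + " nnz=" + countNonZeros(c));
    }
}
```

A caller would use the recomputed nnz to decide whether the dense result should be converted to a sparse format; the threshold for that decision is not stated on this page and is therefore left out of the sketch.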
      • matrixMult

        public static void matrixMult(MatrixBlock m1,
                                      MatrixBlock m2,
                                      MatrixBlock ret,
                                      boolean examSparsity)
                               throws DMLRuntimeException
        This method allows the caller to disable the sparsity examination of the result. This is useful if matrixMult is used as an intermediate operation (for example, in LibMatrixDNN). It makes sense for LibMatrixDNN because the output is consumed internally by another dense instruction, which makes repeated conversion to sparse wasteful. This variant should be used only in rare cases; if you are unsure, use 'matrixMult(MatrixBlock m1, MatrixBlock m2, MatrixBlock ret)' instead.
        Parameters:
        m1 - first matrix
        m2 - second matrix
        ret - result matrix
        examSparsity - if false, sparsity examination is disabled
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • matrixMult

        public static void matrixMult(MatrixBlock m1,
                                      MatrixBlock m2,
                                      MatrixBlock ret,
                                      int k)
                               throws DMLRuntimeException
        Performs a multi-threaded matrix multiplication and stores the result in the output matrix. The parameter k (k>=1) determines the max parallelism k' with k'=min(k, vcores, m1.rlen).
        Parameters:
        m1 - first matrix
        m2 - second matrix
        ret - result matrix
        k - maximum parallelism
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
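The bound k'=min(k, vcores, m1.rlen) from the multi-threaded matrixMult can be sketched as follows. The class and method names are hypothetical, and reading vcores from Runtime.availableProcessors() is an assumption made for this example; the page does not say how the library obtains it.

```java
// Hypothetical illustration only; not part of LibMatrixMult.
final class ParallelismSketch {

    // Returns k' = min(k, vcores, rlen) for a requested parallelism k >= 1.
    public static int effectiveParallelism(int k, int vcores, int rlen) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1");
        }
        return Math.min(k, Math.min(vcores, rlen));
    }

    public static void main(String[] args) {
        // Assumption: vcores is taken from the JVM's view of available processors.
        int vcores = Runtime.getRuntime().availableProcessors();
        System.out.println(effectiveParallelism(8, vcores, 3));
    }
}
```

With a 3-row left-hand matrix, for example, requesting k=8 never yields more than 3 workers, since there are only 3 rows to distribute.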
      • matrixMultChain

        public static void matrixMultChain(MatrixBlock mX,
                                           MatrixBlock mV,
                                           MatrixBlock mW,
                                           MatrixBlock ret,
                                           org.apache.sysml.lops.MapMultChain.ChainType ct)
                                    throws DMLRuntimeException
        Performs a matrix multiplication chain operation of type t(X)%*%(X%*%v) or t(X)%*%(w*(X%*%v)). All variants use an IKJ access pattern and internally use a dense output. After the actual computation, the number of non-zeros (nnz) is recomputed and the result is checked for conversion between sparse and dense representation.
        Parameters:
        mX - X matrix
        mV - v matrix
        mW - w matrix
        ret - result matrix
        ct - chain type
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
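The two chain types t(X)%*%(X%*%v) and t(X)%*%(w*(X%*%v)) can be evaluated in a single pass over the rows of X, because the result equals the sum over rows x_i of (w_i * (x_i . v)) * x_i. The sketch below shows that identity on plain arrays; the class and method names are hypothetical, a null w selects the unweighted form by convention of this example only, and the library's own kernels may be organized differently.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of LibMatrixMult.
final class MMChainSketch {

    // x is row-major (m x n), v has length n, w has length m or is null.
    // Returns t(X) %*% (X %*% v) if w is null, else t(X) %*% (w * (X %*% v)).
    public static double[] mmChain(double[] x, double[] v, double[] w, int m, int n) {
        double[] ret = new double[n];
        for (int i = 0; i < m; i++) {
            // Scalar s = x_i . v, optionally scaled by w_i.
            double s = 0;
            for (int j = 0; j < n; j++) {
                s += x[i * n + j] * v[j];
            }
            if (w != null) {
                s *= w[i];
            }
            // Accumulate s * x_i into the length-n result.
            for (int j = 0; j < n; j++) {
                ret[j] += s * x[i * n + j];
            }
        }
        return ret;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4}; // 2 x 2
        double[] v = {1, 1};
        System.out.println(Arrays.toString(mmChain(x, v, null, 2, 2)));
        System.out.println(Arrays.toString(mmChain(x, v, new double[]{2, 0}, 2, 2)));
    }
}
```

The single pass avoids materializing the intermediate vector X%*%v and reads each row of X only once.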
      • matrixMultChain

        public static void matrixMultChain(MatrixBlock mX,
                                           MatrixBlock mV,
                                           MatrixBlock mW,
                                           MatrixBlock ret,
                                           org.apache.sysml.lops.MapMultChain.ChainType ct,
                                           int k)
                                    throws DMLRuntimeException
        Performs a parallel matrix multiplication chain operation of type t(X)%*%(X%*%v) or t(X)%*%(w*(X%*%v)). The parameter k (k>=1) determines the max parallelism k' with k'=min(k, vcores, mX.rlen). NOTE: This multi-threaded mmchain operation has additional memory requirements of k*ncol(X)*8 bytes for partial aggregation. The current maximum is 256KB; if more would be required, execution is redirected to the sequential implementation.
        Parameters:
        mX - X matrix
        mV - v matrix
        mW - w matrix
        ret - result matrix
        ct - chain type
        k - maximum parallelism
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
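The memory note for the parallel matrixMultChain amounts to simple arithmetic: each of the k workers holds one partial result of ncol(X) doubles at 8 bytes each. The sketch below is a hypothetical helper, not library code; it assumes that 256KB means 256*1024 bytes and that the limit is inclusive, neither of which is stated on this page.

```java
// Hypothetical illustration only; not part of LibMatrixMult.
final class MMChainMemorySketch {

    // Assumption: "256KB" is interpreted as 256 * 1024 bytes.
    static final long MAX_PARTIAL_AGG_BYTES = 256L * 1024;

    // Bytes needed for k partial result vectors of ncolX doubles each.
    public static long partialAggBytes(int k, int ncolX) {
        return 8L * k * ncolX;
    }

    // True if the partial aggregates fit the budget (inclusive bound assumed).
    public static boolean fitsParallelBudget(int k, int ncolX) {
        return partialAggBytes(k, ncolX) <= MAX_PARTIAL_AGG_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(partialAggBytes(8, 1000) + " bytes, parallel ok: "
                + fitsParallelBudget(8, 1000));
        System.out.println(partialAggBytes(8, 10000) + " bytes, parallel ok: "
                + fitsParallelBudget(8, 10000));
    }
}
```

For k=8, a matrix with 1,000 columns needs 64,000 bytes and stays within the budget, while 10,000 columns need 640,000 bytes and would fall back to sequential execution under these assumptions.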
      • matrixMultWDivMM

        public static void matrixMultWDivMM(MatrixBlock mW,
                                            MatrixBlock mU,
                                            MatrixBlock mV,
                                            MatrixBlock mX,
                                            MatrixBlock ret,
                                            org.apache.sysml.lops.WeightedDivMM.WDivMMType wt)
                                     throws DMLRuntimeException
        NOTE: This operation has limited NaN support, which is acceptable because all our sparse-safe operations have only limited NaN support. If this is not intended behavior, please disable the rewrite. In detail, this operator will produce for W/(U%*%t(V)) a zero intermediate for each zero in W (even if UVij is zero which would give 0/0=NaN) but INF/-INF for non-zero entries in V where the corresponding cell in (Y%*%X) is zero.
        Parameters:
        mW - matrix W
        mU - matrix U
        mV - matrix V
        mX - matrix X
        ret - result matrix
        wt - weighted divide matrix multiplication type
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
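The limited NaN support described for matrixMultWDivMM follows from sparse-safe evaluation: only the non-zero cells of W are visited. The sketch below illustrates one reading of that note for W/(U%*%t(V)) on plain arrays, where uv stands for an already computed U%*%t(V). The class and method names are hypothetical, and the sketch is not the library's kernel.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of LibMatrixMult.
final class SparseSafeDivideSketch {

    // Computes w / uv cell-wise, visiting only the non-zero cells of w.
    // Both arrays must have the same length.
    public static double[] divide(double[] w, double[] uv) {
        double[] out = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            if (w[i] == 0) {
                continue; // output stays 0, even where 0/0 would be NaN
            }
            out[i] = w[i] / uv[i]; // non-zero / 0.0 yields +/-Infinity
        }
        return out;
    }

    public static void main(String[] args) {
        double[] w = {0, 2, -3, 4};
        double[] uv = {0, 0, 0, 2};
        System.out.println(Arrays.toString(divide(w, uv)));
    }
}
```

A plain cell-wise division would give NaN for the first cell (0/0), whereas the sparse-safe version leaves it at zero; the infinities in the second and third cells are produced in both versions.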
      • matrixMultWDivMM

        public static void matrixMultWDivMM(MatrixBlock mW,
                                            MatrixBlock mU,
                                            MatrixBlock mV,
                                            MatrixBlock mX,
                                            MatrixBlock ret,
                                            org.apache.sysml.lops.WeightedDivMM.WDivMMType wt,
                                            int k)
                                     throws DMLRuntimeException
        NOTE: This operation has limited NaN support, which is acceptable because all our sparse-safe operations have only limited NaN support. If this is not intended behavior, please disable the rewrite. In detail, this operator will produce for W/(U%*%t(V)) a zero intermediate for each zero in W (even if UVij is zero which would give 0/0=NaN) but INF/-INF for non-zero entries in V where the corresponding cell in (Y%*%X) is zero.
        Parameters:
        mW - matrix W
        mU - matrix U
        mV - matrix V
        mX - matrix X
        ret - result matrix
        wt - weighted divide matrix multiplication type
        k - maximum parallelism
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • dotProduct

        public static double dotProduct(double[] a,
                                        double[] b,
                                        int ai,
                                        int bi,
                                        int len)
      • dotProduct

        public static double dotProduct(double[] a,
                                        double[] b,
                                        int[] aix,
                                        int ai,
                                        int bi,
                                        int len)
      • vectMultiplyAdd

        public static void vectMultiplyAdd(double aval,
                                           double[] b,
                                           double[] c,
                                           int bi,
                                           int ci,
                                           int len)
      • vectMultiplyAdd

        public static void vectMultiplyAdd(double aval,
                                           double[] b,
                                           double[] c,
                                           int[] bix,
                                           int bi,
                                           int ci,
                                           int len)
      • vectMultiplyWrite

        public static void vectMultiplyWrite(double aval,
                                             double[] b,
                                             double[] c,
                                             int bi,
                                             int ci,
                                             int len)
      • vectAdd

        public static void vectAdd(double[] a,
                                   double[] c,
                                   int ai,
                                   int ci,
                                   int len)
      • copyUpperToLowerTriangle

        public static void copyUpperToLowerTriangle(MatrixBlock ret)
        Used for all versions of TSMM (transpose-self matrix multiplication), where the result is known to be symmetric. Hence, only the upper triangular part is computed, and this partial result is then copied once to the lower triangular part.
        Parameters:
        ret - matrix
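The copy step described for copyUpperToLowerTriangle can be sketched on a dense, row-major n x n array as follows. The class and method names are hypothetical; the actual method operates on a MatrixBlock, whose internals are not shown on this page.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of LibMatrixMult.
final class TriangleCopySketch {

    // Mirrors the strictly upper triangle of a row-major n x n matrix
    // into the lower triangle, in place. The diagonal is left unchanged.
    public static void copyUpperToLower(double[] c, int n) {
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                c[j * n + i] = c[i * n + j];
            }
        }
    }

    public static void main(String[] args) {
        // Upper triangle filled, lower triangle still zero.
        double[] c = {
            1, 2, 3,
            0, 4, 5,
            0, 0, 6
        };
        copyUpperToLower(c, 3);
        System.out.println(Arrays.toString(c));
    }
}
```

Computing only the upper triangle roughly halves the multiply-add work for a symmetric result, and the mirror step costs a single pass over about n*n/2 cells.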

Copyright © 2017 The Apache Software Foundation. All rights reserved.