org.apache.sysml.runtime.instructions.gpu.context

Class JCudaObject.CSRPointer

  • java.lang.Object
    • org.apache.sysml.runtime.instructions.gpu.context.JCudaObject.CSRPointer
  • Enclosing class:
    JCudaObject


    public static class JCudaObject.CSRPointer
    extends Object
    Compressed Sparse Row (CSR) format for CUDA. Generalized matrix multiply, among other operations, is implemented for the CSR format in the cuSPARSE library.
    • Field Detail

      • matrixDescriptor

        public static jcuda.jcusparse.cusparseMatDescr matrixDescriptor
      • nnz

        public long nnz
        Number of non zeroes
      • val

        public jcuda.Pointer val
        double array of non zero values
      • rowPtr

        public jcuda.Pointer rowPtr
        integer array of size (rows + 1) holding the start offset of every row, followed by one past the end of the last row
      • colInd

        public jcuda.Pointer colInd
        integer array of the column indices of the nnz non-zero values
      • descr

        public jcuda.jcusparse.cusparseMatDescr descr
        matrix descriptor; only CUSPARSE_MATRIX_TYPE_GENERAL is supported
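        To make the roles of val, rowPtr and colInd concrete, the following host-side sketch builds the three arrays from a small dense matrix. The class CsrLayoutSketch is hypothetical and is not part of this API; it assumes zero-based indexing, as in the default descriptor (CUSPARSE_INDEX_BASE_ZERO).

```java
/** Hypothetical host-side illustration of the CSR layout; not part of the SystemML API. */
final class CsrLayoutSketch {
    final double[] val;   // non-zero values, stored row by row
    final int[] rowPtr;   // rowPtr[i] = offset of row i's first non-zero; rowPtr[rows] = nnz
    final int[] colInd;   // column index of each entry in val

    CsrLayoutSketch(double[][] dense) {
        int rows = dense.length;
        rowPtr = new int[rows + 1];
        // First pass: count the non-zeros of each row to build the row offsets.
        for (int i = 0; i < rows; i++) {
            int count = 0;
            for (double v : dense[i]) {
                if (v != 0.0) count++;
            }
            rowPtr[i + 1] = rowPtr[i] + count;
        }
        int nnz = rowPtr[rows];
        val = new double[nnz];
        colInd = new int[nnz];
        // Second pass: store each non-zero value together with its column index.
        int p = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < dense[i].length; j++) {
                if (dense[i][j] != 0.0) {
                    val[p] = dense[i][j];
                    colInd[p] = j;
                    p++;
                }
            }
        }
    }

    public static void main(String[] args) {
        CsrLayoutSketch m = new CsrLayoutSketch(new double[][] {{1, 0, 2}, {0, 0, 3}, {4, 5, 0}});
        System.out.println(java.util.Arrays.toString(m.rowPtr)); // [0, 2, 3, 5]
        System.out.println(java.util.Arrays.toString(m.colInd)); // [0, 2, 2, 0, 1]
        System.out.println(java.util.Arrays.toString(m.val));    // [1.0, 2.0, 3.0, 4.0, 5.0]
    }
}
```

        The last entry of rowPtr equals nnz, which is why the array needs rows + 1 elements.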
    • Method Detail

      • getDefaultCuSparseMatrixDescriptor

        public static jcuda.jcusparse.cusparseMatDescr getDefaultCuSparseMatrixDescriptor()
        Returns:
        Singleton default matrix descriptor object (set with CUSPARSE_MATRIX_TYPE_GENERAL, CUSPARSE_INDEX_BASE_ZERO)
      • isUltraSparse

        public boolean isUltraSparse(int rows,
                                     int cols)
        Check for ultra sparsity
        Parameters:
        rows - number of rows
        cols - number of columns
        Returns:
        true if ultra sparse
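        This page does not state how ultra sparsity is decided. One plausible reading is a density test, nnz / (rows * cols), against a small cutoff. The sketch below works under that assumption; the class UltraSparseSketch, the extra nnz and threshold parameters, and the cutoff value are all hypothetical.

```java
/** Hypothetical density check; the real criterion used by isUltraSparse is not documented here. */
final class UltraSparseSketch {
    /** Assumed cutoff, chosen only for illustration. */
    static final double ASSUMED_THRESHOLD = 0.00004;

    static boolean isUltraSparse(long nnz, int rows, int cols, double threshold) {
        if (rows <= 0 || cols <= 0) {
            throw new IllegalArgumentException("rows and cols must be positive");
        }
        // Divide step by step in double precision so rows * cols cannot overflow an int.
        double density = (double) nnz / rows / cols;
        return density < threshold;
    }

    public static void main(String[] args) {
        System.out.println(isUltraSparse(3, 1000, 1000, ASSUMED_THRESHOLD));   // true
        System.out.println(isUltraSparse(500, 1000, 1000, ASSUMED_THRESHOLD)); // false
    }
}
```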
      • estimateSize

        public static long estimateSize(long nnz2,
                                        long rows)
        Estimate the size of a CSR matrix in GPU memory. The size of the pointers themselves is not needed and is not included.
        Parameters:
        nnz2 - number of non zeroes
        rows - number of rows
        Returns:
        size estimate
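        As a rough illustration of such an estimate, the sketch below adds up the bytes of the three CSR arrays: nnz doubles, rows + 1 row offsets, and nnz column indices. The class CsrSizeSketch is hypothetical, and the actual method may count additional items (for example the matrix descriptor), so treat the numbers as a lower bound under these assumptions.

```java
/** Hypothetical size estimate that counts only the three CSR arrays. */
final class CsrSizeSketch {
    static long estimateArrayBytes(long nnz, long rows) {
        long valBytes = nnz * Double.BYTES;             // non-zero values (double)
        long rowPtrBytes = (rows + 1) * Integer.BYTES;  // row offsets (int)
        long colIndBytes = nnz * Integer.BYTES;         // column indices (int)
        return valBytes + rowPtrBytes + colIndBytes;
    }

    public static void main(String[] args) {
        // 5 non-zeros in 3 rows: 5*8 + 4*4 + 5*4 bytes
        System.out.println(estimateArrayBytes(5, 3)); // 76
    }
}
```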
      • copyToDevice

        public static void copyToDevice(JCudaObject.CSRPointer dest,
                                        int rows,
                                        long nnz,
                                        int[] rowPtr,
                                        int[] colInd,
                                        double[] values)
        Static method to copy a CSR sparse matrix from Host to Device
        Parameters:
        dest - [input] destination location (on GPU)
        rows - number of rows
        nnz - number of non-zeroes
        rowPtr - integer array of row pointers
        colInd - integer array of column indices
        values - double array of non zero values
      • copyToHost

        public static void copyToHost(JCudaObject.CSRPointer src,
                                      int rows,
                                      long nnz,
                                      int[] rowPtr,
                                      int[] colInd,
                                      double[] values)
        Static method to copy a CSR sparse matrix from Device to Host
        Parameters:
        src - [input] source location (on GPU)
        rows - [input] number of rows
        nnz - [input] number of non-zeroes
        rowPtr - [output] pre-allocated integer array of row pointers of size (rows+1)
        colInd - [output] pre-allocated integer array of column indices of size nnz
        values - [output] pre-allocated double array of values of size nnz
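        Because the three output arrays must be pre-allocated, a caller needs host buffers of the sizes listed above before invoking copyToHost. The helper below is a sketch of that allocation step; the class HostCsrBuffersSketch is hypothetical. It also converts the long nnz to an int, since Java arrays cannot hold more than Integer.MAX_VALUE elements.

```java
/** Hypothetical holder for host-side CSR buffers sized for a device-to-host copy. */
final class HostCsrBuffersSketch {
    final int[] rowPtr;    // rows + 1 entries
    final int[] colInd;    // nnz entries
    final double[] values; // nnz entries

    HostCsrBuffersSketch(int rows, long nnz) {
        if (rows < 0 || nnz < 0) {
            throw new IllegalArgumentException("rows and nnz must be non-negative");
        }
        // Throws ArithmeticException if nnz does not fit in an int.
        int n = Math.toIntExact(nnz);
        rowPtr = new int[rows + 1];
        colInd = new int[n];
        values = new double[n];
    }

    public static void main(String[] args) {
        HostCsrBuffersSketch buffers = new HostCsrBuffersSketch(3, 5L);
        System.out.println(buffers.rowPtr.length); // 4
        System.out.println(buffers.colInd.length); // 5
        System.out.println(buffers.values.length); // 5
    }
}
```

        The arrays held by such an object would then be passed as the rowPtr, colInd and values arguments of copyToHost.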
      • allocateForDgeam

        public static JCudaObject.CSRPointer allocateForDgeam(jcuda.jcusparse.cusparseHandle handle,
                                                              JCudaObject.CSRPointer A,
                                                              JCudaObject.CSRPointer B,
                                                              int m,
                                                              int n)
                                                       throws DMLRuntimeException
        Estimates the number of non-zero elements in the result of a sparse cusparseDgeam operation C = a op(A) + b op(B).
        Parameters:
        handle - a valid cusparseHandle
        A - Sparse Matrix A on GPU
        B - Sparse Matrix B on GPU
        m - Rows in A
        n - Columns in B
        Returns:
        CSR (compressed sparse row) pointer
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
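        The counting step behind such an allocation can be sketched on the host: for C = A + B, each row of C holds the union of the column indices of the matching rows of A and B. The class GeamNnzSketch below is hypothetical and is not how the GPU library is invoked; it assumes no transposition and column indices sorted within each row.

```java
/** Hypothetical host-side count of the non-zeros of C = A + B from the CSR patterns of A and B. */
final class GeamNnzSketch {
    /** Returns rowPtr of C; the last entry is the total number of non-zeros of C. */
    static int[] sumRowPtr(int m, int[] rowPtrA, int[] colIndA, int[] rowPtrB, int[] colIndB) {
        int[] rowPtrC = new int[m + 1];
        for (int i = 0; i < m; i++) {
            int a = rowPtrA[i], aEnd = rowPtrA[i + 1];
            int b = rowPtrB[i], bEnd = rowPtrB[i + 1];
            int count = 0;
            // Merge the two sorted index lists, counting each distinct column once.
            while (a < aEnd && b < bEnd) {
                if (colIndA[a] == colIndB[b]) {
                    a++;
                    b++;
                } else if (colIndA[a] < colIndB[b]) {
                    a++;
                } else {
                    b++;
                }
                count++;
            }
            // Whatever remains in either list has no partner in the other.
            count += (aEnd - a) + (bEnd - b);
            rowPtrC[i + 1] = rowPtrC[i] + count;
        }
        return rowPtrC;
    }

    public static void main(String[] args) {
        // A = [[1,0,2],[0,0,3],[4,5,0]], B = [[0,1,2],[0,0,0],[1,0,0]] (patterns only)
        int[] rowPtrC = sumRowPtr(3,
                new int[] {0, 2, 3, 5}, new int[] {0, 2, 2, 0, 1},
                new int[] {0, 2, 2, 3}, new int[] {1, 2, 0});
        System.out.println(java.util.Arrays.toString(rowPtrC)); // [0, 3, 4, 6]
    }
}
```

        The count is structural: entries that cancel numerically to zero are still counted, so it is an upper bound on the true number of non-zero values.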
      • allocateForMatrixMultiply

        public static JCudaObject.CSRPointer allocateForMatrixMultiply(jcuda.jcusparse.cusparseHandle handle,
                                                                       JCudaObject.CSRPointer A,
                                                                       int transA,
                                                                       JCudaObject.CSRPointer B,
                                                                       int transB,
                                                                       int m,
                                                                       int n,
                                                                       int k)
                                                                throws DMLRuntimeException
        Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B and returns the JCudaObject.CSRPointer to C with the appropriate GPU memory.
        Parameters:
        handle - a valid cusparseHandle
        A - Sparse Matrix A on GPU
        transA - 'T' if A is to be transposed, 'N' otherwise
        B - Sparse Matrix B on GPU
        transB - 'T' if B is to be transposed, 'N' otherwise
        m - Rows in A
        n - Columns in B
        k - Columns in A / Rows in B
        Returns:
        a JCudaObject.CSRPointer instance that encapsulates the CSR matrix on GPU
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
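        The non-zero count of a sparse product can likewise be sketched on the host. For each row i of A, the columns of C in that row are the union of the column indices of every row k of B for which A has a non-zero at (i, k). The class SpGemmNnzSketch is hypothetical, assumes no transposition of A or B, and only illustrates the counting idea, not the GPU call.

```java
/** Hypothetical host-side count of the non-zeros of C = A * B from the CSR patterns of A and B. */
final class SpGemmNnzSketch {
    /**
     * @param m rows in A, @param n columns in B
     * @return rowPtr of C; the last entry is the total number of non-zeros of C
     */
    static int[] productRowPtr(int m, int n, int[] rowPtrA, int[] colIndA,
                               int[] rowPtrB, int[] colIndB) {
        int[] rowPtrC = new int[m + 1];
        // lastSeenInRow[j] == i means column j was already counted for row i.
        int[] lastSeenInRow = new int[n];
        java.util.Arrays.fill(lastSeenInRow, -1);
        for (int i = 0; i < m; i++) {
            int count = 0;
            for (int p = rowPtrA[i]; p < rowPtrA[i + 1]; p++) {
                int k = colIndA[p]; // A(i, k) is non-zero, so row k of B contributes
                for (int q = rowPtrB[k]; q < rowPtrB[k + 1]; q++) {
                    int j = colIndB[q];
                    if (lastSeenInRow[j] != i) {
                        lastSeenInRow[j] = i;
                        count++;
                    }
                }
            }
            rowPtrC[i + 1] = rowPtrC[i] + count;
        }
        return rowPtrC;
    }

    public static void main(String[] args) {
        // A = [[1,0,2],[0,0,3],[4,5,0]], B = [[0,1,2],[0,0,0],[1,0,0]] (patterns only)
        int[] rowPtrC = productRowPtr(3, 3,
                new int[] {0, 2, 3, 5}, new int[] {0, 2, 2, 0, 1},
                new int[] {0, 2, 2, 3}, new int[] {1, 2, 0});
        System.out.println(java.util.Arrays.toString(rowPtrC)); // [0, 3, 4, 6]
    }
}
```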
      • toColumnMajorDenseMatrix

        public jcuda.Pointer toColumnMajorDenseMatrix(jcuda.jcusparse.cusparseHandle cusparseHandle,
                                                      jcuda.jcublas.cublasHandle cublasHandle,
                                                      int rows,
                                                      int cols)
                                               throws DMLRuntimeException
        Copies this CSR matrix on the GPU to a dense column-major matrix on the GPU. This is a temporary matrix for operations such as cusparseDcsrmv. Since the allocated matrix is temporary, bookkeeping is not updated. The caller is responsible for calling "free" on the returned Pointer object.
        Parameters:
        cusparseHandle - a valid cusparseHandle
        cublasHandle - a valid cublasHandle
        rows - number of rows in this CSR matrix
        cols - number of columns in this CSR matrix
        Returns:
        A Pointer to the allocated dense matrix (in column-major format)
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
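        To show what a column-major dense result looks like, the sketch below performs the same conversion on the host: the element at (i, j) is stored at index j * rows + i. The class CsrToDenseSketch is hypothetical; the real method runs on the GPU and returns a device Pointer, not a Java array.

```java
/** Hypothetical host-side conversion from CSR to a column-major dense array. */
final class CsrToDenseSketch {
    static double[] toColumnMajor(int rows, int cols, double[] val, int[] rowPtr, int[] colInd) {
        double[] dense = new double[rows * cols]; // zero-initialized by Java
        for (int i = 0; i < rows; i++) {
            for (int p = rowPtr[i]; p < rowPtr[i + 1]; p++) {
                // Column-major: consecutive elements of one column are adjacent in memory.
                dense[colInd[p] * rows + i] = val[p];
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        // CSR form of [[1,0,2],[0,0,3],[4,5,0]]
        double[] dense = toColumnMajor(3, 3,
                new double[] {1, 2, 3, 4, 5},
                new int[] {0, 2, 3, 5},
                new int[] {0, 2, 2, 0, 1});
        // [1.0, 0.0, 4.0, 0.0, 0.0, 5.0, 2.0, 3.0, 0.0]
        System.out.println(java.util.Arrays.toString(dense));
    }
}
```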
      • deallocate

        public void deallocate()
        Calls cudaFree lazily on the allocated Pointer instances
      • deallocate

        public void deallocate(boolean eager)
        Calls cudaFree lazily or eagerly on the allocated Pointer instances
        Parameters:
        eager - whether to do eager or lazy cudaFrees

Copyright © 2017 The Apache Software Foundation. All rights reserved.