org.apache.sysml.runtime.instructions.gpu.context

Class CSRPointer



  • public class CSRPointer
    extends Object
    Compressed Sparse Row (CSR) format for CUDA. Generalized matrix multiply, among other operations, is implemented for the CSR format in the cuSPARSE library.

    Since we assume that the matrix is stored with zero-based indexing (i.e. CUSPARSE_INDEX_BASE_ZERO), the matrix

        1.0 4.0 0.0 0.0 0.0
        0.0 2.0 3.0 0.0 0.0
        5.0 0.0 0.0 7.0 8.0
        0.0 0.0 9.0 0.0 6.0

    is stored as

        val    = 1.0 4.0 2.0 3.0 5.0 7.0 8.0 9.0 6.0
        rowPtr = 0 2 4 7 9
        colInd = 0 1 1 2 0 3 4 2 4
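
    The layout described above can be reproduced on the host with a few lines of plain Java. The following is a minimal sketch for illustration only: the class CsrLayoutSketch and its method fromDense are hypothetical names, not part of SystemML or JCuda, and the real CSRPointer keeps these arrays in GPU memory rather than in Java arrays.

```java
import java.util.Arrays;

// Hypothetical host-side helper (not part of SystemML): builds zero-based CSR arrays.
final class CsrLayoutSketch {
    final double[] val;   // non-zero values, row by row
    final int[] rowPtr;   // rows + 1 offsets into val/colInd
    final int[] colInd;   // column index of each non-zero

    CsrLayoutSketch(double[] val, int[] rowPtr, int[] colInd) {
        this.val = val;
        this.rowPtr = rowPtr;
        this.colInd = colInd;
    }

    static CsrLayoutSketch fromDense(double[][] dense) {
        int rows = dense.length;
        int nnz = 0;
        for (double[] row : dense)
            for (double v : row)
                if (v != 0.0) nnz++;

        double[] val = new double[nnz];
        int[] colInd = new int[nnz];
        int[] rowPtr = new int[rows + 1];
        int pos = 0;
        for (int i = 0; i < rows; i++) {
            rowPtr[i] = pos;                 // start of row i
            for (int j = 0; j < dense[i].length; j++) {
                if (dense[i][j] != 0.0) {
                    val[pos] = dense[i][j];
                    colInd[pos] = j;
                    pos++;
                }
            }
        }
        rowPtr[rows] = pos;                  // end of last row + 1, equal to nnz
        return new CsrLayoutSketch(val, rowPtr, colInd);
    }

    public static void main(String[] args) {
        double[][] dense = {
            {1.0, 4.0, 0.0, 0.0, 0.0},
            {0.0, 2.0, 3.0, 0.0, 0.0},
            {5.0, 0.0, 0.0, 7.0, 8.0},
            {0.0, 0.0, 9.0, 0.0, 6.0}
        };
        CsrLayoutSketch csr = fromDense(dense);
        System.out.println("val    = " + Arrays.toString(csr.val));
        System.out.println("rowPtr = " + Arrays.toString(csr.rowPtr));
        System.out.println("colInd = " + Arrays.toString(csr.colInd));
    }
}
```

    Note that rowPtr always has one more entry than the matrix has rows, and its last entry equals the number of non-zeroes.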
    • Field Summary

      Fields 
      Modifier and Type Field and Description
      jcuda.Pointer colInd
      integer array of nnz values' column indices
      jcuda.jcusparse.cusparseMatDescr descr
      descriptor of matrix, only CUSPARSE_MATRIX_TYPE_GENERAL supported
      static jcuda.jcusparse.cusparseMatDescr matrixDescriptor 
      long nnz
      Number of non zeroes
      jcuda.Pointer rowPtr
      integer array of (rows + 1) offsets: the start of each row, followed by the end of the last row + 1
      jcuda.Pointer val
      double array of non zero values
    • Field Detail

      • matrixDescriptor

        public static jcuda.jcusparse.cusparseMatDescr matrixDescriptor
      • nnz

        public long nnz
        Number of non zeroes
      • val

        public jcuda.Pointer val
        double array of non zero values
      • rowPtr

        public jcuda.Pointer rowPtr
        integer array of (rows + 1) offsets: the start of each row, followed by the end of the last row + 1
      • colInd

        public jcuda.Pointer colInd
        integer array of nnz values' column indices
      • descr

        public jcuda.jcusparse.cusparseMatDescr descr
        descriptor of matrix, only CUSPARSE_MATRIX_TYPE_GENERAL supported
    • Method Detail

      • getDefaultCuSparseMatrixDescriptor

        public static jcuda.jcusparse.cusparseMatDescr getDefaultCuSparseMatrixDescriptor()
        Returns:
        Singleton default matrix descriptor object (set with CUSPARSE_MATRIX_TYPE_GENERAL, CUSPARSE_INDEX_BASE_ZERO)
      • estimateSize

        public static long estimateSize(long nnz2,
                                        long rows)
        Estimates the size of a CSR matrix in GPU memory. The size of the pointers themselves is not needed and is not included.
        Parameters:
        nnz2 - number of non zeroes
        rows - number of rows
        Returns:
        size estimate
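
        As a rough illustration of what such an estimate involves, the sketch below sums the bytes of the three CSR arrays. It is an assumption about the shape of the calculation, not the method's actual implementation, which may account for additional items such as the matrix descriptor. CsrSizeSketch and estimateArrayBytes are hypothetical names.

```java
// Hypothetical helper (not part of SystemML): bytes needed by the three CSR arrays.
final class CsrSizeSketch {
    static long estimateArrayBytes(long nnz, long rows) {
        long valBytes = nnz * Double.BYTES;             // one double per non-zero
        long colIndBytes = nnz * Integer.BYTES;         // one int per non-zero
        long rowPtrBytes = (rows + 1) * Integer.BYTES;  // one int per row, plus one
        return valBytes + colIndBytes + rowPtrBytes;
    }
}
```

        For the 4 x 5 example matrix in the class description (nnz = 9, rows = 4) this gives 72 + 36 + 20 = 128 bytes.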
      • copyToDevice

        public static void copyToDevice(CSRPointer dest,
                                        int rows,
                                        long nnz,
                                        int[] rowPtr,
                                        int[] colInd,
                                        double[] values)
                                 throws DMLRuntimeException
        Static method to copy a CSR sparse matrix from host to device
        Parameters:
        dest - [input] destination location (on GPU)
        rows - number of rows
        nnz - number of non-zeroes
        rowPtr - integer array of row pointers
        colInd - integer array of column indices
        values - double array of non zero values
        Throws:
        DMLRuntimeException - if error occurs
      • copyToHost

        public static void copyToHost(CSRPointer src,
                                      int rows,
                                      long nnz,
                                      int[] rowPtr,
                                      int[] colInd,
                                      double[] values)
        Static method to copy a CSR sparse matrix from device to host
        Parameters:
        src - [input] source location (on GPU)
        rows - [input] number of rows
        nnz - [input] number of non-zeroes
        rowPtr - [output] pre-allocated integer array of row pointers of size (rows+1)
        colInd - [output] pre-allocated integer array of column indices of size nnz
        values - [output] pre-allocated double array of values of size nnz
      • allocateForDgeam

        public static CSRPointer allocateForDgeam(GPUContext gCtx,
                                                  jcuda.jcusparse.cusparseHandle handle,
                                                  CSRPointer A,
                                                  CSRPointer B,
                                                  int m,
                                                  int n)
                                           throws DMLRuntimeException
        Estimates the number of non-zero elements in the result of a sparse cusparseDgeam operation C = a op(A) + b op(B)
        Parameters:
        gCtx - a valid GPUContext
        handle - a valid cusparseHandle
        A - Sparse Matrix A on GPU
        B - Sparse Matrix B on GPU
        m - Rows in A
        n - Columns in B
        Returns:
        CSR (compressed sparse row) pointer
        Throws:
        DMLRuntimeException - if an error occurs
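
        To make the non-zero estimate for a sum concrete, the sketch below counts, on the host, the structural non-zeroes of C = A + B as the per-row union of the column indices of A and B. This only illustrates the idea under the assumption of no transposes. The method itself works on GPU memory through cuSPARSE, and GeamNnzSketch and rowPtrOfSum are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical host-side illustration (not part of SystemML).
final class GeamNnzSketch {
    // Returns rowPtr of C = A + B for m x n inputs; the last entry is nnz(C).
    static int[] rowPtrOfSum(int[] rowPtrA, int[] colIndA,
                             int[] rowPtrB, int[] colIndB, int m, int n) {
        int[] rowPtrC = new int[m + 1];
        int[] lastRowSeen = new int[n];   // lastRowSeen[j] == i means column j already counted in row i
        Arrays.fill(lastRowSeen, -1);
        for (int i = 0; i < m; i++) {
            int count = 0;
            for (int p = rowPtrA[i]; p < rowPtrA[i + 1]; p++) {
                int j = colIndA[p];
                if (lastRowSeen[j] != i) { lastRowSeen[j] = i; count++; }
            }
            for (int p = rowPtrB[i]; p < rowPtrB[i + 1]; p++) {
                int j = colIndB[p];
                if (lastRowSeen[j] != i) { lastRowSeen[j] = i; count++; }
            }
            rowPtrC[i + 1] = rowPtrC[i] + count;
        }
        return rowPtrC;
    }
}
```

        The count is structural: an entry that is present in both A and B is counted once, even if the two values would cancel numerically.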
      • allocateForMatrixMultiply

        public static CSRPointer allocateForMatrixMultiply(GPUContext gCtx,
                                                           jcuda.jcusparse.cusparseHandle handle,
                                                           CSRPointer A,
                                                           int transA,
                                                           CSRPointer B,
                                                           int transB,
                                                           int m,
                                                           int n,
                                                           int k)
                                                    throws DMLRuntimeException
        Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B and returns the CSRPointer to C with the appropriate GPU memory.
        Parameters:
        gCtx - a valid GPUContext
        handle - a valid cusparseHandle
        A - Sparse Matrix A on GPU
        transA - 'T' if A is to be transposed, 'N' otherwise
        B - Sparse Matrix B on GPU
        transB - 'T' if B is to be transposed, 'N' otherwise
        m - Rows in A
        n - Columns in B
        k - Columns in A / Rows in B
        Returns:
        a CSRPointer instance that encapsulates the CSR matrix on GPU
        Throws:
        DMLRuntimeException - if an error occurs
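
        The non-zero estimate for a product can be illustrated in the same way. The host-side sketch below computes the structural non-zeroes of C = A * B without transposes: row i of C contains column j whenever some k exists with A(i,k) and B(k,j) both stored. It is an illustration under those assumptions, not the GPU code path. GemmNnzSketch and rowPtrOfProduct are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical host-side illustration (not part of SystemML).
final class GemmNnzSketch {
    // A is m x k, B is k x n, both in zero-based CSR. Returns rowPtr of C; last entry is nnz(C).
    static int[] rowPtrOfProduct(int[] rowPtrA, int[] colIndA,
                                 int[] rowPtrB, int[] colIndB, int m, int n) {
        int[] rowPtrC = new int[m + 1];
        int[] lastRowSeen = new int[n];   // marks columns of C already counted in the current row
        Arrays.fill(lastRowSeen, -1);
        for (int i = 0; i < m; i++) {
            int count = 0;
            for (int pa = rowPtrA[i]; pa < rowPtrA[i + 1]; pa++) {
                int k = colIndA[pa];      // A(i,k) is stored, so row k of B contributes
                for (int pb = rowPtrB[k]; pb < rowPtrB[k + 1]; pb++) {
                    int j = colIndB[pb];
                    if (lastRowSeen[j] != i) { lastRowSeen[j] = i; count++; }
                }
            }
            rowPtrC[i + 1] = rowPtrC[i] + count;
        }
        return rowPtrC;
    }
}
```

        With rowPtr of C known, the val and colInd arrays of C can be allocated with exactly nnz(C) entries before the numeric multiplication runs.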
      • allocateEmpty

        public static CSRPointer allocateEmpty(GPUContext gCtx,
                                               long nnz2,
                                               long rows)
                                        throws DMLRuntimeException
        Factory method to allocate an empty CSR Sparse matrix on the GPU
        Parameters:
        gCtx - a valid GPUContext
        nnz2 - number of non-zeroes
        rows - number of rows
        Returns:
        a CSRPointer instance that encapsulates the CSR matrix on GPU
        Throws:
        DMLRuntimeException - if an error occurs
      • isUltraSparse

        public boolean isUltraSparse(int rows,
                                     int cols)
        Check for ultra sparsity
        Parameters:
        rows - number of rows
        cols - number of columns
        Returns:
        true if ultra sparse
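
        A sparsity check of this kind typically compares the fraction of stored non-zeroes against a small cut-off. The sketch below assumes that form. The threshold is passed in as a parameter because the value used by the library is not stated here, and UltraSparsitySketch is a hypothetical name.

```java
// Hypothetical illustration (not part of SystemML); the threshold is an assumed parameter.
final class UltraSparsitySketch {
    static boolean isUltraSparse(long nnz, int rows, int cols, double threshold) {
        // Divide step by step in double precision so rows * cols cannot overflow an int.
        double sparsity = ((double) nnz) / rows / cols;
        return sparsity < threshold;
    }
}
```

        For example, the 4 x 5 matrix from the class description has a sparsity of 9 / 20 = 0.45 and would not count as ultra sparse under any small threshold.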
      • toColumnMajorDenseMatrix

        public jcuda.Pointer toColumnMajorDenseMatrix(jcuda.jcusparse.cusparseHandle cusparseHandle,
                                                      jcuda.jcublas.cublasHandle cublasHandle,
                                                      int rows,
                                                      int cols)
                                               throws DMLRuntimeException
        Copies this CSR matrix on the GPU to a dense column-major matrix on the GPU. This is a temporary matrix for operations such as cusparseDcsrmv. Since the allocated matrix is temporary, bookkeeping is not updated. The caller is responsible for calling "free" on the returned Pointer object.
        Parameters:
        cusparseHandle - a valid cusparseHandle
        cublasHandle - a valid cublasHandle
        rows - number of rows in this CSR matrix
        cols - number of columns in this CSR matrix
        Returns:
        A Pointer to the allocated dense matrix (in column-major format)
        Throws:
        DMLRuntimeException - if an error occurs
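
        The column-major layout of the result can be illustrated on the host: element (i, j) of a rows x cols matrix lives at index j * rows + i. The sketch below performs the CSR-to-dense expansion in plain Java arrays. It shows only the target layout, not how the conversion is carried out on the GPU, and CsrToDenseSketch and toColumnMajor are hypothetical names.

```java
// Hypothetical host-side illustration (not part of SystemML).
final class CsrToDenseSketch {
    static double[] toColumnMajor(double[] val, int[] rowPtr, int[] colInd, int rows, int cols) {
        double[] dense = new double[rows * cols];   // zero-initialised by Java
        for (int i = 0; i < rows; i++) {
            for (int p = rowPtr[i]; p < rowPtr[i + 1]; p++) {
                dense[colInd[p] * rows + i] = val[p];   // column-major: column offset first
            }
        }
        return dense;
    }
}
```

        For the example matrix in the class description, the value 7.0 at row 2, column 3 ends up at index 3 * 4 + 2 = 14 of the dense array.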
      • deallocate

        public void deallocate(boolean eager)
                        throws DMLRuntimeException
        Calls cudaFree lazily or eagerly on the allocated Pointer instances
        Parameters:
        eager - whether to do eager or lazy cudaFrees
        Throws:
        DMLRuntimeException - if an error occurs

Copyright © 2017 The Apache Software Foundation. All rights reserved.