org.apache.sysml.runtime.instructions.gpu.context

Class JCudaObject



  • public class JCudaObject
    extends GPUObject
    Handle to a matrix block on the GPU
    • Field Detail

      • jcudaDenseMatrixPtr

        public jcuda.Pointer jcudaDenseMatrixPtr
        Pointer to dense matrix
      • numBytes

        public long numBytes
    • Method Detail

      • getTensorShape

        public int[] getTensorShape()
        Returns a previously allocated tensor shape or null
        Returns:
        int array of four elements or null
      • getTensorDescriptor

        public jcuda.jcudnn.cudnnTensorDescriptor getTensorDescriptor()
        Returns a previously allocated tensor descriptor or null
        Returns:
        cudnn tensor descriptor
      • allocateTensorDescriptor

        public jcuda.jcudnn.cudnnTensorDescriptor allocateTensorDescriptor(int N,
                                                                           int C,
                                                                           int H,
                                                                           int W)
        Returns the previously allocated tensor descriptor if one exists; otherwise allocates and returns a new one
        Parameters:
        N - number of images
        C - number of channels
        H - height
        W - width
        Returns:
        cudnn tensor descriptor
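        The allocate-or-reuse behavior of getTensorShape, getTensorDescriptor and allocateTensorDescriptor can be sketched on the host as shown here. This is an illustration only: TensorDescriptorCache and Descriptor are hypothetical stand-ins, not SystemML or JCuda classes, and no cuDNN calls are made. The sketch assumes that a second call returns the cached descriptor without comparing shapes; this page does not state what the real method does in that case.

```java
import java.util.Arrays;

// Hypothetical host-side stand-in for the descriptor caching described above.
final class TensorDescriptorCache {
    /** Placeholder for a cudnnTensorDescriptor; holds only the NCHW shape. */
    static final class Descriptor {
        final int n, c, h, w;
        Descriptor(int n, int c, int h, int w) {
            this.n = n; this.c = c; this.h = h; this.w = w;
        }
    }

    private Descriptor descriptor; // null until the first allocation
    private int[] shape;           // null until the first allocation

    /** Mirrors getTensorShape(): four elements {N, C, H, W}, or null. */
    int[] getTensorShape() { return shape; }

    /** Mirrors getTensorDescriptor(): the cached descriptor, or null. */
    Descriptor getTensorDescriptor() { return descriptor; }

    /** Allocates on first use; later calls return the cached instance (assumed). */
    Descriptor allocateTensorDescriptor(int n, int c, int h, int w) {
        if (descriptor == null) {
            descriptor = new Descriptor(n, c, h, w);
            shape = new int[] { n, c, h, w };
        }
        return descriptor;
    }

    public static void main(String[] args) {
        TensorDescriptorCache cache = new TensorDescriptorCache();
        System.out.println(Arrays.toString(cache.getTensorShape()));
        cache.allocateTensorDescriptor(64, 3, 28, 28);
        System.out.println(Arrays.toString(cache.getTensorShape()));
    }
}
```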
      • allocateSparseAndEmpty

        public void allocateSparseAndEmpty()
                                    throws DMLRuntimeException
        Allocates a sparse and empty JCudaObject. This is the result of operations that are both non zero matrices.
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • allocateAndFillDense

        public void allocateAndFillDense(double v)
                                  throws DMLRuntimeException
        Allocates a dense matrix of size obtained from the attached matrix metadata and fills it up with a single value
        Parameters:
        v - value to fill up the dense matrix
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
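        As a rough host-side picture of what allocating and filling a dense matrix involves, the sketch below computes the byte size of a rows x cols matrix of doubles and fills a buffer with a single value. DenseFillSketch and its methods are hypothetical names used for illustration; the real method works on GPU memory and takes the dimensions from the attached matrix metadata.

```java
import java.util.Arrays;

// Hypothetical host-side model; no GPU memory is touched.
final class DenseFillSketch {
    /** Bytes needed for a dense rows x cols matrix of doubles; fails on overflow. */
    static long denseSizeInBytes(long rows, long cols) {
        return Math.multiplyExact(Math.multiplyExact(rows, cols), (long) Double.BYTES);
    }

    /** Allocates a rows x cols buffer and sets every cell to v. */
    static double[] allocateAndFill(int rows, int cols, double v) {
        double[] data = new double[Math.multiplyExact(rows, cols)];
        Arrays.fill(data, v);
        return data;
    }

    public static void main(String[] args) {
        System.out.println(denseSizeInBytes(3, 4));
        System.out.println(Arrays.toString(allocateAndFill(2, 3, 1.5)));
    }
}
```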
      • isSparseAndEmpty

        public boolean isSparseAndEmpty()
        Checks whether this JCudaObject is sparse and empty. Being allocated is a prerequisite to being sparse and empty.
        Returns:
        true if sparse and empty
      • acquireDeviceModifyDense

        public boolean acquireDeviceModifyDense()
                                         throws DMLRuntimeException
        Description copied from class: GPUObject
        To signal intent that a matrix block will be written to on the GPU
        Specified by:
        acquireDeviceModifyDense in class GPUObject
        Returns:
        true if memory was allocated on the GPU as a result of this call
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • acquireDeviceModifySparse

        public boolean acquireDeviceModifySparse()
                                          throws DMLRuntimeException
        Description copied from class: GPUObject
        To signal intent that a sparse matrix block will be written to on the GPU
        Specified by:
        acquireDeviceModifySparse in class GPUObject
        Returns:
        true if memory was allocated on the GPU as a result of this call
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • acquireHostRead

        public boolean acquireHostRead()
                                throws org.apache.sysml.runtime.controlprogram.caching.CacheException
        Description copied from class: GPUObject
        Signal intent that a block needs to be read on the host
        Specified by:
        acquireHostRead in class GPUObject
        Returns:
        true if copied from device to host
        Throws:
        org.apache.sysml.runtime.controlprogram.caching.CacheException - ?
      • releaseInput

        public void releaseInput()
                          throws org.apache.sysml.runtime.controlprogram.caching.CacheException
        Releases input allocated on GPU
        Specified by:
        releaseInput in class GPUObject
        Throws:
        org.apache.sysml.runtime.controlprogram.caching.CacheException - if data is not allocated
      • releaseOutput

        public void releaseOutput()
                           throws org.apache.sysml.runtime.controlprogram.caching.CacheException
        Releases output allocated on GPU
        Specified by:
        releaseOutput in class GPUObject
        Throws:
        org.apache.sysml.runtime.controlprogram.caching.CacheException - if data is not allocated
      • setDeviceModify

        public void setDeviceModify(long numBytes)
        Description copied from class: GPUObject
        If memory on GPU has been allocated from elsewhere, this method updates the internal bookkeeping
        Specified by:
        setDeviceModify in class GPUObject
        Parameters:
        numBytes - number of bytes
      • copyFromDeviceToHost

        protected void copyFromDeviceToHost()
                                     throws DMLRuntimeException
        Description copied from class: GPUObject
        Copies a matrix block (dense or sparse) from GPU memory to host memory. A MatrixBlock instance is allocated, data from the GPU is copied in, the current one in host memory is deallocated by calling MatrixObject's acquireHostModify(MatrixBlock) (??? does not exist) and overwritten with the newly allocated instance. TODO: re-examine this to avoid spurious allocations of memory for optimizations
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • getSparseMatrixCudaPointer

        public JCudaObject.CSRPointer getSparseMatrixCudaPointer()
        Convenience method to directly examine the Sparse matrix on GPU
        Returns:
        CSR (compressed sparse row) pointer
      • setSparseMatrixCudaPointer

        public void setSparseMatrixCudaPointer(JCudaObject.CSRPointer sparseMatrixPtr)
        Convenience method to directly set the sparse matrix on GPU. Make sure to call setDeviceModify(long) after this to set appropriate state, if you are not sure what you are doing. Needed for operations like JCusparse.cusparseDcsrgemm(cusparseHandle, int, int, int, int, int, cusparseMatDescr, int, Pointer, Pointer, Pointer, cusparseMatDescr, int, Pointer, Pointer, Pointer, cusparseMatDescr, Pointer, Pointer, Pointer)
        Parameters:
        sparseMatrixPtr - CSR (compressed sparse row) pointer
      • setDenseMatrixCudaPointer

        public void setDenseMatrixCudaPointer(jcuda.Pointer densePtr)
        Convenience method to directly set the dense matrix pointer on GPU. Make sure to call setDeviceModify(long) after this to set appropriate state, if you are not sure what you are doing.
        Parameters:
        densePtr - dense pointer
      • transpose

        public static jcuda.Pointer transpose(jcuda.Pointer densePtr,
                                              int m,
                                              int n,
                                              int lda,
                                              int ldc)
                                       throws DMLRuntimeException
        Transposes a dense matrix on the GPU by calling the cublasDgeam operation
        Parameters:
        densePtr - Pointer to dense matrix on the GPU
        m - rows in output matrix
        n - columns in output matrix
        lda - rows in input matrix
        ldc - columns in output matrix
        Returns:
        transposed matrix
        Throws:
        DMLRuntimeException - if operation failed
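        cublasDgeam computes C = alpha * op(A) + beta * op(B); a transpose corresponds to op(A) = A^T with alpha = 1 and beta = 0. The host-side sketch below shows only that index mapping, using cuBLAS's column-major convention in which lda and ldc are leading dimensions. DgeamTransposeSketch is a hypothetical name, and how SystemML's row-major layout maps onto the m, n, lda and ldc arguments documented above is not covered by this sketch.

```java
import java.util.Arrays;

// Hypothetical host-side model of an out-of-place transpose; no cuBLAS calls.
final class DgeamTransposeSketch {
    /**
     * Computes C = A^T for column-major storage.
     * a holds an n x m matrix with leading dimension lda (lda >= n);
     * the result holds an m x n matrix with leading dimension ldc (ldc >= m).
     */
    static double[] transpose(double[] a, int m, int n, int lda, int ldc) {
        if (lda < n || ldc < m) {
            throw new IllegalArgumentException("leading dimension too small");
        }
        double[] c = new double[ldc * n];
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < m; i++) {
                // C(i, j) = A(j, i)
                c[i + j * ldc] = a[j + i * lda];
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // A = [[1, 2, 3], [4, 5, 6]] stored column by column.
        double[] a = { 1, 4, 2, 5, 3, 6 };
        System.out.println(Arrays.toString(transpose(a, 3, 2, 2, 3)));
    }
}
```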
      • sparseToDense

        public void sparseToDense()
                           throws DMLRuntimeException
        Converts sparse to dense. This performs a transpose; use sparseToColumnMajorDense if the kernel can deal with column major format.
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • sparseToDense

        public void sparseToDense(String instructionName)
                           throws DMLRuntimeException
        Converts sparse to dense. This performs a transpose; use sparseToColumnMajorDense if the kernel can deal with column major format. Also records per-instruction invocations of sparseToDense.
        Parameters:
        instructionName - Name of the instruction for which statistics are recorded in GPUStatistics
        Throws:
        DMLRuntimeException - ?
      • sparseToColumnMajorDense

        public void sparseToColumnMajorDense()
                                      throws DMLRuntimeException
        More efficient method to convert sparse to dense, but the resulting dense matrix is in column major format
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
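        The difference between the two conversions can be pictured on the host: expanding CSR into a column-major dense array is one pass, and producing row-major output needs an extra transpose step. The sketch below assumes zero-based CSR indices; CsrToDenseSketch and its method names are hypothetical and do not call cuSPARSE.

```java
import java.util.Arrays;

// Hypothetical host-side model of the two sparse-to-dense conversions.
final class CsrToDenseSketch {
    /** Expands a zero-based CSR matrix into a column-major dense array. */
    static double[] csrToColumnMajorDense(int rows, int cols,
                                          int[] rowPtr, int[] colInd, double[] val) {
        double[] dense = new double[rows * cols];
        for (int r = 0; r < rows; r++) {
            for (int k = rowPtr[r]; k < rowPtr[r + 1]; k++) {
                dense[r + colInd[k] * rows] = val[k];
            }
        }
        return dense;
    }

    /** The extra step needed for row-major output: re-lay out column-major data. */
    static double[] columnMajorToRowMajor(double[] colMajor, int rows, int cols) {
        double[] rowMajor = new double[rows * cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                rowMajor[r * cols + c] = colMajor[r + c * rows];
            }
        }
        return rowMajor;
    }

    public static void main(String[] args) {
        // [[1, 0, 2], [0, 3, 0]] in CSR form.
        int[] rowPtr = { 0, 2, 3 };
        int[] colInd = { 0, 2, 1 };
        double[] val = { 1, 2, 3 };
        double[] colMajor = csrToColumnMajorDense(2, 3, rowPtr, colInd, val);
        System.out.println(Arrays.toString(colMajor));
        System.out.println(Arrays.toString(columnMajorToRowMajor(colMajor, 2, 3)));
    }
}
```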
      • columnMajorDenseToRowMajorSparse

        public static JCudaObject.CSRPointer columnMajorDenseToRowMajorSparse(jcuda.jcusparse.cusparseHandle cusparseHandle,
                                                                              int rows,
                                                                              int cols,
                                                                              jcuda.Pointer densePtr)
                                                                       throws DMLRuntimeException
        Convenience method to convert a dense matrix to a CSR matrix on the GPU. Since the allocated matrix is temporary, bookkeeping is not updated. Also note that the input dense matrix is expected to be in COLUMN MAJOR FORMAT. Caller is responsible for deallocating memory on GPU.
        Parameters:
        cusparseHandle - handle to cusparse library
        rows - number of rows
        cols - number of columns
        densePtr - [in] dense matrix pointer on the GPU in column major
        Returns:
        CSR (compressed sparse row) pointer
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
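        A host-side sketch of the dense-to-CSR direction follows: it counts the non-zeros per row of a column-major dense array, turns the counts into row offsets, and then fills the column indices and values. DenseToCsrSketch and its Csr holder are hypothetical names, zero-based indices are assumed, and nothing here calls cuSPARSE.

```java
import java.util.Arrays;

// Hypothetical host-side model of a column-major dense to CSR conversion.
final class DenseToCsrSketch {
    /** Minimal CSR holder: row offsets, column indices, values. */
    static final class Csr {
        final int[] rowPtr;
        final int[] colInd;
        final double[] val;
        Csr(int[] rowPtr, int[] colInd, double[] val) {
            this.rowPtr = rowPtr; this.colInd = colInd; this.val = val;
        }
    }

    static Csr columnMajorDenseToCsr(int rows, int cols, double[] dense) {
        int[] rowPtr = new int[rows + 1];
        // Pass 1: count the non-zeros in each row.
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (dense[r + c * rows] != 0.0) rowPtr[r + 1]++;
            }
        }
        // Prefix sum turns the counts into row offsets.
        for (int r = 0; r < rows; r++) rowPtr[r + 1] += rowPtr[r];

        int[] colInd = new int[rowPtr[rows]];
        double[] val = new double[rowPtr[rows]];
        // Pass 2: emit entries row by row, columns in increasing order.
        int k = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double v = dense[r + c * rows];
                if (v != 0.0) { colInd[k] = c; val[k] = v; k++; }
            }
        }
        return new Csr(rowPtr, colInd, val);
    }

    public static void main(String[] args) {
        // [[1, 0, 2], [0, 3, 0]] stored column by column.
        Csr csr = columnMajorDenseToCsr(2, 3, new double[] { 1, 0, 0, 3, 2, 0 });
        System.out.println(Arrays.toString(csr.rowPtr));
        System.out.println(Arrays.toString(csr.colInd));
        System.out.println(Arrays.toString(csr.val));
    }
}
```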
      • allocate

        public static jcuda.Pointer allocate(String instructionName,
                                             long size)
                                      throws DMLRuntimeException
        Convenience method for allocate(String, long, int), defaults statsCount to 1.
        Parameters:
        instructionName - name of the instruction for which to record per-instruction performance statistics, or null if statistics should not be recorded
        size - size of data (in bytes) to allocate
        Returns:
        jcuda pointer
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • allocate

        public static jcuda.Pointer allocate(String instructionName,
                                             long size,
                                             int statsCount)
                                      throws DMLRuntimeException
        Allocates temporary space on the device. Does not update bookkeeping. The caller is responsible for freeing up after usage.
        Parameters:
        instructionName - name of the instruction for which to record per-instruction performance statistics, or null if statistics should not be recorded
        size - Size of data (in bytes) to allocate
        statsCount - amount to increment the cudaAllocCount by
        Returns:
        jcuda Pointer
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • cudaFreeHelper

        public static void cudaFreeHelper(jcuda.Pointer toFree)
        Does lazy cudaFree calls
        Parameters:
        toFree - Pointer instance to be freed
      • cudaFreeHelper

        public static void cudaFreeHelper(jcuda.Pointer toFree,
                                          boolean eager)
        Does lazy or eager cudaFree calls
        Parameters:
        toFree - Pointer instance to be freed
        eager - true if to be done eagerly
      • cudaFreeHelper

        public static void cudaFreeHelper(String instructionName,
                                          jcuda.Pointer toFree)
        Does lazy cudaFree calls
        Parameters:
        instructionName - name of the instruction for which to record per-instruction free time, or null if it should not be recorded
        toFree - Pointer instance to be freed
      • cudaFreeHelper

        public static void cudaFreeHelper(String instructionName,
                                          jcuda.Pointer toFree,
                                          boolean eager)
        Does lazy or eager cudaFree calls
        Parameters:
        instructionName - name of the instruction for which to record per-instruction free time, or null if it should not be recorded
        toFree - Pointer instance to be freed
        eager - true if to be done eagerly
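        The four cudaFreeHelper overloads follow a common delegation pattern: the shorter forms default instructionName to null and eager to false. The sketch below models that pattern on the host. FreeHelperSketch is a hypothetical class, device pointers are represented by plain long handles, and treating "lazy" as "queued until a later flush" is an assumption made for illustration; this page does not describe how lazy frees are actually carried out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical host-side model of lazy versus eager frees; no CUDA calls.
final class FreeHelperSketch {
    private final Deque<Long> pending = new ArrayDeque<>(); // deferred frees (assumed)
    private final List<Long> freed = new ArrayList<>();     // frees already performed

    void free(long handle) { free(null, handle, false); }

    void free(long handle, boolean eager) { free(null, handle, eager); }

    void free(String instructionName, long handle) { free(instructionName, handle, false); }

    /** instructionName would select per-instruction statistics; it is unused here. */
    void free(String instructionName, long handle, boolean eager) {
        if (eager) {
            freed.add(handle);   // released immediately
        } else {
            pending.add(handle); // released on the next flush
        }
    }

    /** Performs all deferred frees in the order they were requested. */
    void flush() {
        while (!pending.isEmpty()) {
            freed.add(pending.poll());
        }
    }

    List<Long> freed() { return freed; }

    int pendingCount() { return pending.size(); }

    public static void main(String[] args) {
        FreeHelperSketch helper = new FreeHelperSketch();
        helper.free(1L);       // lazy by default
        helper.free(2L, true); // eager
        System.out.println(helper.freed() + " pending=" + helper.pendingCount());
        helper.flush();
        System.out.println(helper.freed() + " pending=" + helper.pendingCount());
    }
}
```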
      • debugString

        public static String debugString(jcuda.Pointer A,
                                         long rows,
                                         long cols)
                                  throws DMLRuntimeException
        Copies the double array from GPU memory to host memory and returns it as a string.
        Parameters:
        A - Pointer to memory on device (GPU), assumed to point to a double array
        rows - rows in matrix A
        cols - columns in matrix A
        Returns:
        the debug string
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
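        Once the array is on the host, building a debug string is a matter of formatting rows x cols values. The sketch below shows one possible layout, one matrix row per line with values separated by spaces. DebugStringSketch is a hypothetical name, the row-major reading of the array and the output format are assumptions, and the device-to-host copy is omitted.

```java
// Hypothetical host-side formatter; the real method first copies from the GPU.
final class DebugStringSketch {
    /** Formats a row-major rows x cols array, one matrix row per line. */
    static String debugString(double[] a, int rows, int cols) {
        if (a.length < (long) rows * cols) {
            throw new IllegalArgumentException("array shorter than rows * cols");
        }
        StringBuilder sb = new StringBuilder();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (c > 0) sb.append(' ');
                sb.append(a[r * cols + c]);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(debugString(new double[] { 1, 2, 3, 4 }, 2, 2));
    }
}
```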

Copyright © 2017 The Apache Software Foundation. All rights reserved.