org.apache.sysml.runtime.instructions.gpu.context

Class GPUContext



  • public class GPUContext
    extends Object
    Represents a context per GPU accessible through the same JVM. Each context holds cublas, cusparse, cudnn, and other handles, which are separate for each GPU.
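    A minimal usage sketch in Java follows; the ExecutionContext variable ec and the device number 0 are assumptions for illustration (see initializeThread() and the handle getters below).

      GPUContext gCtx = ec.getGPUContext(0);                      // obtain the context assigned to device 0 (ec: an ExecutionContext, assumed; throws DMLRuntimeException)
      gCtx.initializeThread();                                    // bind the device to the calling thread before issuing CUDA calls
      jcuda.jcublas.cublasHandle cublas = gCtx.getCublasHandle(); // per-GPU handle used for BLAS operations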
    • Field Detail

      • LOG

        protected static final org.apache.commons.logging.Log LOG
      • GPU_MEMORY_UTILIZATION_FACTOR

        public double GPU_MEMORY_UTILIZATION_FACTOR
    • Method Detail

      • cudaGetDevice

        public static int cudaGetDevice()
        Returns which device is currently being used.
        Returns:
        the current device for the calling host thread
      • getDeviceNum

        public int getDeviceNum()
        Returns which device is assigned to this GPUContext instance.
        Returns:
        active device assigned to this GPUContext instance
      • initializeThread

        public void initializeThread()
                              throws DMLRuntimeException
        Sets the device for the calling thread. This method must be called after ExecutionContext.getGPUContext(int). In a multi-threaded environment such as parfor, this method must be called from the appropriate thread.
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
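        For example, in a multi-threaded setting such as parfor, a per-worker initialization sketch might look like the following; the ExecutionContext variable ec, the device number 0, and the surrounding worker setup are assumptions for illustration.

          Runnable gpuWorker = () -> {
            try {
              GPUContext gCtx = ec.getGPUContext(0); // obtain the context first ...
              gCtx.initializeThread();               // ... then bind the device to this worker thread
              // issue GPU instructions from this thread only
            } catch (DMLRuntimeException e) {
              throw new RuntimeException(e);
            }
          };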
      • allocate

        public jcuda.Pointer allocate(String instructionName,
                                      long size)
                               throws DMLRuntimeException
        Convenience method for allocate(String, long, int); defaults statsCount to 1.
        Parameters:
        instructionName - name of the instruction for which to record per-instruction performance statistics, or null to skip recording
        size - size of data (in bytes) to allocate
        Returns:
        jcuda pointer
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
      • allocate

        public jcuda.Pointer allocate(String instructionName,
                                      long size,
                                      int statsCount)
                               throws DMLRuntimeException
        Allocates temporary space on the device. Does not update bookkeeping. The caller is responsible for freeing the memory after use.
        Parameters:
        instructionName - name of the instruction for which to record per-instruction performance statistics, or null to skip recording
        size - Size of data (in bytes) to allocate
        statsCount - amount to increment the cudaAllocCount by
        Returns:
        jcuda Pointer
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
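        A sketch of a typical allocate/free pairing for temporary device memory; the size and instruction name are illustrative, and cudaFreeHelper(jcuda.Pointer) is documented below.

          long sizeInBytes = 1000L * jcuda.Sizeof.DOUBLE;                 // scratch space for 1000 doubles
          jcuda.Pointer tmp = gCtx.allocate("exampleInst", sizeInBytes);  // throws DMLRuntimeException
          try {
            // ... use tmp as temporary device memory ...
          } finally {
            gCtx.cudaFreeHelper(tmp);                                     // caller must free temporary space
          }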
      • cudaFreeHelper

        public void cudaFreeHelper(jcuda.Pointer toFree)
        Performs a lazy cudaFree call.
        Parameters:
        toFree - Pointer instance to be freed
      • cudaFreeHelper

        public void cudaFreeHelper(jcuda.Pointer toFree,
                                   boolean eager)
        Performs a lazy or eager cudaFree call, depending on the eager flag.
        Parameters:
        toFree - Pointer instance to be freed
        eager - true if the free should be performed eagerly
      • cudaFreeHelper

        public void cudaFreeHelper(String instructionName,
                                   jcuda.Pointer toFree)
        Performs a lazy cudaFree call.
        Parameters:
        instructionName - name of the instruction for which to record per-instruction free time, or null to skip recording
        toFree - Pointer instance to be freed
      • cudaFreeHelper

        public void cudaFreeHelper(String instructionName,
                                   jcuda.Pointer toFree,
                                   boolean eager)
        Performs a lazy or eager cudaFree call, depending on the eager flag.
        Parameters:
        instructionName - name of the instruction for which to record per-instruction free time, or null to skip recording
        toFree - Pointer instance to be freed
        eager - true if the free should be performed eagerly
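        For instance, a caller might free lazily by default and eagerly only when device memory must be returned immediately (the pointers and instruction name are illustrative):

          gCtx.cudaFreeHelper("exampleInst", tmpA, false); // lazy: the pointer may be recycled by a later allocate
          gCtx.cudaFreeHelper("exampleInst", tmpB, true);  // eager: cudaFree is issued right away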
      • evict

        protected void evict(String instructionName,
                             long neededSize)
                      throws DMLRuntimeException
        Attempts to free up GPU memory until a chunk of the needed size is available, or fails if it cannot. First, the set of reusable blocks is freed. If that is not enough, the set of allocated matrix blocks with zero locks on them is freed. The process cycles through the list of allocated GPUObject instances, sorted in reverse order by the number of (read) locks obtained on them, and repeatedly frees blocks with zero locks until the required size has been freed up. (TODO: update with a hybrid policy.)
        Parameters:
        instructionName - name of the instruction for which performance measurements are made
        neededSize - desired size to be freed up on the GPU
        Throws:
        DMLRuntimeException - if there are no reusable memory blocks to free up, or not enough matrix blocks with zero locks on them
      • isBlockRecorded

        public boolean isBlockRecorded(GPUObject o)
        Whether the GPU associated with this GPUContext has recorded the usage of a certain block.
        Parameters:
        o - the block
        Returns:
        true if present, false otherwise
      • getAvailableMemory

        public long getAvailableMemory()
        Gets the available memory on the GPU that SystemML can use.
        Returns:
        the available memory in bytes
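        As an illustration, the estimate can be used to guard a large temporary allocation before falling back to a CPU code path; the threshold and the fallback are assumptions.

          long needed = 64L * 1024 * 1024;                             // 64 MB, illustrative
          if (gCtx.getAvailableMemory() >= needed) {
            jcuda.Pointer scratch = gCtx.allocate("exampleInst", needed);
            // ... use scratch on the device, then free it via cudaFreeHelper ...
          } else {
            // fall back to a CPU implementation (illustrative)
          }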
      • ensureComputeCapability

        public void ensureComputeCapability()
                                     throws DMLRuntimeException
        Ensures that the GPU SystemML is trying to use has the minimum required compute capability.
        Throws:
        DMLRuntimeException - if the compute capability is less than what is required
      • createGPUObject

        public GPUObject createGPUObject(org.apache.sysml.runtime.controlprogram.caching.MatrixObject mo)
        Instantiates a new GPUObject initialized with the given MatrixObject.
        Parameters:
        mo - a MatrixObject that represents a matrix
        Returns:
        a new GPUObject instance
      • getGPUProperties

        public jcuda.runtime.cudaDeviceProp getGPUProperties()
                                                      throws DMLRuntimeException
        Gets the device properties for the active GPU (set with cudaSetDevice()).
        Returns:
        the device properties
        Throws:
        DMLRuntimeException - if the device properties could not be obtained
      • getMaxThreadsPerBlock

        public int getMaxThreadsPerBlock()
                                  throws DMLRuntimeException
        Gets the maximum number of threads per block for the active GPU.
        Returns:
        the maximum number of threads per block
        Throws:
        DMLRuntimeException - if the device properties could not be obtained
      • getMaxBlocks

        public int getMaxBlocks()
                         throws DMLRuntimeException
        Gets the maximum number of blocks supported by the active CUDA device.
        Returns:
        the maximum number of blocks supported
        Throws:
        DMLRuntimeException - if the device properties could not be obtained
      • getMaxSharedMemory

        public long getMaxSharedMemory()
                                throws DMLRuntimeException
        Gets the shared memory per block supported by the active CUDA device.
        Returns:
        the shared memory per block
        Throws:
        DMLRuntimeException - if the device properties could not be obtained
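        As an illustration, these limits can be combined to derive a launch configuration for a custom kernel; the element count and the clamping policy below are assumptions, not SystemML's actual heuristic.

          int n = 1_000_000;                                    // number of elements to process (illustrative)
          int threadsPerBlock = gCtx.getMaxThreadsPerBlock();   // e.g. 1024 on recent devices
          int blocks = (int) Math.min((long) gCtx.getMaxBlocks(),
                                      ((long) n + threadsPerBlock - 1) / threadsPerBlock);
          long maxSharedMem = gCtx.getMaxSharedMemory();        // upper bound for dynamic shared memory per block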
      • getCudnnHandle

        public jcuda.jcudnn.cudnnHandle getCudnnHandle()
        Returns the cudnnHandle for Deep Neural Network operations on the GPU.
        Returns:
        cudnnHandle for current thread
      • getCublasHandle

        public jcuda.jcublas.cublasHandle getCublasHandle()
        Returns the cublasHandle for BLAS operations on the GPU.
        Returns:
        cublasHandle for current thread
      • getCusparseHandle

        public jcuda.jcusparse.cusparseHandle getCusparseHandle()
        Returns the cusparseHandle for certain sparse BLAS operations on the GPU.
        Returns:
        cusparseHandle for current thread
      • getCusolverDnHandle

        public jcuda.jcusolver.cusolverDnHandle getCusolverDnHandle()
        Returns the cusolverDnHandle for invoking the solve() function on dense matrices on the GPU.
        Returns:
        cusolverDnHandle for current thread
      • getCusolverSpHandle

        public jcuda.jcusolver.cusolverSpHandle getCusolverSpHandle()
        Returns the cusolverSpHandle for invoking the solve() function on sparse matrices on the GPU.
        Returns:
        cusolverSpHandle for current thread
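        A hedged sketch of using the cuBLAS handle with JCuda's bindings for a dense matrix multiply; the device pointers dA, dB, dC and the dimensions m, n, k are assumptions (column-major matrices already allocated on this context's device).

          jcuda.jcublas.cublasHandle handle = gCtx.getCublasHandle();
          jcuda.Pointer alpha = jcuda.Pointer.to(new double[]{ 1.0 });
          jcuda.Pointer beta  = jcuda.Pointer.to(new double[]{ 0.0 });
          // C (m x n) = alpha * A (m x k) * B (k x n) + beta * C, all column-major
          jcuda.jcublas.JCublas2.cublasDgemm(handle,
              jcuda.jcublas.cublasOperation.CUBLAS_OP_N,
              jcuda.jcublas.cublasOperation.CUBLAS_OP_N,
              m, n, k, alpha, dA, m, dB, k, beta, dC, m);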
      • getKernels

        public JCudaKernels getKernels()
        Returns the utility class used to launch custom CUDA kernels, specific to the active GPU for this GPUContext.
        Returns:
        JCudaKernels for current thread
      • clearMemory

        public void clearMemory()
                         throws DMLRuntimeException
        Clears all memory used by this GPUContext. Be careful to ensure that no temporary memory is still in use before invoking this. If memory is being used between MLContext invocations, it is pointed to by a GPUObject instance, which is part of the MatrixObject; the cleanup of that MatrixObject instance frees the memory associated with that block on the GPU.
        Throws:
        DMLRuntimeException - if an error occurs while freeing GPU memory
      • clearTemporaryMemory

        public void clearTemporaryMemory()
        Clears up the memory used to optimize cudaMalloc/cudaFree calls.
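        One plausible cleanup sequence at the end of a script, assuming no GPUObject still references device memory, is sketched below.

          gCtx.clearTemporaryMemory(); // drop the pools used to optimize cudaMalloc/cudaFree
          gCtx.clearMemory();          // then release everything still held by this context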

Copyright © 2017 The Apache Software Foundation. All rights reserved.