public class GPUObject extends Object
Modifier and Type | Field and Description |
---|---|
protected boolean | dirty: whether the block attached to this GPUContext is dirty on the device and needs to be copied back to host |
protected boolean | isSparse: whether this block is in sparse format |
protected AtomicLong | locks: number of read/write locks on this object (this GPUObject is being used in a current instruction) |
protected org.apache.sysml.runtime.controlprogram.caching.MatrixObject | mat: enclosing MatrixObject instance |
Modifier and Type | Method and Description |
---|---|
boolean | acquireDeviceModifyDense() |
boolean | acquireDeviceModifySparse() |
boolean | acquireDeviceRead() |
boolean | acquireHostRead(): if the data is allocated on the GPU and is dirty, it is copied back to host memory |
void | addLock() |
void | allocateAndFillDense(double v): allocates a dense matrix whose size is taken from the attached matrix metadata and fills it with a single value |
void | allocateSparseAndEmpty(): allocates a sparse and empty GPUObject. This is the result of operations that are both non zero matrices. |
jcuda.jcudnn.cudnnTensorDescriptor | allocateTensorDescriptor(int N, int C, int H, int W): returns a previously allocated tensor descriptor, or allocates and returns a new one |
void | clearData(): lazily clears the data associated with this GPUObject instance |
void | clearData(boolean eager): clears the data associated with this GPUObject instance |
Object | clone() |
static CSRPointer | columnMajorDenseToRowMajorSparse(GPUContext gCtx, jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.Pointer densePtr, int rows, int cols): convenience method to convert a dense matrix to a CSR (sparse) matrix on the GPU. Since the allocated matrix is temporary, bookkeeping is not updated. |
protected void | copyFromDeviceToHost() |
static String | debugString(jcuda.Pointer A, long rows, long cols): copies the double array from GPU memory to host memory and returns it as a string |
void | denseColumnMajorToRowMajor(): convenience method that converts this dense matrix from column-major to row-major layout |
void | denseRowMajorToColumnMajor(): convenience method that converts this dense matrix from row-major to column-major layout |
void | denseToSparse(): converts this GPUObject from dense to sparse format |
jcuda.Pointer | getJcudaDenseMatrixPtr(): pointer to dense matrix |
CSRPointer | getJcudaSparseMatrixPtr(): pointer to sparse matrix |
protected long | getSizeOnDevice() |
CSRPointer | getSparseMatrixCudaPointer(): convenience method to directly examine the sparse matrix on the GPU |
jcuda.jcudnn.cudnnTensorDescriptor | getTensorDescriptor(): returns a previously allocated tensor descriptor or null |
int[] | getTensorShape(): returns a previously allocated tensor shape or null |
boolean | isAllocated() |
boolean | isDirty(): whether this block is dirty on the GPU |
boolean | isInputAllocated() |
boolean | isSparse() |
boolean | isSparseAndEmpty(): whether this GPUObject is sparse and empty. Being allocated is a prerequisite to being sparse and empty. |
void | releaseInput(): releases input allocated on the GPU |
void | releaseOutput(): releases output allocated on the GPU |
void | setDenseMatrixCudaPointer(jcuda.Pointer densePtr): convenience method to directly set the dense matrix pointer on the GPU |
void | setSparseMatrixCudaPointer(CSRPointer sparseMatrixPtr): convenience method to directly set the sparse matrix on the GPU. Needed for operations like JCusparse.cusparseDcsrgemm(cusparseHandle, int, int, int, int, int, cusparseMatDescr, int, Pointer, Pointer, Pointer, cusparseMatDescr, int, Pointer, Pointer, Pointer, cusparseMatDescr, Pointer, Pointer, Pointer) |
void | sparseToColumnMajorDense(): more efficient method to convert sparse to dense, but returns the dense matrix in column-major format |
void | sparseToDense(): converts sparse to dense (performs a transpose; use sparseToColumnMajorDense if the kernel can deal with column-major format) |
void | sparseToDense(String instructionName): converts sparse to dense (performs a transpose; use sparseToColumnMajorDense if the kernel can deal with column-major format). Also records per-instruction invocations of sparseToDense. |
static int | toIntExact(long l) |
String | toString() |
static jcuda.Pointer | transpose(GPUContext gCtx, jcuda.Pointer densePtr, int m, int n, int lda, int ldc): transposes a dense matrix on the GPU by calling the cublasDgeam operation |
protected boolean dirty
Whether the block attached to this GPUContext is dirty on the device and needs to be copied back to host
protected AtomicLong locks
Number of read/write locks on this object (this GPUObject is being used in a current instruction)
protected boolean isSparse
Whether this block is in sparse format
protected org.apache.sysml.runtime.controlprogram.caching.MatrixObject mat
Enclosing MatrixObject instance
public static jcuda.Pointer transpose(GPUContext gCtx, jcuda.Pointer densePtr, int m, int n, int lda, int ldc) throws DMLRuntimeException
gCtx - a valid GPUContext
densePtr - pointer to dense matrix on the GPU
m - rows in output matrix
n - columns in output matrix
lda - rows in input matrix
ldc - columns in output matrix
DMLRuntimeException - if operation failed
public static CSRPointer columnMajorDenseToRowMajorSparse(GPUContext gCtx, jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.Pointer densePtr, int rows, int cols) throws DMLRuntimeException
gCtx - a valid GPUContext
cusparseHandle - handle to cusparse library
densePtr - [in] dense matrix pointer on the GPU in row major
rows - number of rows
cols - number of columns
DMLRuntimeException - if DMLRuntimeException occurs
public static String debugString(jcuda.Pointer A, long rows, long cols) throws DMLRuntimeException
A - pointer to memory on device (GPU), assumed to point to a double array
rows - rows in matrix A
cols - columns in matrix A
DMLRuntimeException - if DMLRuntimeException occurs
public CSRPointer getSparseMatrixCudaPointer()
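The CSR (compressed sparse row) layout that columnMajorDenseToRowMajorSparse produces and getSparseMatrixCudaPointer exposes can be pictured with a small host-side sketch. This is not SystemML or JCuda code: the class CsrSketch, its fields values, colInd, and rowPtr, and the choice of a row-major input array are all assumptions made for illustration, and the internals of CSRPointer are not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical host-side illustration of the CSR format; not part of SystemML.
final class CsrSketch {
    final double[] values; // non-zero values, stored row by row
    final int[] colInd;    // column index of each stored value
    final int[] rowPtr;    // row i occupies values[rowPtr[i] .. rowPtr[i + 1] - 1]

    private CsrSketch(double[] values, int[] colInd, int[] rowPtr) {
        this.values = values;
        this.colInd = colInd;
        this.rowPtr = rowPtr;
    }

    // Builds the three CSR arrays from a row-major dense array of size rows * cols.
    static CsrSketch fromRowMajorDense(double[] dense, int rows, int cols) {
        if (dense.length != rows * cols) {
            throw new IllegalArgumentException("dense.length must equal rows * cols");
        }
        List<Double> vals = new ArrayList<>();
        List<Integer> idx = new ArrayList<>();
        int[] rowPtr = new int[rows + 1];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double v = dense[i * cols + j];
                if (v != 0.0) {
                    vals.add(v);
                    idx.add(j);
                }
            }
            // Number of non-zeros seen so far marks the end of row i.
            rowPtr[i + 1] = vals.size();
        }
        double[] values = new double[vals.size()];
        int[] colInd = new int[idx.size()];
        for (int k = 0; k < values.length; k++) {
            values[k] = vals.get(k);
            colInd[k] = idx.get(k);
        }
        return new CsrSketch(values, colInd, rowPtr);
    }
}
```

For the 2 x 3 matrix with rows (1, 0, 2) and (0, 0, 3), the sketch stores three values, their column indices 0, 2, 2, and the row pointer array 0, 2, 3.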
public void setSparseMatrixCudaPointer(CSRPointer sparseMatrixPtr) throws DMLRuntimeException
Convenience method to directly set the sparse matrix on the GPU. Needed for operations like JCusparse.cusparseDcsrgemm(cusparseHandle, int, int, int, int, int, cusparseMatDescr, int, Pointer, Pointer, Pointer, cusparseMatDescr, int, Pointer, Pointer, Pointer, cusparseMatDescr, Pointer, Pointer, Pointer)
sparseMatrixPtr - CSR (compressed sparse row) pointer
DMLRuntimeException - ?
public void setDenseMatrixCudaPointer(jcuda.Pointer densePtr) throws DMLRuntimeException
densePtr - dense pointer
DMLRuntimeException - ?
public void denseToSparse() throws DMLRuntimeException
DMLRuntimeException - if DMLRuntimeException occurs
public void denseRowMajorToColumnMajor() throws DMLRuntimeException
DMLRuntimeException - if DMLRuntimeException occurs
public void denseColumnMajorToRowMajor() throws DMLRuntimeException
DMLRuntimeException - if error
public void sparseToDense() throws DMLRuntimeException
DMLRuntimeException - if DMLRuntimeException occurs
public void sparseToDense(String instructionName) throws DMLRuntimeException
instructionName - name of the instruction for which statistics are recorded in GPUStatistics
DMLRuntimeException - ?
public void sparseToColumnMajorDense() throws DMLRuntimeException
DMLRuntimeException - if DMLRuntimeException occurs
public boolean isSparse()
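The difference between the row-major and column-major dense layouts handled by denseRowMajorToColumnMajor and denseColumnMajorToRowMajor can be shown on plain host arrays. The following is only a sketch of the index arithmetic, under the assumption that a dense matrix is a flat double array; the class LayoutSketch and its method names are hypothetical and do not reflect how GPUObject performs the conversion on the device.

```java
// Hypothetical host-side illustration of dense layouts; not part of SystemML.
final class LayoutSketch {

    // Element (i, j) sits at i * cols + j in row-major
    // and at j * rows + i in column-major.
    static double[] rowMajorToColumnMajor(double[] a, int rows, int cols) {
        double[] out = new double[a.length];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                out[j * rows + i] = a[i * cols + j];
            }
        }
        return out;
    }

    // Inverse of rowMajorToColumnMajor for the same rows and cols.
    static double[] columnMajorToRowMajor(double[] a, int rows, int cols) {
        double[] out = new double[a.length];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                out[i * cols + j] = a[j * rows + i];
            }
        }
        return out;
    }
}
```

Changing the layout moves data in the same way as transposing the flat array, which is consistent with the note that sparseToDense performs a transpose while sparseToColumnMajorDense avoids it.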
public int[] getTensorShape()
public jcuda.jcudnn.cudnnTensorDescriptor getTensorDescriptor()
public jcuda.jcudnn.cudnnTensorDescriptor allocateTensorDescriptor(int N, int C, int H, int W)
N - number of images
C - number of channels
H - height
W - width
public boolean isAllocated()
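To make the four dimensions passed to allocateTensorDescriptor concrete, the sketch below computes where one element of an N x C x H x W tensor would sit in a flat buffer. It assumes the NCHW ordering (images, then channels, then rows, then columns), which is an assumption and is not stated on this page; NchwSketch and its methods are hypothetical helpers, not SystemML or cuDNN API.

```java
// Hypothetical illustration of 4-D tensor indexing; assumes NCHW ordering.
final class NchwSketch {

    // Total number of elements described by the shape (N, C, H, W).
    static long numElements(int N, int C, int H, int W) {
        return (long) N * C * H * W;
    }

    // Flat offset of element (n, c, h, w) in a tensor with C channels,
    // height H and width W, laid out image by image, channel by channel.
    static long offset(int n, int c, int h, int w, int C, int H, int W) {
        return (((long) n * C + c) * H + h) * W + w;
    }
}
```

With this ordering the last element of a 2 x 3 x 4 x 5 tensor, at (1, 2, 3, 4), has offset 119, one less than the 120 elements in total.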
public boolean isInputAllocated()
public void allocateSparseAndEmpty() throws DMLRuntimeException
Allocates a sparse and empty GPUObject. This is the result of operations that are both non zero matrices.
DMLRuntimeException - if DMLRuntimeException occurs
public void allocateAndFillDense(double v) throws DMLRuntimeException
v - value to fill up the dense matrix
DMLRuntimeException - if DMLRuntimeException occurs
public boolean isSparseAndEmpty() throws DMLRuntimeException
Whether this GPUObject is sparse and empty. Being allocated is a prerequisite to being sparse and empty.
DMLRuntimeException - if error
public boolean acquireDeviceRead() throws DMLRuntimeException
DMLRuntimeException
public boolean acquireDeviceModifyDense() throws DMLRuntimeException
DMLRuntimeException
public boolean acquireDeviceModifySparse() throws DMLRuntimeException
DMLRuntimeException
public void addLock()
public boolean acquireHostRead() throws org.apache.sysml.runtime.controlprogram.caching.CacheException
org.apache.sysml.runtime.controlprogram.caching.CacheException - ?
public void releaseInput() throws DMLRuntimeException
DMLRuntimeException - if data is not allocated, if there is no locked GPUObject, or if a GPUContext could not be obtained
public void releaseOutput() throws DMLRuntimeException
DMLRuntimeException - if data is not allocated, if there is no locked GPUObject, or if a GPUContext could not be obtained
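The locks field, addLock, and the release methods together describe a counter of in-flight uses that must not be released more often than it was acquired. A minimal sketch of that idea follows, assuming nothing about the real implementation beyond what this page states: LockCountSketch and its methods are hypothetical, and IllegalStateException stands in for DMLRuntimeException.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of an AtomicLong use counter; not SystemML code.
final class LockCountSketch {
    private final AtomicLong locks = new AtomicLong(0);

    // Records that one more instruction is using the object.
    void addLock() {
        locks.incrementAndGet();
    }

    // Gives up one lock; fails if nothing is currently locked.
    void release() {
        // Decrement only while the count is positive, so it never goes negative.
        long previous = locks.getAndUpdate(n -> n > 0 ? n - 1 : n);
        if (previous <= 0) {
            throw new IllegalStateException("no locked object to release");
        }
    }

    boolean isLocked() {
        return locks.get() > 0;
    }
}
```

Using an atomic update keeps the check and the decrement in one step, so two threads releasing at the same time cannot drive the counter below zero.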
protected long getSizeOnDevice() throws DMLRuntimeException
DMLRuntimeException
public static int toIntExact(long l) throws DMLRuntimeException
DMLRuntimeException
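The signature of toIntExact(long) suggests a checked narrowing from long to int that fails instead of silently overflowing. The sketch below shows one way such a conversion can be written. It is an assumption about the behavior, not the SystemML source: NarrowingSketch is hypothetical, and its unchecked SketchRuntimeException merely stands in for DMLRuntimeException.

```java
// Hypothetical illustration of a checked long-to-int conversion; not SystemML code.
final class NarrowingSketch {

    // Stand-in for DMLRuntimeException, made unchecked to keep the sketch short.
    static final class SketchRuntimeException extends RuntimeException {
        SketchRuntimeException(String message) {
            super(message);
        }
    }

    // Returns l as an int, or throws if l does not fit in 32 bits.
    static int toIntExact(long l) {
        if (l < Integer.MIN_VALUE || l > Integer.MAX_VALUE) {
            throw new SketchRuntimeException("Cannot be cast to int: " + l);
        }
        return (int) l;
    }
}
```

A guard of this kind matters when long row or column counts are passed to library calls that take int arguments, such as the int rows and int cols parameters listed above.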
protected void copyFromDeviceToHost() throws DMLRuntimeException
DMLRuntimeException
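The dirty flag, acquireHostRead, and copyFromDeviceToHost point to a copy-back-on-demand pattern: data modified on the device is copied to the host only when the host asks to read it. The sketch below models that pattern with two plain arrays. Everything in it is an assumption for illustration, including the class DirtyFlagSketch, the writeOnDevice helper, and the meaning given to the boolean return value, which this page does not document.

```java
// Hypothetical model of a dirty flag with copy-back on host read; not SystemML code.
final class DirtyFlagSketch {
    private final double[] host;
    private final double[] device; // stand-in for GPU memory
    private boolean dirty = false;

    DirtyFlagSketch(int size) {
        this.host = new double[size];
        this.device = new double[size];
    }

    // A device-side write makes the device copy newer than the host copy.
    void writeOnDevice(int index, double value) {
        device[index] = value;
        dirty = true;
    }

    // Copies device data back only if it is dirty.
    // Returning true when a copy happened is an assumption of this sketch.
    boolean acquireHostRead() {
        if (!dirty) {
            return false;
        }
        System.arraycopy(device, 0, host, 0, device.length);
        dirty = false;
        return true;
    }

    boolean isDirty() {
        return dirty;
    }

    double hostValue(int index) {
        return host[index];
    }
}
```

Deferring the copy this way avoids a device-to-host transfer after every instruction when several GPU instructions run back to back on the same block.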
public void clearData() throws DMLRuntimeException
Lazily clears the data associated with this GPUObject instance
org.apache.sysml.runtime.controlprogram.caching.CacheException - ?
DMLRuntimeException
public void clearData(boolean eager) throws DMLRuntimeException
Clears the data associated with this GPUObject instance
eager - whether to be done synchronously or asynchronously
org.apache.sysml.runtime.controlprogram.caching.CacheException - ?
DMLRuntimeException
public jcuda.Pointer getJcudaDenseMatrixPtr()
public CSRPointer getJcudaSparseMatrixPtr()
public boolean isDirty()
Copyright © 2017 The Apache Software Foundation. All rights reserved.