org.apache.sysml.runtime.matrix.data

Class FrameBlock

  • All Implemented Interfaces:
    Externalizable, Serializable, org.apache.hadoop.io.Writable, org.apache.sysml.runtime.controlprogram.caching.CacheBlock


    public class FrameBlock
    extends Object
    implements org.apache.hadoop.io.Writable, org.apache.sysml.runtime.controlprogram.caching.CacheBlock, Externalizable
    See Also:
    Serialized Form
    • Constructor Detail

      • FrameBlock

        public FrameBlock()
      • FrameBlock

        public FrameBlock(FrameBlock that)
        Copy constructor for frame blocks, which uses a shallow copy for the schema (column types and names) but a deep copy for metadata and actual column data.
        Parameters:
        that - frame block
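        The copy semantics described above can be illustrated with a small stand-alone sketch. MiniFrame is a hypothetical stand-in, not the FrameBlock implementation; it only mirrors the documented behavior, in which the schema is shared between original and copy while the column data is duplicated.

```java
import java.util.Arrays;

// Hypothetical stand-in for a frame; NOT the actual FrameBlock implementation.
class MiniFrame {
    final String[] schema;    // column types, shared on copy (shallow)
    final String[][] columns; // column-major data, duplicated on copy (deep)

    MiniFrame(String[] schema, String[][] columns) {
        this.schema = schema;
        this.columns = columns;
    }

    // Copy constructor: shallow copy of the schema, deep copy of the column data.
    MiniFrame(MiniFrame that) {
        this.schema = that.schema; // same array instance
        this.columns = new String[that.columns.length][];
        for (int c = 0; c < that.columns.length; c++)
            this.columns[c] = Arrays.copyOf(that.columns[c], that.columns[c].length);
    }
}
```

        In this sketch, writing to a column of the copy leaves the original untouched, whereas a change to the shared schema array would be visible through both objects.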
      • FrameBlock

        public FrameBlock(int ncols,
                          org.apache.sysml.parser.Expression.ValueType vt)
      • FrameBlock

        public FrameBlock(org.apache.sysml.parser.Expression.ValueType[] schema)
      • FrameBlock

        public FrameBlock(org.apache.sysml.parser.Expression.ValueType[] schema,
                          String[] names)
      • FrameBlock

        public FrameBlock(org.apache.sysml.parser.Expression.ValueType[] schema,
                          String[][] data)
      • FrameBlock

        public FrameBlock(org.apache.sysml.parser.Expression.ValueType[] schema,
                          String[] names,
                          String[][] data)
    • Method Detail

      • getNumRows

        public int getNumRows()
        Get the number of rows of the frame block.
        Specified by:
        getNumRows in interface org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Returns:
        number of rows
      • setNumRows

        public void setNumRows(int numRows)
      • getNumColumns

        public int getNumColumns()
        Get the number of columns of the frame block, that is, the number of columns defined in the schema.
        Specified by:
        getNumColumns in interface org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Returns:
        number of columns
      • getSchema

        public org.apache.sysml.parser.Expression.ValueType[] getSchema()
        Returns the schema of the frame block.
        Returns:
        schema as array of ValueTypes
      • setSchema

        public void setSchema(org.apache.sysml.parser.Expression.ValueType[] schema)
        Sets the schema of the frame block.
        Parameters:
        schema - schema as array of ValueTypes
      • getColumnNames

        public String[] getColumnNames()
        Returns the column names of the frame block. This method allocates default column names if required.
        Returns:
        column names
      • getColumnNames

        public String[] getColumnNames(boolean alloc)
        Returns the column names of the frame block. If alloc is true, this method allocates default column names if required.
        Parameters:
        alloc - if true, create column names
        Returns:
        array of column names
      • getColumnName

        public String getColumnName(int c)
        Returns the column name for the requested column. This method allocates default column names if required.
        Parameters:
        c - column index
        Returns:
        column name
      • setColumnNames

        public void setColumnNames(String[] colnames)
      • isColumnMetadataDefault

        public boolean isColumnMetadataDefault()
      • isColumnMetadataDefault

        public boolean isColumnMetadataDefault(int c)
      • getColumnNameIDMap

        public Map<String,Integer> getColumnNameIDMap()
        Creates a mapping from column names to column IDs, i.e., 1-based column indexes.
        Returns:
        map of column name keys and id values
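        A minimal sketch of such a name-to-ID mapping follows. ColumnNameIds and nameToId are hypothetical names used for illustration only; the sketch shows the 1-based numbering described above, not the library's code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; illustrates the 1-based numbering only.
class ColumnNameIds {
    // Maps each column name to its 1-based position in the given array.
    static Map<String, Integer> nameToId(String[] names) {
        Map<String, Integer> ids = new HashMap<>();
        for (int i = 0; i < names.length; i++)
            ids.put(names[i], i + 1); // array index i is 0-based, the ID is 1-based
        return ids;
    }
}
```

        Since get(r,c) documents a 0-based column index, an ID taken from such a map would presumably need to be decremented by one before it is used as a column index there.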
      • ensureAllocatedColumns

        public void ensureAllocatedColumns(int numRows)
        Allocate column data structures if necessary, i.e., if schema specified but not all column data structures created yet.
        Parameters:
        numRows - number of rows
      • ensureColumnCompatibility

        public void ensureColumnCompatibility(int newlen)
        Checks for matching column sizes in case of existing columns.
        Parameters:
        newlen - number of rows to compare with existing number of rows
      • createColNames

        public static String[] createColNames(int size)
      • createColNames

        public static String[] createColNames(int off,
                                              int size)
      • createColName

        public static String createColName(int i)
      • isColNamesDefault

        public boolean isColNamesDefault()
      • isColNameDefault

        public boolean isColNameDefault(int i)
      • recomputeColumnCardinality

        public void recomputeColumnCardinality()
      • get

        public Object get(int r,
                          int c)
        Gets a boxed object of the value in position (r,c).
        Parameters:
        r - row index, 0-based
        c - column index, 0-based
        Returns:
        object of the value at specified position
      • set

        public void set(int r,
                        int c,
                        Object val)
        Sets the value in position (r,c), where the input is assumed to be a boxed object consistent with the schema definition.
        Parameters:
        r - row index
        c - column index
        val - value to set at specified position
      • reset

        public void reset(int nrow,
                          boolean clearMeta)
      • reset

        public void reset()
      • appendRow

        public void appendRow(Object[] row)
        Append a row to the end of the data frame, where all row fields are boxed objects according to the schema.
        Parameters:
        row - array of objects
      • appendRow

        public void appendRow(String[] row)
        Append a row to the end of the data frame, where all row fields are string encoded.
        Parameters:
        row - array of strings
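        The difference between the two appendRow variants is that one takes boxed objects and the other takes string-encoded fields. The sketch below shows one plausible way to turn a string-encoded row into boxed objects according to a schema. StringRowDecoder and its Type enum are hypothetical; the enum only mirrors the value types named on this page (STRING, BOOLEAN, INT, DOUBLE), and the parsing rules are assumptions rather than the library's conversion logic.

```java
// Hypothetical decoder; the parsing rules are assumptions, not FrameBlock's own.
class StringRowDecoder {
    // Mirror of the value types mentioned on this page.
    enum Type { STRING, BOOLEAN, INT, DOUBLE }

    // Converts each string field into a boxed object matching its column type.
    static Object[] decode(Type[] schema, String[] row) {
        if (schema.length != row.length)
            throw new IllegalArgumentException("row length does not match schema length");
        Object[] out = new Object[row.length];
        for (int c = 0; c < row.length; c++) {
            switch (schema[c]) {
                case STRING:  out[c] = row[c]; break;
                // parseBoolean yields false for anything other than "true" (case-insensitive)
                case BOOLEAN: out[c] = Boolean.parseBoolean(row[c]); break;
                // INT columns are backed by long values on this page, hence Long
                case INT:     out[c] = Long.parseLong(row[c]); break;
                case DOUBLE:  out[c] = Double.parseDouble(row[c]); break;
            }
        }
        return out;
    }
}
```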
      • appendColumn

        public void appendColumn(String[] col)
        Append a column of value type STRING as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
        Parameters:
        col - array of strings
      • appendColumn

        public void appendColumn(boolean[] col)
        Append a column of value type BOOLEAN as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
        Parameters:
        col - array of booleans
      • appendColumn

        public void appendColumn(long[] col)
        Append a column of value type INT as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
        Parameters:
        col - array of longs
      • appendColumn

        public void appendColumn(double[] col)
        Append a column of value type DOUBLE as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
        Parameters:
        col - array of doubles
      • appendColumns

        public void appendColumns(double[][] cols)
        Append a set of columns of value type DOUBLE at the end of the frame in order to avoid repeated allocation with appendColumn. The given array is wrapped but not copied and hence might be updated in the future.
        Parameters:
        cols - 2d array of doubles
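        The "wrapped but not copied" note on the appendColumn and appendColumns methods means that the frame and the caller share the same array. The following sketch demonstrates that aliasing effect with a hypothetical WrappedColumn class; it does not use FrameBlock itself.

```java
// Hypothetical column wrapper; shows array aliasing, not FrameBlock internals.
class WrappedColumn {
    private final double[] data; // reference to the caller's array, no copy is made

    WrappedColumn(double[] data) {
        this.data = data;
    }

    double get(int r) {
        return data[r];
    }
}
```

        With this design, a later write such as col[0] = 9.0 on the caller's array is visible through the wrapper as well. A caller who needs isolation can pass col.clone() instead, at the cost of one extra allocation.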
      • getColumnData

        public Object getColumnData(int c)
      • getColumn

        public org.apache.sysml.runtime.matrix.data.FrameBlock.Array getColumn(int c)
      • setColumn

        public void setColumn(int c,
                              org.apache.sysml.runtime.matrix.data.FrameBlock.Array column)
      • getStringRowIterator

        public Iterator<String[]> getStringRowIterator()
        Get a row iterator over the frame where all fields are encoded as strings independent of their value types.
        Returns:
        string array iterator
      • getStringRowIterator

        public Iterator<String[]> getStringRowIterator(int[] cols)
        Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
        Parameters:
        cols - column selection, 1-based
        Returns:
        string array iterator
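        To make the 1-based column selection concrete, here is a stand-alone sketch of a string row iterator over column-major data. StringRowIterator is a hypothetical class; the column-major layout and the choice to return null for null cells are assumptions of this sketch, not documented behavior.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical iterator over column-major data; not the FrameBlock iterator.
class StringRowIterator implements Iterator<String[]> {
    private final Object[][] columns; // columns[c][r], c and r are 0-based
    private final int[] cols;         // selected columns, 1-based
    private final int numRows;
    private int pos = 0;

    StringRowIterator(Object[][] columns, int[] cols) {
        this.columns = columns;
        this.cols = cols;
        this.numRows = (columns.length == 0) ? 0 : columns[0].length;
    }

    @Override
    public boolean hasNext() {
        return pos < numRows;
    }

    @Override
    public String[] next() {
        if (!hasNext())
            throw new NoSuchElementException();
        String[] row = new String[cols.length];
        for (int j = 0; j < cols.length; j++) {
            Object v = columns[cols[j] - 1][pos]; // convert 1-based ID to 0-based index
            row[j] = (v == null) ? null : v.toString(); // assumption: null stays null
        }
        pos++;
        return row;
    }
}
```

        The selection also determines the order of the fields in each returned row, so a selection of {3, 1} yields the third column first.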
      • getStringRowIterator

        public Iterator<String[]> getStringRowIterator(int rl,
                                                       int ru)
        Get a row iterator over the frame where all fields are encoded as strings independent of their value types.
        Parameters:
        rl - lower row index
        ru - upper row index
        Returns:
        string array iterator
      • getStringRowIterator

        public Iterator<String[]> getStringRowIterator(int rl,
                                                       int ru,
                                                       int[] cols)
        Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
        Parameters:
        rl - lower row index
        ru - upper row index
        cols - column selection, 1-based
        Returns:
        string array iterator
      • getObjectRowIterator

        public Iterator<Object[]> getObjectRowIterator()
        Get a row iterator over the frame where all fields are encoded as boxed objects according to their value types.
        Returns:
        object array iterator
      • getObjectRowIterator

        public Iterator<Object[]> getObjectRowIterator(int[] cols)
        Get a row iterator over the frame where all selected fields are encoded as boxed objects according to their value types.
        Parameters:
        cols - column selection, 1-based
        Returns:
        object array iterator
      • getObjectRowIterator

        public Iterator<Object[]> getObjectRowIterator(int rl,
                                                       int ru)
        Get a row iterator over the frame where all fields are encoded as boxed objects according to their value types.
        Parameters:
        rl - lower row index
        ru - upper row index
        Returns:
        object array iterator
      • getObjectRowIterator

        public Iterator<Object[]> getObjectRowIterator(int rl,
                                                       int ru,
                                                       int[] cols)
        Get a row iterator over the frame where all selected fields are encoded as boxed objects according to their value types.
        Parameters:
        rl - lower row index
        ru - upper row index
        cols - column selection, 1-based
        Returns:
        object array iterator
      • readFields

        public void readFields(DataInput in)
                        throws IOException
        Specified by:
        readFields in interface org.apache.hadoop.io.Writable
        Throws:
        IOException
      • getInMemorySize

        public long getInMemorySize()
        Description copied from interface: org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Get the in-memory size in bytes of the cache block.
        Specified by:
        getInMemorySize in interface org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Returns:
        in-memory size in bytes of cache block
      • getExactSerializedSize

        public long getExactSerializedSize()
        Description copied from interface: org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Get the exact serialized size in bytes of the cache block.
        Specified by:
        getExactSerializedSize in interface org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Returns:
        exact serialized size in bytes of cache block
      • isShallowSerialize

        public boolean isShallowSerialize()
        Description copied from interface: org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Indicates if the cache block is subject to shallow serialization, which is generally true if the in-memory size and the serialized size are almost identical, so that an unnecessary deep serialization can be avoided.
        Specified by:
        isShallowSerialize in interface org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Returns:
        true if shallow serialized
      • compactEmptyBlock

        public void compactEmptyBlock()
        Description copied from interface: org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Free unnecessarily allocated empty block.
        Specified by:
        compactEmptyBlock in interface org.apache.sysml.runtime.controlprogram.caching.CacheBlock
      • sliceOperations

        public FrameBlock sliceOperations(int rl,
                                          int ru,
                                          int cl,
                                          int cu,
                                          org.apache.sysml.runtime.controlprogram.caching.CacheBlock retCache)
                                   throws DMLRuntimeException
        Right indexing operations to slice a subframe out of this frame block. Note that the existing column value types are preserved.
        Specified by:
        sliceOperations in interface org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Parameters:
        rl - row lower index, inclusive, 0-based
        ru - row upper index, inclusive, 0-based
        cl - column lower index, inclusive, 0-based
        cu - column upper index, inclusive, 0-based
        retCache - cache block
        Returns:
        frame block
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
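        Because all four bounds are inclusive and 0-based, the result has ru-rl+1 rows and cu-cl+1 columns. The sketch below applies the same index convention to a plain row-major String[][]; InclusiveSlice is a hypothetical helper and does not reproduce the library's slicing code or its retCache handling.

```java
// Hypothetical helper; demonstrates inclusive, 0-based slice bounds only.
class InclusiveSlice {
    // Returns the sub-array of rows rl..ru and columns cl..cu, all bounds inclusive.
    static String[][] slice(String[][] data, int rl, int ru, int cl, int cu) {
        String[][] out = new String[ru - rl + 1][cu - cl + 1];
        for (int r = rl; r <= ru; r++)
            for (int c = cl; c <= cu; c++)
                out[r - rl][c - cl] = data[r][c];
        return out;
    }
}
```

        For example, slice(data, 1, 2, 0, 1) on a 3x3 array returns a 2x2 array whose top-left cell is data[1][0].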
      • sliceOperations

        public void sliceOperations(ArrayList<Pair<Long,FrameBlock>> outlist,
                                    org.apache.sysml.runtime.util.IndexRange range,
                                    int rowCut)
      • appendOperations

        public FrameBlock appendOperations(FrameBlock that,
                                           FrameBlock ret,
                                           boolean cbind)
                                    throws DMLRuntimeException
        Appends the given argument frameblock 'that' to this frameblock by creating a deep copy to prevent side effects. For cbind, the frames are appended column-wise (same number of rows), while for rbind the frames are appended row-wise (same number of columns).
        Parameters:
        that - frame block to append to current frame block
        ret - frame block to return, can be null
        cbind - if true, column append
        Returns:
        frame block
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
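        The cbind and rbind modes can be sketched on plain row-major String[][] data as follows. FrameAppend is a hypothetical helper; it copies both inputs into a new result, in the spirit of the deep copy mentioned above, but it ignores schemas, column names, and the optional ret argument.

```java
import java.util.Arrays;

// Hypothetical helper; illustrates cbind versus rbind on row-major data.
class FrameAppend {
    static String[][] append(String[][] a, String[][] b, boolean cbind) {
        if (cbind) {
            // column-wise append: both inputs must have the same number of rows
            if (a.length != b.length)
                throw new IllegalArgumentException("cbind requires the same number of rows");
            String[][] out = new String[a.length][];
            for (int r = 0; r < a.length; r++) {
                out[r] = Arrays.copyOf(a[r], a[r].length + b[r].length);
                System.arraycopy(b[r], 0, out[r], a[r].length, b[r].length);
            }
            return out;
        }
        // row-wise append: both inputs must have the same number of columns
        if (a.length > 0 && b.length > 0 && a[0].length != b[0].length)
            throw new IllegalArgumentException("rbind requires the same number of columns");
        String[][] out = new String[a.length + b.length][];
        for (int r = 0; r < a.length; r++)
            out[r] = a[r].clone();
        for (int r = 0; r < b.length; r++)
            out[a.length + r] = b[r].clone();
        return out;
    }
}
```

        Since every row of the result is a fresh array, later writes to the result do not affect either input.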
      • copy

        public void copy(int rl,
                         int ru,
                         int cl,
                         int cu,
                         FrameBlock src)
      • getRecodeMap

        public HashMap<String,Long> getRecodeMap(int col)
        Splits every recode map entry in the given column at the delimiter Lop.DATATYPE_PREFIX, since the recode map was generated earlier in the form Code+Lop.DATATYPE_PREFIX+Token, and stores the result in a map that contains the token and code for every unique token.
        Parameters:
        col - index of the frame column that contains the recode map generated earlier
        Returns:
        map of token and code for every element in the input column of a frame containing Recode map
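        The splitting step can be sketched as shown below, following the entry form Code+delimiter+Token stated above. RecodeMapSketch and its DELIM constant are hypothetical: the actual value of Lop.DATATYPE_PREFIX is not given on this page, so "|" is only a placeholder, and skipping null cells is an assumption of this sketch.

```java
import java.util.HashMap;

// Hypothetical sketch of the splitting step; not the library's implementation.
class RecodeMapSketch {
    // Placeholder delimiter; the real value of Lop.DATATYPE_PREFIX is not stated here.
    static final String DELIM = "|";

    // Builds a token -> code map from entries of the form code + DELIM + token.
    static HashMap<String, Long> toMap(String[] column) {
        HashMap<String, Long> map = new HashMap<>();
        for (String entry : column) {
            if (entry == null)
                continue; // assumption: empty cells are skipped
            int pos = entry.indexOf(DELIM); // first delimiter ends the numeric code
            if (pos < 0)
                throw new IllegalArgumentException("missing delimiter in entry: " + entry);
            String code = entry.substring(0, pos);
            String token = entry.substring(pos + DELIM.length());
            map.put(token, Long.parseLong(code));
        }
        return map;
    }
}
```

        Splitting at the first delimiter keeps tokens intact even if they contain the delimiter themselves, which works here only because the numeric code comes first in the assumed entry form.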
      • merge

        public void merge(org.apache.sysml.runtime.controlprogram.caching.CacheBlock that,
                          boolean bDummy)
                   throws DMLRuntimeException
        Description copied from interface: org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Merge the given block into the current block. Both blocks need to be of equal dimensions and contain disjoint non-zero cells.
        Specified by:
        merge in interface org.apache.sysml.runtime.controlprogram.caching.CacheBlock
        Parameters:
        that - cache block
        bDummy - ?
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
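        The precondition above (equal dimensions, disjoint non-zero cells) is inherited from the CacheBlock interface. The sketch below illustrates it on plain double[][] data. DisjointMerge is a hypothetical helper, and throwing an exception on overlapping cells is a choice made for this sketch; how the library reacts to a violated precondition is not stated on this page.

```java
// Hypothetical helper; illustrates the merge precondition on numeric data.
class DisjointMerge {
    // Copies every non-zero cell of 'that' into 'target', which is modified in place.
    static void mergeInto(double[][] target, double[][] that) {
        if (target.length != that.length)
            throw new IllegalArgumentException("row count mismatch");
        for (int r = 0; r < target.length; r++) {
            if (target[r].length != that[r].length)
                throw new IllegalArgumentException("column count mismatch in row " + r);
            for (int c = 0; c < target[r].length; c++) {
                if (that[r][c] == 0)
                    continue; // nothing to merge for this cell
                if (target[r][c] != 0)
                    throw new IllegalStateException("overlapping non-zero cell at (" + r + "," + c + ")");
                target[r][c] = that[r][c];
            }
        }
    }
}
```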
      • zeroOutOperations

        public FrameBlock zeroOutOperations(FrameBlock result,
                                            org.apache.sysml.runtime.util.IndexRange range,
                                            boolean complementary,
                                            int iRowStartSrc,
                                            int iRowStartDest,
                                            int brlen,
                                            int iMaxRowsToCopy)
                                     throws DMLRuntimeException
        Zeroes out the data in the slicing window applicable to this block.
        Parameters:
        result - frame block
        range - index range
        complementary - ?
        iRowStartSrc - ?
        iRowStartDest - ?
        brlen - ?
        iMaxRowsToCopy - ?
        Returns:
        frame block
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs

Copyright © 2017 The Apache Software Foundation. All rights reserved.