org.apache.sysml.runtime.compress

Class ColGroupOffset

  • All Implemented Interfaces:
    Serializable
    Direct Known Subclasses:
    ColGroupOLE, ColGroupRLE


    public abstract class ColGroupOffset
    extends ColGroupValue
    Base class for column groups encoded with various types of bitmap encoding.
    Note (OLE): storing segment lengths and bitmaps separately led to a 30% improvement, but this was not applied because it is more difficult to support both data layouts at the same time (distributed/local, as well as with and without low-level optimizations).
    See Also:
    Serialized Form
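    To make the idea of offset-based bitmap encoding concrete, the following is a minimal, hypothetical sketch and not the SystemML implementation. It assumes a single column with fewer than 65,536 rows, stores one list of row offsets per distinct non-zero value, and concatenates all lists into one char[] with an int[] of start pointers. This is one plausible reading of what getBitmaps(), getBitmapOffsets() and len(int) expose; the class name OffsetEncodingSketch and its fields are invented for illustration, and the real OLE and RLE subclasses use more elaborate layouts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified illustration of offset-based encoding; not SystemML code. */
final class OffsetEncodingSketch {
    final double[] values; // distinct non-zero values, in ascending order
    final int[] ptr;       // ptr[k] .. ptr[k + 1] delimits the offsets of value k in data
    final char[] data;     // concatenated row offsets of all values

    OffsetEncodingSketch(double[] column) {
        // Assumption of this sketch: every row index must fit into a char.
        if (column.length > Character.MAX_VALUE + 1) {
            throw new IllegalArgumentException("sketch supports at most 65536 rows");
        }
        // Group row indices by distinct non-zero value; zeros stay implicit.
        TreeMap<Double, List<Integer>> groups = new TreeMap<>();
        for (int r = 0; r < column.length; r++) {
            if (column[r] != 0) {
                groups.computeIfAbsent(column[r], v -> new ArrayList<>()).add(r);
            }
        }
        int total = 0;
        for (List<Integer> rows : groups.values()) {
            total += rows.size();
        }
        values = new double[groups.size()];
        ptr = new int[groups.size() + 1];
        data = new char[total];
        int k = 0;
        int pos = 0;
        for (Map.Entry<Double, List<Integer>> e : groups.entrySet()) {
            values[k] = e.getKey();
            ptr[k] = pos;
            for (int r : e.getValue()) {
                data[pos++] = (char) r;
            }
            k++;
        }
        ptr[k] = pos; // end marker, so that len(k) works for the last value
    }

    /** Number of offsets stored for value k. */
    int len(int k) {
        return ptr[k + 1] - ptr[k];
    }

    /** Rebuilds the dense column; rows without an offset remain zero. */
    double[] decompress(int numRows) {
        double[] out = new double[numRows];
        for (int k = 0; k < values.length; k++) {
            for (int i = ptr[k]; i < ptr[k + 1]; i++) {
                out[data[i]] = values[k];
            }
        }
        return out;
    }
}
```

    Keeping all offset lists in one array with a pointer array avoids one object per distinct value, which is the usual motivation for such a linearized layout.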
    • Field Detail

      • ALLOW_CACHE_CONSCIOUS_ROWSUMS

        public static boolean ALLOW_CACHE_CONSCIOUS_ROWSUMS
      • _data

        protected char[] _data
      • _zeros

        protected boolean _zeros
      • _skiplist

        protected int[] _skiplist
    • Constructor Detail

      • ColGroupOffset

        public ColGroupOffset()
      • ColGroupOffset

        public ColGroupOffset(int[] colIndices,
                              int numRows,
                              UncompressedBitmap ubm)
        Main constructor. Stores the headers for the individual bitmaps.
        Parameters:
        colIndices - indices (within the block) of the columns included in this column group
        numRows - total number of rows in the parent block
        ubm - Uncompressed bitmap representation of the block
      • ColGroupOffset

        protected ColGroupOffset(int[] colIndices,
                                 int numRows,
                                 boolean zeros,
                                 double[] values)
        Constructor for subclass methods that need to create shallow copies
        Parameters:
        colIndices - raw column index information
        numRows - number of rows in the block
        zeros - indicator if column group contains zero values
        values - set of distinct values for the block (associated bitmaps are kept in the subclass)
    • Method Detail

      • len

        protected final int len(int k)
      • createCompressedBitmaps

        protected void createCompressedBitmaps(int numVals,
                                               int totalLen,
                                               char[][] lbitmaps)
      • estimateInMemorySize

        public long estimateInMemorySize()
        Description copied from class: ColGroup
        Note: Must be overridden by child classes to account for additional data and metadata
        Overrides:
        estimateInMemorySize in class ColGroupValue
        Returns:
        an upper bound on the number of bytes used to store this ColGroup in memory.
      • decompressToBlock

        public void decompressToBlock(MatrixBlock target,
                                      int rl,
                                      int ru)
        Description copied from class: ColGroup
        Decompress the contents of this column group into the specified full matrix block.
        Specified by:
        decompressToBlock in class ColGroup
        Parameters:
        target - a matrix block where the columns covered by this column group have not yet been filled in.
        rl - lower row index of the range to decompress
        ru - upper row index of the range to decompress
      • decompressToBlock

        public void decompressToBlock(MatrixBlock target,
                                      int[] colIndexTargets)
        Description copied from class: ColGroup
        Decompress the contents of this column group into uncompressed packed columns
        Specified by:
        decompressToBlock in class ColGroup
        Parameters:
        target - a dense matrix block. The block must have enough space to hold the contents of this column group.
        colIndexTargets - array that maps column indices in the original matrix block to columns of target.
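        The role of colIndexTargets can be illustrated with a small, hypothetical sketch. It is not the SystemML code: plain double[][] arrays stand in for MatrixBlock, the parameter groupData (already decompressed values of the group, one column per entry of colIndices) is invented for the example, and -1 is used here only as a placeholder for columns that do not belong to the group.

```java
/** Hypothetical illustration of the column mapping; not SystemML code. */
final class PackedDecompressSketch {
    /**
     * Copies the j-th column of the group into the target column given by
     * colIndexTargets[colIndices[j]].
     *
     * @param target          dense output, indexed as target[row][targetColumn]
     * @param colIndices      original column index of each column in the group
     * @param colIndexTargets maps an original column index to a column of target
     * @param groupData       values of the group, indexed as groupData[row][j]
     */
    static void decompress(double[][] target, int[] colIndices,
                           int[] colIndexTargets, double[][] groupData) {
        for (int r = 0; r < groupData.length; r++) {
            for (int j = 0; j < colIndices.length; j++) {
                int origCol = colIndices[j];
                int targetCol = colIndexTargets[origCol];
                target[r][targetCol] = groupData[r][j];
            }
        }
    }
}
```

        For example, with colIndices = {1, 3} and colIndexTargets = {-1, 1, -1, 0}, original column 1 is written to target column 1 and original column 3 to target column 0.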
      • decompressToBlock

        public void decompressToBlock(MatrixBlock target,
                                      int colpos)
        Description copied from class: ColGroup
        Decompress a single column of this column group into the given dense output vector.
        Specified by:
        decompressToBlock in class ColGroup
        Parameters:
        target - dense output vector
        colpos - column to decompress; an error occurs if it is greater than or equal to numCols
      • get

        public double get(int r,
                          int c)
        Description copied from class: ColGroup
        Get the value at a global row/column position.
        Specified by:
        get in class ColGroup
        Parameters:
        r - row
        c - column
        Returns:
        value at the row/column position
      • sumAllValues

        protected final void sumAllValues(double[] b,
                                          double[] c)
      • mxxValues

        protected final double mxxValues(int bitmapIx,
                                         org.apache.sysml.runtime.functionobjects.Builtin builtin)
      • getBitmaps

        public char[] getBitmaps()
      • getBitmapOffsets

        public int[] getBitmapOffsets()
      • hasZeros

        public boolean hasZeros()
      • getDecodeIterator

        public abstract Iterator<Integer> getDecodeIterator(int k)
        Parameters:
        k - index of a specific compressed bitmap (stored in subclass, index same as ColGroupValue.getValues())
        Returns:
        an object for iterating over the row offsets in this bitmap. Only valid until the next call to this method. May be reused across calls.
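        Because the returned iterator is only valid until the next call and may be reused, a caller would typically drain it completely before asking for the next bitmap. The sketch below shows that pattern under stated assumptions: OffsetGroup is a hypothetical stand-in interface (not the SystemML class) exposing only the two calls used, the group is assumed to cover a single column so that getValues()[k] is the value of bitmap k, and toyGroup is an array-backed helper invented for demonstration.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical stand-in for the two calls used below; not the SystemML class. */
interface OffsetGroup {
    double[] getValues();                       // one value per bitmap (single column assumed)
    Iterator<Integer> getDecodeIterator(int k); // row offsets of bitmap k
}

final class DecodeIteratorUsage {
    /** Rebuilds a dense column by visiting every bitmap once. */
    static double[] decodeColumn(OffsetGroup grp, int numRows) {
        double[] out = new double[numRows];
        double[] values = grp.getValues();
        for (int k = 0; k < values.length; k++) {
            // Drain the iterator before requesting the next one, since the
            // returned object may be reused by the following call.
            Iterator<Integer> it = grp.getDecodeIterator(k);
            while (it.hasNext()) {
                out[it.next()] = values[k];
            }
        }
        return out;
    }

    /** Toy group backed by plain arrays, for demonstration only. */
    static OffsetGroup toyGroup(double[] values, int[][] rows) {
        return new OffsetGroup() {
            @Override
            public double[] getValues() {
                return values;
            }

            @Override
            public Iterator<Integer> getDecodeIterator(int k) {
                return Arrays.stream(rows[k]).boxed().iterator();
            }
        };
    }
}
```

        The important design point is that two iterators obtained from the same group should not be held and advanced at the same time.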
      • computeOffsets

        protected int[] computeOffsets(boolean[] ind)
                                throws DMLRuntimeException
        Utility function for sparse-unsafe operations.
        Parameters:
        ind - row indicator vector of non-zeros
        Returns:
        offsets
        Throws:
        DMLRuntimeException - if DMLRuntimeException occurs
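        The description does not spell out the exact contract. One plausible reading is that the method returns the positions at which the indicator vector is set, in ascending order. The following is a minimal sketch under that assumption; the class name ComputeOffsetsSketch is hypothetical and the code is not taken from SystemML.

```java
/** Hypothetical reading of computeOffsets; not SystemML code. */
final class ComputeOffsetsSketch {
    /** Returns the indices i for which ind[i] is true, in ascending order. */
    static int[] computeOffsets(boolean[] ind) {
        // First pass: count the set entries to size the result exactly.
        int numOffsets = 0;
        for (boolean b : ind) {
            if (b) {
                numOffsets++;
            }
        }
        // Second pass: record the position of every set entry.
        int[] ret = new int[numOffsets];
        for (int i = 0, pos = 0; i < ind.length; i++) {
            if (ind[i]) {
                ret[pos++] = i;
            }
        }
        return ret;
    }
}
```

        Counting first and filling second avoids a growable list and the boxing that comes with it.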
      • getExactSizeOnDisk

        public long getExactSizeOnDisk()
        Description copied from class: ColGroup
        Returns the exact serialized size of column group. This can be used for example for buffer preallocation.
        Specified by:
        getExactSizeOnDisk in class ColGroup
        Returns:
        exact serialized size for column group
      • unaryAggregateOperations

        public void unaryAggregateOperations(org.apache.sysml.runtime.matrix.operators.AggregateUnaryOperator op,
                                             MatrixBlock result,
                                             int rl,
                                             int ru)
                                      throws DMLRuntimeException
        Specified by:
        unaryAggregateOperations in class ColGroupValue
        Parameters:
        op - aggregation operator
        result - output matrix block
        rl - row lower index, inclusive
        ru - row upper index, exclusive
        Throws:
        DMLRuntimeException - on invalid inputs
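        The half-open row range [rl, ru) and the KahanFunction parameter of the sum methods on this class suggest row-wise aggregation with compensated summation. The sketch below illustrates that combination on a plain double[][] matrix. It is an assumption-laden example, not the SystemML implementation: KahanRowSumsSketch is a hypothetical name, and the compensation is written out inline instead of using KahanFunction.

```java
/** Hypothetical illustration of row sums over [rl, ru) with Kahan compensation. */
final class KahanRowSumsSketch {
    /**
     * Sums each row r with rl <= r < ru of m; entries outside the range stay 0.
     */
    static double[] rowSums(double[][] m, int rl, int ru) {
        double[] result = new double[m.length];
        for (int r = rl; r < ru; r++) {
            double sum = 0.0;
            double corr = 0.0; // running compensation for lost low-order bits
            for (double v : m[r]) {
                double y = v - corr;  // apply the correction to the next term
                double t = sum + y;   // low-order bits of y may be lost here
                corr = (t - sum) - y; // recover what was lost
                sum = t;
            }
            result[r] = sum;
        }
        return result;
    }
}
```

        With an exclusive upper bound, adjacent ranges such as [0, n/2) and [n/2, n) can be processed independently without touching the same row twice.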
      • computeSum

        protected abstract void computeSum(MatrixBlock result,
                                           org.apache.sysml.runtime.functionobjects.KahanFunction kplus)
      • computeRowSums

        protected abstract void computeRowSums(MatrixBlock result,
                                               org.apache.sysml.runtime.functionobjects.KahanFunction kplus,
                                               int rl,
                                               int ru)
      • computeColSums

        protected abstract void computeColSums(MatrixBlock result,
                                               org.apache.sysml.runtime.functionobjects.KahanFunction kplus)
      • computeRowMxx

        protected abstract void computeRowMxx(MatrixBlock result,
                                              org.apache.sysml.runtime.functionobjects.Builtin builtin,
                                              int rl,
                                              int ru)

Copyright © 2017 The Apache Software Foundation. All rights reserved.