Class ByteBasedPNameTable


  • public final class ByteBasedPNameTable
    extends NameTable
    This is a symbol table implementation used for storing byte-based PNames, specifically, instances of (ByteBasedPName).
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      (package private) static class  ByteBasedPNameTable.Bucket  
    • Field Summary

      Fields 
      Modifier and Type Field Description
      (package private) static int INITIAL_COLLISION_LEN  
      (package private) static int LAST_VALID_BUCKET
      Bucket index is 8 bits, and value 0 is reserved to represent 'empty' status.
      private int mCollCount
      Total number of PNames in collision buckets (included in mCount along with primary entries)
      private int mCollEnd
      Index of the first unused collision bucket entry (== size of the used portion of collision list): less than or equal to 0xFF (255), since max number of entries is 255 (8-bit, minus 0 used as 'empty' marker)
      private ByteBasedPNameTable.Bucket[] mCollList
      Array of heads of collision bucket chains; size dynamically
      private boolean mCollListShared
      Flag that indicates whether underlying data structures for the collision list are shared or not.
      private int mCount
      Total number of PNames in the symbol table
      (package private) static int MIN_HASH_SIZE  
      private int[] mMainHash
      Array of 2^N size, which contains combination of 24-bits of hash (0 to indicate 'empty' slot), and 8-bit collision bucket index (0 to indicate empty collision bucket chain; otherwise subtract one from index)
      private int mMainHashMask
      Mask used to truncate 32-bit hash value to current hash array size; essentially, hash array size - 1 (since hash array sizes are 2^N).
      private boolean mMainHashShared
      Flag that indicates whether underlying data structures for the main hash area are shared or not.
      private ByteBasedPName[] mMainNames
      Array that contains PName instances matching entries in mMainHash.
      private boolean mMainNamesShared  
      private boolean mNeedRehash
      This flag is set if, after adding a new entry, it is deemed that a rehash is warranted if any more entries are to be added.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      ByteBasedPName addSymbol​(int hash, java.lang.String symbolStr, int colonIx, int[] quads, int qlen)  
      ByteBasedPName addSymbol​(int hash, java.lang.String symbolStr, int colonIx, int firstQuad, int secondQuad)  
      static int calcHash​(int firstQuad)  
      static int calcHash​(int[] quads, int qlen)  
      static int calcHash​(int firstQuad, int secondQuad)  
      static int[] calcQuads​(byte[] wordBytes)  
      private void doAddSymbol​(int hash, ByteBasedPName symbol)  
      private void expandCollision()  
      private int findBestBucket()
      Method called to find the best bucket to spill a PName over to: usually the first bucket that has only one entry, but in general first one of the buckets with least number of entries
      ByteBasedPName findSymbol​(int hash, int[] quads, int qlen)
      Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.
      ByteBasedPName findSymbol​(int hash, int firstQuad, int secondQuad)
      Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.
      void markAsShared()  
      boolean maybeDirty()
      Method called to check to quickly see if a child symbol table may have gotten additional entries.
      boolean mergeFromChild​(ByteBasedPNameTable child)  
      void nuke()
      Method used by test code, to reset state of the name table.
      private void rehash()  
      int size()  
      java.lang.String toString()  
      private void unshareCollision()  
      private void unshareMain()
      Method that needs to be called, if the main hash structure is (may be) shared.
      private void unshareNames()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • LAST_VALID_BUCKET

        static final int LAST_VALID_BUCKET
        Bucket index is 8 bits, and value 0 is reserved to represent 'empty' status.
        See Also:
        Constant Field Values
      • mCount

        private int mCount
        Total number of PNames in the symbol table
      • mMainHashMask

        private int mMainHashMask
        Mask used to truncate 32-bit hash value to current hash array size; essentially, hash array size - 1 (since hash array sizes are 2^N).
      • mMainHash

        private int[] mMainHash
        Array of 2^N size, which contains combination of 24-bits of hash (0 to indicate 'empty' slot), and 8-bit collision bucket index (0 to indicate empty collision bucket chain; otherwise subtract one from index)
      • mMainNames

        private ByteBasedPName[] mMainNames
        Array that contains PName instances matching entries in mMainHash. Contains nulls for unused entries.
      • mCollCount

        private int mCollCount
        Total number of PNames in collision buckets (included in mCount along with primary entries)
      • mCollEnd

        private int mCollEnd
        Index of the first unused collision bucket entry (== size of the used portion of collision list): less than or equal to 0xFF (255), since max number of entries is 255 (8-bit, minus 0 used as 'empty' marker)
      • mNeedRehash

        private transient boolean mNeedRehash
        This flag is set if, after adding a new entry, it is deemed that a rehash is warranted if any more entries are to be added.
      • mMainHashShared

        private boolean mMainHashShared
        Flag that indicates whether underlying data structures for the main hash area are shared or not. If they are, then they need to be handled in copy-on-write way, i.e. if they need to be modified, a copy needs to be made first; at this point it will not be shared any more, and can be modified.

        This flag needs to be checked both when adding new main entries, and when adding new collision list queues (i.e. creating a new collision list head entry)

      • mMainNamesShared

        private boolean mMainNamesShared
      • mCollListShared

        private boolean mCollListShared
        Flag that indicates whether underlying data structures for the collision list are shared or not. If they are, then they need to be handled in copy-on-write way, i.e. if they need to be modified, a copy needs to be made first; at this point it will not be shared any more, and can be modified.

        This flag needs to be checked when adding new collision entries.

    • Constructor Detail

      • ByteBasedPNameTable

        public ByteBasedPNameTable​(int hashSize)
      • ByteBasedPNameTable

        ByteBasedPNameTable​(ByteBasedPNameTable parent)
        Constructor used when creating a child instance
    • Method Detail

      • markAsShared

        public void markAsShared()
      • nuke

        public void nuke()
        Method used by test code, to reset state of the name table.
      • size

        public int size()
        Specified by:
        size in class NameTable
      • maybeDirty

        public boolean maybeDirty()
        Method called to check to quickly see if a child symbol table may have gotten additional entries. Used for checking to see if a child table should be merged into shared table.
        Specified by:
        maybeDirty in class NameTable
      • findSymbol

        public ByteBasedPName findSymbol​(int hash,
                                         int firstQuad,
                                         int secondQuad)
        Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.

        Note: separate methods to optimize common case of relatively short element/attribute names (8 or less ascii characters)

        Parameters:
        firstQuad - int32 containing first 4 bytes of the pname; if the whole name less than 4 bytes, padded with zero bytes in front (zero MSBs, ie. right aligned)
        secondQuad - int32 containing bytes 5 through 8 of the pname; if less than 8 bytes, padded with up to 4 zero bytes in front (zero MSBs, ie. right aligned)
        Returns:
        PName matching the symbol passed (or constructed for it)
      • findSymbol

        public ByteBasedPName findSymbol​(int hash,
                                         int[] quads,
                                         int qlen)
        Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.

        Note: this is the general purpose method that can be called for names of any length. However, if name is less than 9 bytes long, it is preferable to call the version optimized for short names.

        Parameters:
        quads - Array of int32s, each of which contain 4 bytes of encoded name
        qlen - Number of int32s, starting from index 0, in quads parameter
        Returns:
        PName matching the symbol passed (or constructed for it)
      • addSymbol

        public ByteBasedPName addSymbol​(int hash,
                                        java.lang.String symbolStr,
                                        int colonIx,
                                        int firstQuad,
                                        int secondQuad)
      • addSymbol

        public ByteBasedPName addSymbol​(int hash,
                                        java.lang.String symbolStr,
                                        int colonIx,
                                        int[] quads,
                                        int qlen)
      • calcHash

        public static final int calcHash​(int firstQuad)
      • calcHash

        public static final int calcHash​(int firstQuad,
                                         int secondQuad)
      • calcHash

        public static final int calcHash​(int[] quads,
                                         int qlen)
      • calcQuads

        public static int[] calcQuads​(byte[] wordBytes)
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • doAddSymbol

        private void doAddSymbol​(int hash,
                                 ByteBasedPName symbol)
      • rehash

        private void rehash()
      • findBestBucket

        private int findBestBucket()
        Method called to find the best bucket to spill a PName over to: usually the first bucket that has only one entry, but in general first one of the buckets with least number of entries
      • unshareMain

        private void unshareMain()
        Method that needs to be called, if the main hash structure is (may be) shared. This happens every time something is added, even if addition is to the collision list (since collision list index comes from lowest 8 bits of the primary hash entry)
      • unshareCollision

        private void unshareCollision()
      • unshareNames

        private void unshareNames()
      • expandCollision

        private void expandCollision()