Package com.fasterxml.aalto.in
Class ByteBasedPNameTable
- java.lang.Object
-
- com.fasterxml.aalto.util.NameTable
-
- com.fasterxml.aalto.in.ByteBasedPNameTable
-
public final class ByteBasedPNameTable extends NameTable
This is a symbol table implementation used for storing byte-basedPNames
, specifically, instances of (ByteBasedPName
).
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description (package private) static class
ByteBasedPNameTable.Bucket
-
Field Summary
Fields Modifier and Type Field Description (package private) static int
INITIAL_COLLISION_LEN
(package private) static int
LAST_VALID_BUCKET
Bucket index is 8 bits, and value 0 is reserved to represent 'empty' status.private int
mCollCount
Total number of PNames in collision buckets (included inmCount
along with primary entries)private int
mCollEnd
Index of the first unused collision bucket entry (== size of the used portion of collision list): less than or equal to 0xFF (255), since max number of entries is 255 (8-bit, minus 0 used as 'empty' marker)private ByteBasedPNameTable.Bucket[]
mCollList
Array of heads of collision bucket chains; size dynamicallyprivate boolean
mCollListShared
Flag that indicates whether underlying data structures for the collision list are shared or not.private int
mCount
Total number of PNames in the symbol table(package private) static int
MIN_HASH_SIZE
private int[]
mMainHash
Array of 2^N size, which contains combination of 24-bits of hash (0 to indicate 'empty' slot), and 8-bit collision bucket index (0 to indicate empty collision bucket chain; otherwise subtract one from index)private int
mMainHashMask
Mask used to truncate 32-bit hash value to current hash array size; essentially, hash array size - 1 (since hash array sizes are 2^N).private boolean
mMainHashShared
Flag that indicates whether underlying data structures for the main hash area are shared or not.private ByteBasedPName[]
mMainNames
Array that containsPName
instances matching entries inmMainHash
.private boolean
mMainNamesShared
private boolean
mNeedRehash
This flag is set if, after adding a new entry, it is deemed that a rehash is warranted if any more entries are to be added.
-
Constructor Summary
Constructors Constructor Description ByteBasedPNameTable(int hashSize)
ByteBasedPNameTable(ByteBasedPNameTable parent)
Constructor used when creating a child instance
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description ByteBasedPName
addSymbol(int hash, java.lang.String symbolStr, int colonIx, int[] quads, int qlen)
ByteBasedPName
addSymbol(int hash, java.lang.String symbolStr, int colonIx, int firstQuad, int secondQuad)
static int
calcHash(int firstQuad)
static int
calcHash(int[] quads, int qlen)
static int
calcHash(int firstQuad, int secondQuad)
static int[]
calcQuads(byte[] wordBytes)
private void
doAddSymbol(int hash, ByteBasedPName symbol)
private void
expandCollision()
private int
findBestBucket()
Method called to find the best bucket to spill a PName over to: usually the first bucket that has only one entry, but in general first one of the buckets with least number of entriesByteBasedPName
findSymbol(int hash, int[] quads, int qlen)
Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.ByteBasedPName
findSymbol(int hash, int firstQuad, int secondQuad)
Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.void
markAsShared()
boolean
maybeDirty()
Method called to check to quickly see if a child symbol table may have gotten additional entries.boolean
mergeFromChild(ByteBasedPNameTable child)
void
nuke()
Method used by test code, to reset state of the name table.private void
rehash()
int
size()
java.lang.String
toString()
private void
unshareCollision()
private void
unshareMain()
Method that needs to be called, if the main hash structure is (may be) shared.private void
unshareNames()
-
-
-
Field Detail
-
MIN_HASH_SIZE
static final int MIN_HASH_SIZE
- See Also:
- Constant Field Values
-
INITIAL_COLLISION_LEN
static final int INITIAL_COLLISION_LEN
- See Also:
- Constant Field Values
-
LAST_VALID_BUCKET
static final int LAST_VALID_BUCKET
Bucket index is 8 bits, and value 0 is reserved to represent 'empty' status.- See Also:
- Constant Field Values
-
mCount
private int mCount
Total number of PNames in the symbol table
-
mMainHashMask
private int mMainHashMask
Mask used to truncate 32-bit hash value to current hash array size; essentially, hash array size - 1 (since hash array sizes are 2^N).
-
mMainHash
private int[] mMainHash
Array of 2^N size, which contains combination of 24-bits of hash (0 to indicate 'empty' slot), and 8-bit collision bucket index (0 to indicate empty collision bucket chain; otherwise subtract one from index)
-
mMainNames
private ByteBasedPName[] mMainNames
Array that containsPName
instances matching entries inmMainHash
. Contains nulls for unused entries.
-
mCollList
private ByteBasedPNameTable.Bucket[] mCollList
Array of heads of collision bucket chains; size dynamically
-
mCollCount
private int mCollCount
Total number of PNames in collision buckets (included inmCount
along with primary entries)
-
mCollEnd
private int mCollEnd
Index of the first unused collision bucket entry (== size of the used portion of collision list): less than or equal to 0xFF (255), since max number of entries is 255 (8-bit, minus 0 used as 'empty' marker)
-
mNeedRehash
private transient boolean mNeedRehash
This flag is set if, after adding a new entry, it is deemed that a rehash is warranted if any more entries are to be added.
-
mMainHashShared
private boolean mMainHashShared
Flag that indicates whether underlying data structures for the main hash area are shared or not. If they are, then they need to be handled in copy-on-write way, i.e. if they need to be modified, a copy needs to be made first; at this point it will not be shared any more, and can be modified.This flag needs to be checked both when adding new main entries, and when adding new collision list queues (i.e. creating a new collision list head entry)
-
mMainNamesShared
private boolean mMainNamesShared
-
mCollListShared
private boolean mCollListShared
Flag that indicates whether underlying data structures for the collision list are shared or not. If they are, then they need to be handled in copy-on-write way, i.e. if they need to be modified, a copy needs to be made first; at this point it will not be shared any more, and can be modified.This flag needs to be checked when adding new collision entries.
-
-
Constructor Detail
-
ByteBasedPNameTable
public ByteBasedPNameTable(int hashSize)
-
ByteBasedPNameTable
ByteBasedPNameTable(ByteBasedPNameTable parent)
Constructor used when creating a child instance
-
-
Method Detail
-
mergeFromChild
public boolean mergeFromChild(ByteBasedPNameTable child)
-
markAsShared
public void markAsShared()
-
nuke
public void nuke()
Method used by test code, to reset state of the name table.
-
maybeDirty
public boolean maybeDirty()
Method called to check to quickly see if a child symbol table may have gotten additional entries. Used for checking to see if a child table should be merged into shared table.- Specified by:
maybeDirty
in classNameTable
-
findSymbol
public ByteBasedPName findSymbol(int hash, int firstQuad, int secondQuad)
Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.Note: separate methods to optimize common case of relatively short element/attribute names (8 or less ascii characters)
- Parameters:
firstQuad
- int32 containing first 4 bytes of the pname; if the whole name less than 4 bytes, padded with zero bytes in front (zero MSBs, ie. right aligned)secondQuad
- int32 containing bytes 5 through 8 of the pname; if less than 8 bytes, padded with up to 4 zero bytes in front (zero MSBs, ie. right aligned)- Returns:
- PName matching the symbol passed (or constructed for it)
-
findSymbol
public ByteBasedPName findSymbol(int hash, int[] quads, int qlen)
Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it.Note: this is the general purpose method that can be called for names of any length. However, if name is less than 9 bytes long, it is preferable to call the version optimized for short names.
- Parameters:
quads
- Array of int32s, each of which contain 4 bytes of encoded nameqlen
- Number of int32s, starting from index 0, in quads parameter- Returns:
- PName matching the symbol passed (or constructed for it)
-
addSymbol
public ByteBasedPName addSymbol(int hash, java.lang.String symbolStr, int colonIx, int firstQuad, int secondQuad)
-
addSymbol
public ByteBasedPName addSymbol(int hash, java.lang.String symbolStr, int colonIx, int[] quads, int qlen)
-
calcHash
public static final int calcHash(int firstQuad)
-
calcHash
public static final int calcHash(int firstQuad, int secondQuad)
-
calcHash
public static final int calcHash(int[] quads, int qlen)
-
calcQuads
public static int[] calcQuads(byte[] wordBytes)
-
toString
public java.lang.String toString()
- Overrides:
toString
in classjava.lang.Object
-
doAddSymbol
private void doAddSymbol(int hash, ByteBasedPName symbol)
-
rehash
private void rehash()
-
findBestBucket
private int findBestBucket()
Method called to find the best bucket to spill a PName over to: usually the first bucket that has only one entry, but in general first one of the buckets with least number of entries
-
unshareMain
private void unshareMain()
Method that needs to be called, if the main hash structure is (may be) shared. This happens every time something is added, even if addition is to the collision list (since collision list index comes from lowest 8 bits of the primary hash entry)
-
unshareCollision
private void unshareCollision()
-
unshareNames
private void unshareNames()
-
expandCollision
private void expandCollision()
-
-