Class JCasHashMap

  • All Implemented Interfaces:
    java.lang.Iterable<TOP>

    public class JCasHashMap
    extends java.lang.Object
    implements java.lang.Iterable<TOP>
    Version 3 (2016, for Java 8) of map between id's (ints) JCasCover Objects Note: in the general case, the cover object may *not* be a JCas one, but rather the general one This happens if there is no JCas cover object defined for the type. Assumptions: Each addr has a corresponding JCas; it is permitted to "update" an addr with a different JCas cover class (use case: low level APIs changing the type of an object) Table always a power of 2 in size - permits faster hashing Accesses to this table are threadsafe, in order to support read-only CASes being shared by multiple threads. Multiple iterators in different threads could be accessing the map and updating it. Load factor tuning. 2,000,000 random inserts, 50 reps (for JIT) .5 (2x to 4x entries) 99% 5 probes 250 ms .6 (1.67 to 3.3) 99% 6 probes 285 ms .7 (1.43 to 2.86) 99% 8 probes 318 ms .8 (1.25 to 2.5) 99% 11 probes 360 ms version 1 at load factor .5 ran about 570 ms * 1.x (did 2 lookups for fetches if not found,) Has standard get method plus a putIfAbsent method, which if not absent, returns the value gotten. Value is a IntSupplier, not invoked unless absent. Locking: get calls which find the element operate without locking. There is one lock used for reading and updating the table -- not used for reading when item found, only if item not found Strategy: have 1 outer implementation delegating to multiple inner ones number = concurrency level (a power of 2) The hash uses some # of low order bits to address the right inner one. This table is used to hold JCas cover classes for CAS feature structures. There are potentially multiple instances of this table associated with each CAS that is using it, when PEARs are involved.

    The creation of the new JCas cover object can, in turn, run arbitrary user code, which can result in updates to the JCasHashMap which occur before this original update occurs.

    In a multi-threaded environment, multiple threads can do a "get" for the same Feature Structure instance. If it's not in the Map, the correct behavior is:

    one of the threads adds the new element the other threads wait for the one thread to finish adding, and then return the object that the one thread added.

    The implementation works as follows:

    1) The JCasHashMap is split into "n" sub-maps. The number is the number of cores, but grows more slowly as the # of cores > 16. This number can be specified, but this is not currently exposed in the tuning parameters Locking occurs on the sub-maps; the outer method calls are not synchronized 2) The number of sub maps is rounded to a power of 2, to allow the low order bits of the hash of the key to be used to pick the map (via masking). 3) a put: 3a) locks 3b) does a find; if found, updates that, if not, adds new value 3c) unlocks 4) a putIfAbsent: 4a) does a find (not under lock) - if found returns that (not under lock) - note: this has a race condition if another thread is updating / changing that slot -- this only happens for unsupported conditions, so is not checked -- changing the type of an existing FS while another thread is accessing the CAS 4b) if not found, locks 4c) does find again 4d) if found, return that and release lock 4e) if not found, eval the creator form and use to set the value, and release lock Note: if eval of creator form recursively calls this, that's OK because sync locks can be recursively gotten and released in a nested way. 5) get does a find, if found, returns that. - if not, get lock, redo search -- if not resized, start find from last spot. else start from beginning. - if found, return that (release lock). if not found return null (release lock)

    Supports put(pre-computed-value), putIfAbsent(value-to-be-computed, as an IntSupplier) get

    • Field Detail

      • DEFAULT_CONCURRENCY_LEVEL

        static int DEFAULT_CONCURRENCY_LEVEL
        must be a power of 2, > 0 package private for testing not final to allow test case to reset it must not be changed during multi-thread operation
      • initialCapacity

        private final int initialCapacity
      • concurrencyLevel

        private final int concurrencyLevel
      • concurrencyBitmask

        private final int concurrencyBitmask
      • concurrencyLevelBits

        private final int concurrencyLevelBits
      • subMapInitialCapacity

        private final int subMapInitialCapacity
    • Constructor Detail

      • JCasHashMap

        public JCasHashMap​(int capacity)
      • JCasHashMap

        JCasHashMap​(int capacity,
                    int aConcurrencyLevel)
    • Method Detail

      • getDEFAULT_CONCURRENCY_LEVEL

        static int getDEFAULT_CONCURRENCY_LEVEL()
      • setDEFAULT_CONCURRENCY_LEVEL

        static void setDEFAULT_CONCURRENCY_LEVEL​(int dEFAULT_CONCURRENCY_LEVEL)
      • concurrencyLimitedByInitialCapacity

        static boolean concurrencyLimitedByInitialCapacity​(int currentConcurrencyLevel,
                                                           int curMapSize)
        initial capacity (other than testing), is by default (from JCasImpl) is bigger of 256 and cas heap initial size (500,000) / 16 = 31K but users may set it lower in their uima configuration We use the current capacity of the JCasHashMap to set the concurrency limit
        Parameters:
        casCapacity - the capacity
        Returns:
        true if the concurrency is limited, and could increase with reallocation
      • sizeAdjustedConcurrency

        static int sizeAdjustedConcurrency​(int curMapSize)
      • clear

        public void clear()
      • putIfAbsent

        public final TOP putIfAbsent​(int key,
                                     java.util.function.IntFunction<TOP> creator)
      • get

        public final TOP get​(int key)
        Parameters:
        key - -
        Returns:
        the item or null
      • put

        public final TOP put​(TOP value)
        Parameters:
        value - -
        Returns:
        previous value or null
      • put

        public TOP put​(int key,
                       TOP value)
        Parameters:
        value - -
        key - -
        Returns:
        previous value or null
      • hashInt

        public static final int hashInt​(int k1)
      • getCapacities

        int[] getCapacities()
      • getSubSizes

        int[] getSubSizes()
      • getCapacity

        int getCapacity()
      • getApproximateSize

        public int getApproximateSize()
        get the approximate size (subject to multithreading inaccuracies)
        Returns:
        the size
      • showHistogram

        public void showHistogram()
      • getConcurrencyLevel

        public int getConcurrencyLevel()
      • iterator

        public IteratorNvc<TOP> iterator()
        Specified by:
        iterator in interface java.lang.Iterable<TOP>