Package org.apache.uima.cas.impl
These are Internal APIs. Use these APIs at your own risk. APIs in this package are subject to change without notice, even in minor releases. Use of this package is not supported. If you think you have found a bug in this package, please try to reproduce it with the officially supported APIs before reporting it.
Internals documentation
NOTE: This documentation is plain HTML, generated from the WYSIWYG editor "tinymce". To work on it, set up a small web page with tinymce (running from a local file), then use Tools - Source code to cut and paste between this file's source and that editor.
Java Cover Objects for version 3
The Java Cover Objects are no longer cover objects; instead, these objects are the Feature Structures. The Java classes for these objects are in a hierarchy that corresponds to the UIMA type hierarchy. JCasGen continues to generate particular Java classes for particular UIMA types (for user types, not for built-in types). As before, JCasGen'd classes are optional. If there is no JCasGen'd class for "MyType" (assume a subtype of "Annotation"), then the class of the most specific supertype of "MyType" that has a corresponding Java cover class is used. (This is also how it works in V2.)
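The supertype fallback described above can be sketched as a walk up the type hierarchy. This is a toy model under stated assumptions, not the UIMA implementation: the class `CoverClassLookup` and its two maps are hypothetical, and the type hierarchy is abbreviated to three types.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; these maps are illustrative and are not UIMA data structures.
class CoverClassLookup {
    // UIMA type name -> parent type name (null parent for the root type)
    private final Map<String, String> parentOf = new HashMap<>();
    // UIMA type name -> name of the Java class available for that type
    private final Map<String, String> javaClassFor = new HashMap<>();

    void addType(String type, String parent) {
        parentOf.put(type, parent);
    }

    void registerClass(String type, String javaClassName) {
        javaClassFor.put(type, javaClassName);
    }

    // Walk up the hierarchy until a type with a registered Java class is found.
    String classFor(String type) {
        for (String t = type; t != null; t = parentOf.get(t)) {
            String javaClass = javaClassFor.get(t);
            if (javaClass != null) {
                return javaClass;
            }
        }
        throw new IllegalStateException("no Java class for " + type);
    }

    public static void main(String[] args) {
        CoverClassLookup lookup = new CoverClassLookup();
        lookup.addType("uima.cas.TOP", null);
        lookup.addType("uima.tcas.Annotation", "uima.cas.TOP"); // hierarchy abbreviated
        lookup.addType("MyType", "uima.tcas.Annotation");
        lookup.registerClass("uima.cas.TOP", "org.apache.uima.jcas.cas.TOP");
        lookup.registerClass("uima.tcas.Annotation", "org.apache.uima.jcas.tcas.Annotation");
        // MyType has no class of its own, so the class registered for Annotation is used.
        System.out.println(lookup.classFor("MyType"));
    }
}
```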
There is one definition of these objects per UIMA Type System. PEARs having different "customizations" of the same JCas classname are not supported in v3.
- This loss of capability is mitigated by the addition of more kinds of Java types as built-in values.
- The reason for this not being supported is that there's no solution figured out for sharing types between the outer and PEAR pipelines, without encountering class-cast exceptions.
- The PEAR can still define customizations for types only it defines (that is, not used by the outer pipeline).
Much of the infrastructure is kept as-is in version 3 to support backwards compatibility.
Format of a JCas class version 3
The _Type is not used. May revisit this if users are using the low-level access made possible by _Type.
There is one definition of the class per type system. Type systems are often shared among multiple CASes. Each definition is loaded under a specific loader for that type system.
(Not implemented) The loader is set up to delegate to the parent for all classes except the JCas types, and for those, it generates them using ASM byte code generation from the fully merged TypeSystem information and existing "customizations".
Each feature is stored in one of two arrays, kept per Java Object Feature Structure instance: an "int" array, holding boolean/byte/short/int/long/float/double values, and an "Object" array, holding strings and refs to other FSs. Longs and doubles take 2 int slots.
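As an illustration of the "int" array layout, the following sketch stores int, float, long, and double values in int slots, with long and double using two adjacent slots. The class `IntSlotStore` is hypothetical, and the choice of putting the high half of a long in the first slot is an assumption made for this example; the actual slot order in UIMA is not stated here.

```java
// Toy model only; not the UIMA FeatureStructureImplC implementation.
class IntSlotStore {
    private final int[] intData;

    IntSlotStore(int slots) {
        intData = new int[slots];
    }

    void setInt(int offset, int v) {
        intData[offset] = v;
    }

    int getInt(int offset) {
        return intData[offset];
    }

    // A float fits in one slot by storing its raw bit pattern.
    void setFloat(int offset, float v) {
        intData[offset] = Float.floatToRawIntBits(v);
    }

    float getFloat(int offset) {
        return Float.intBitsToFloat(intData[offset]);
    }

    // A long takes two slots: high 32 bits first (an assumption), then low 32 bits.
    void setLong(int offset, long v) {
        intData[offset] = (int) (v >>> 32);
        intData[offset + 1] = (int) v;
    }

    long getLong(int offset) {
        return ((long) intData[offset] << 32) | (intData[offset + 1] & 0xFFFFFFFFL);
    }

    // A double is stored as the two-slot long holding its raw bit pattern.
    void setDouble(int offset, double v) {
        setLong(offset, Double.doubleToRawLongBits(v));
    }

    double getDouble(int offset) {
        return Double.longBitsToDouble(getLong(offset));
    }

    public static void main(String[] args) {
        IntSlotStore fs = new IntSlotStore(5);
        fs.setInt(0, 42);      // slot 0
        fs.setLong(1, -5L);    // slots 1 and 2
        fs.setDouble(3, 2.5);  // slots 3 and 4
        System.out.println(fs.getInt(0) + " " + fs.getLong(1) + " " + fs.getDouble(3));
    }
}
```

The static per-class offset fields mentioned in this section would supply the `offset` arguments in such a scheme.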
Built-in arrays have their array parts represented by native Java Arrays. Getters and Setters are provided as before. Constructors are provided as before.
Extra fields in the Feature Structure include both instance and class fields:
- (static class fields) a set of fields representing the int offset in the "int" and "object" arrays for all the features
- (instance field) a reference to the TypeImpl for this class - initialized by a reference to a TypeSystemImpl thread local value, at load time. This is updatable to handle two edge cases.
- (instance field) a reference to the CAS View used when this feature structure was created
Extra methods in the FeatureStructure
- a set of generic getters and setters, one per incompatible value type.
- All references to non-primitive FeatureStructure values are collapsed into a single TOP ref.
- These are used for generic access, including serialization/deserialization
- more: see package.html for uimaj-tools jcasgen (link only works if all sources checked out)
UIMA Indexes
Indexes are defined for a pipeline, and are kept as part of the general CAS definition.
Each CAS View has its own instantiation of the defined indexes (there's one definition for all views), and as a result, a particular FS may be added-to-indexes and indexed in some views, and not in others.
There are 3 kinds of indexes: Sorted, Set, and Bag. The basic object type for an index is FsIndex_singletype. This has 3 subtypes:
- FsIndex_bag
- FsIndex_set_sorted (used for both Set and Sorted indexes)
- FsIndex_flat (used for flattened indexes, for instance, with snapshot iterators)
The FsIndex_singletype index is just for one type (and doesn't include entries for any subtypes).
The Set and Sorted implementations are combined; the only difference is in the comparator used. For sets, the comparator is what the index definition specifies. For sorted indexes, the specified comparator is augmented with an extra, least significant key, which is the Feature Structure id.
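The effect of the extra id key can be sketched with standard Java comparators. This is an illustration only: the `Fs` class, its `begin` key, and the use of `TreeSet` as the container are assumptions for the example and are not how the UIMA indexes are implemented.

```java
import java.util.Comparator;
import java.util.TreeSet;

class SetVsSortedDemo {
    // Hypothetical stand-in for a Feature Structure with one index key.
    static final class Fs {
        final int id;
        final int begin;

        Fs(int id, int begin) {
            this.id = id;
            this.begin = begin;
        }
    }

    // Set index: only the keys from the index definition are compared.
    static final Comparator<Fs> SET_COMPARATOR =
        Comparator.comparingInt((Fs fs) -> fs.begin);

    // Sorted index: the same keys, plus the FS id as the least significant key.
    static final Comparator<Fs> SORTED_COMPARATOR =
        SET_COMPARATOR.thenComparingInt((Fs fs) -> fs.id);

    // Adds the items to a container ordered by the comparator and reports how many were kept.
    static int sizeAfterAdding(Comparator<Fs> comparator, Fs... items) {
        TreeSet<Fs> index = new TreeSet<>(comparator);
        for (Fs fs : items) {
            index.add(fs);
        }
        return index.size();
    }

    public static void main(String[] args) {
        Fs a = new Fs(1, 10);
        Fs b = new Fs(2, 10); // same key values as a, different id
        System.out.println("set keeps " + sizeAfterAdding(SET_COMPARATOR, a, b));
        System.out.println("sorted keeps " + sizeAfterAdding(SORTED_COMPARATOR, a, b));
    }
}
```

With the set comparator the two items compare equal and only one is kept; with the id appended they are distinct, so both are kept.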
Indexes are connected to specific index definitions; these definitions include a type which is the top type for elements of this index. The index definition logically includes that type and all of its subtypes.
An additional data structure, the IndexIteratorCachePair, is associated with each index definition. It holds references to the FsIndex_singletype implementations for all subtypes of an index; this list is created lazily, only when an iterator is created over this index at a particular type level (which can be the type the index was defined for, or any subtype). This lazy aspect is important, because UIMA is often used in cases where there's a giant type system, with lots of subtypes, only a few of which are used in a particular pipeline instance.
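The lazy creation of the subtype list might look roughly like the following. This is a minimal sketch under stated assumptions: `LazySubtypeIndexes`, its fields, and the lookup function are hypothetical names, and plain lists stand in for the per-type indexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Toy model only; not the UIMA IndexIteratorCachePair / FsIndex_iicp code.
class LazySubtypeIndexes {
    private final List<String> typeAndSubtypes;
    // Hypothetical lookup from a type name to that type's single-type index.
    private final Function<String, List<Object>> indexForType;
    // Stays null until the first iterator is requested.
    private List<List<Object>> cached;

    LazySubtypeIndexes(List<String> typeAndSubtypes,
                       Function<String, List<Object>> indexForType) {
        this.typeAndSubtypes = typeAndSubtypes;
        this.indexForType = indexForType;
    }

    // Called when an iterator is created; builds the list once and reuses it afterwards.
    List<List<Object>> indexesForIteration() {
        if (cached == null) {
            List<List<Object>> built = new ArrayList<>();
            for (String type : typeAndSubtypes) {
                built.add(indexForType.apply(type));
            }
            cached = built;
        }
        return cached;
    }

    public static void main(String[] args) {
        LazySubtypeIndexes lazy = new LazySubtypeIndexes(
            Arrays.asList("Annotation", "MyType"),
            type -> {
                System.out.println("looking up index for " + type);
                return new ArrayList<Object>();
            });
        // Nothing has been looked up yet; the first call triggers the lookups.
        lazy.indexesForIteration();
        // The second call reuses the cached list and prints nothing more.
        lazy.indexesForIteration();
    }
}
```

In a large type system, definitions whose subtype lists are never iterated never pay the cost of building them.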
There are two tasks that indexes accomplish:
- updating the index with adds and removes of FSs. This update operation is optimized by
- keeping each type indexed separately, so only that data structure for the particular type need be updated (this design choice has a cost in iteration, though)
- treating more common use cases efficiently - the main one being that of adding something "to the end" of the items in the index.
- iterating over an index for a type and its subtypes.
- For indexes having no subtypes, this is done by iterating over the FSLeafIndexImpl for that index and type.
- For indexing with subtypes, this is done by creating individual iterators for the type and all of its subtypes, each iterating over the FSLeafIndexImpl for that type. These iterators are then logically combined into one iterator.
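Logically combining the per-type iterators into one ordered iterator can be sketched as a k-way merge. This is a generic merge using a priority queue, offered as an assumption-laden illustration rather than the algorithm UIMA uses; the class `MergedIterator` is hypothetical, and each source is assumed to be already sorted by the same comparator.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Generic k-way merge over already-sorted sources; not UIMA code.
class MergedIterator<T> implements Iterator<T> {
    // The current element of one source, plus the rest of that source.
    private static final class Head<T> {
        T value;
        final Iterator<T> rest;

        Head(T value, Iterator<T> rest) {
            this.value = value;
            this.rest = rest;
        }
    }

    private final PriorityQueue<Head<T>> heads;

    MergedIterator(List<? extends Iterable<T>> sortedSources, Comparator<? super T> order) {
        Comparator<Head<T>> byValue = (a, b) -> order.compare(a.value, b.value);
        heads = new PriorityQueue<>(Math.max(1, sortedSources.size()), byValue);
        for (Iterable<T> source : sortedSources) {
            Iterator<T> it = source.iterator();
            if (it.hasNext()) {
                heads.add(new Head<>(it.next(), it));
            }
        }
    }

    @Override
    public boolean hasNext() {
        return !heads.isEmpty();
    }

    @Override
    public T next() {
        Head<T> smallest = heads.poll();
        if (smallest == null) {
            throw new NoSuchElementException();
        }
        T result = smallest.value;
        // Advance the source that supplied the result and put it back in the queue.
        if (smallest.rest.hasNext()) {
            smallest.value = smallest.rest.next();
            heads.add(smallest);
        }
        return result;
    }

    public static void main(String[] args) {
        // Each inner list stands in for the sorted index of one type or subtype.
        List<List<Integer>> perType = Arrays.asList(
            Arrays.asList(1, 4, 7),
            Arrays.asList(2, 3, 9));
        Iterator<Integer> merged =
            new MergedIterator<>(perType, Comparator.<Integer>naturalOrder());
        while (merged.hasNext()) {
            System.out.println(merged.next());
        }
    }
}
```

This also shows the iteration cost of keeping each type indexed separately: every step compares the current elements of several per-type iterators.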
Iterators
There are two main kinds of iterators:
- Iterators over UIMA Indexes
- Iterators over other UIMA objects, such as Views, or internal structures.
Iterators over UIMA indexes
There are two main kinds of iterators over UIMA indexes:
- those returning Java cover objects representing the FS.
- those returning int values representing the location of the FS in the heap. These are the so-called low level iterators; they are less efficient in V3.
The basic iterator over a single type is implemented by FsIterator_singletype. This has subtypes FsIterator_bag and FsIterator_set_sorted.
Interface Summary
- AnnotationBaseImpl: Deprecated. use AnnotationBase instead
- AnnotationImpl: Deprecated. use Annotation instead
- BooleanArrayFSImpl: Deprecated. use BooleanArray instead
- ByteArrayFSImpl: Deprecated. use ByteArray instead
- CommonArrayFSImpl: Deprecated.
- CopyOnWriteIndexPart<T extends FeatureStructure>: common APIs supporting the copy on write aspect of index parts
- DoubleArrayFSImpl: Deprecated. use DoubleArray instead
- FeatureStructureImpl: Deprecated. use TOP instead
- FloatArrayFSImpl: Deprecated. use FloatArray instead
- FSComparator: UNUSED V3, backwards compat only. Delete; replace with Comparator<FeatureStructure> or the like.
- FSGenerator<T extends FeatureStructure>: Deprecated. unused in v3, only present to avoid compile errors in unused v2 classes
- FsGenerator3: A Functional Interface for generating V3 Java Feature Structures
- FsGeneratorArray: A Functional Interface for generating Java Feature Structures. NO LONGER USED
- FSImplComparator: UNUSED V3, backwards compat only. Interface to compare two feature structures, represented by their addresses.
- FSRefIterator
- IntArrayFSImpl: Deprecated. use IntegerArray instead
- LongArrayFSImpl: Deprecated. use LongArray instead
- LowLevelCAS: Defines the low-level CAS APIs.
- LowLevelIndex<T extends FeatureStructure>: Low-level FS index object.
- LowLevelIndexRepository: Low-level index repository access.
- LowLevelIterator<T extends FeatureStructure>: Low-level FS iterator.
- LowLevelTypeSystem: Low-level version of the type system APIs.
- ShortArrayFSImpl: Deprecated. use ShortArray instead
- SlotKindsConstants: Users "implement" this interface to get access to these constants in their code
- SofaFSImpl: Deprecated. use Sofa instead
- StringArrayFSImpl: Deprecated. use StringArray instead
- StringMap: Appears to be unused, 1-2015 schor
- TypeSystemConstants: This interface defines static final constants for Type Systems. For the built-in types and features: the type and feature codes; the adjOffsets
- XMLTypeSystemConsts: Class comment for XMLTypeSystemConsts.java goes here.
Class Summary
- AllFSs: support for collecting all FSs in a CAS over all views both indexed, and (optionally) reachable
- AnnotationTreeImpl<T extends AnnotationFS>: Implementation of annotation tree.
- AnnotationTreeNodeImpl<T extends AnnotationFS>
- ArrayElement: WARNING: Not an Inner Class!
- BinaryCasSerDes: Binary (mostly non compressed) CAS deserialization. The methods in this class were originally part of the CASImpl, and were moved here to this class for v3. Binary non compressed CAS serialization is in class CASSerializer, but that class uses routines and data structures in this class.
- BinaryCasSerDes4: User callable serialization and deserialization of the CAS in a compressed Binary Format. This serializes/deserializes the state of the CAS, assuming that the type information remains constant.
- BinaryCasSerDes6: User callable serialization and deserialization of the CAS in a compressed Binary Format. This serializes/deserializes the state of the CAS.
- BinaryCasSerDes6.ReuseInfo: Info reused for 1) multiple serializations of same cas to multiple targets (a speedup), or 2) for delta cas serialization, where it represents the fsStartIndex info before any mods were done which could change that info, or 3) for deserializing with a delta cas, where it represents the fsStartIndex info at the time the CAS was serialized out.
- BooleanConstraint: Implementation of boolean match constraint.
- BuiltinTypeKinds: Constants representing Built in type collections. String Sets: creatableArrays, primitiveTypeNames == noncreatable primitives, creatableBuiltinJcas (e.g.
- ByteHeap: the v2 CAS byte aux heap - used in modeling some binary (de)serialization
- CasCompare: Used by tests for Binary Compressed de/serialization code.
- CasCompare.FeatLists
- CasCompare.Prev: hold info about previous compares, to break cycles in references. The comparison records cycles and can distinguish different cyclic graphs.
- CasCompare.ScsKey: key for StringCongruenceSet
- CASCompleteSerializer: This is a small object which contains - CASMgrSerializer instance - a Java serializable form of the type system + index definitions - CASSerializer instance - a Java serializable form of the CAS including lists of which FSs are indexed
- CASImpl: Implements the CAS interfaces.
- CASImpl.FsChange: Journaling changes for computing delta cas.
- CASImpl.MeasureSwitchType
- CASImpl.SharedViewData
- CASImpl.SwitchControl: Instances are put into a Stack, to remember previous state to switch back to, when switching class loaders and locking the CAS. https://issues.apache.org/jira/browse/UIMA-6057
- CASMgrSerializer: Container for serialized CAS typing information.
- CasSeqAddrMaps: Used by Binary serialization form 4 and 6. Manage the conversion of FSs to relative sequential index number, and back. Manage the difference in two type systems, both size of the FSs and handling excluded types. During serialization, these maps are constructed before serialization.
- CASSerializer: This object has 2 purposes.
- CASSerializer.AddrPlusValue
- CasSerializerSupport: CAS serializer support for XMI and JSON formats.
- CasSerializerSupport.CasSerializerSupportSerialize
- CasTypeSystemMapper: This class gets initialized with two type systems, and then provides resources to map type and feature codes between them.
- CommonAuxHeap: Encapsulate 8, 16, and 64 bit storage for the CAS.
- CommonSerDes: Common de/serialization
- CommonSerDes.Header: HEADERS. Serialization versioning. There are 1 or 2 words used for versioning.
- CommonSerDes.Reading: byte swapping reads of integer forms
- CommonSerDesSequential: Common de/serialization for plain binary and compressed binary form 4 which both used to walk the cas using the sequential, incrementing id approach. Lifecycle: There is 0/1 instance per CAS, representing the FSs at some point in time in that CAS.
- ConjunctiveConstraint: Implements a conjunctive constraint.
- ConstraintFactoryImpl: Implementation of the ConstraintFactory interface.
- DebugFSLogicalStructure
- DebugFSLogicalStructure.IndexInfo: Class holding information about an FSIndex. Includes the "label" of the index, and a ref to the CAS this index contents are in.
- DebugFSLogicalStructure.UnexpandedFeatureStructures: Class for holding unexpanded feature structures
- DebugFSLogicalStructure.ViewInfo: Class holding info about a View/Sofa.
- DebugNameValuePair
- DeferredIndexUpdates: for XCAS and XMI deserialization, need to remember what's being added to the indexes and/or removed, because the actual FSs are not yet "fixed up" (adjusted for reference id's → actual addresses, including the sofa refs) for non-delta updates.
- DisjunctiveConstraint: Implements a disjunctive constraint.
- EmbeddedConstraint: Implement an embedded constraint.
- FeatureImpl: The implementation of features in the type system.
- FeatureImpl_jcas_only: The implementation of jcas-only features in the type system.
- FeaturePathImpl: Implementation of the feature path interface.
- FeatureStructureImplC: Feature structure implementation (for non JCas and JCas). Each FS has - int data - used for boolean, byte, short, int, long, float, double data -- long and double use 2 int slots - may be null if all slots are in JCas cover objects as fields - ref data - used for references to other Java objects, such as -- strings -- other feature structures -- arbitrary Java Objects - may be null if all slots are in JCas cover objects as fields - an id: an incrementing integer, starting at 1, per CAS, of all FSs created for that CAS - a ref to the casView where this FS was created - a ref to the TypeImpl for this class -- can't be static - may be multiple type systems in use
- FeatureStructureImplC.PrintReferences
- FeatureValuePathImpl: Contains CAS Type and Feature objects to represent a feature path of the form feature1/.../featureN.
- FilteredIterator<T extends FeatureStructure>: Implements a filtered iterator.
- FloatConstraint: Implement an embedded float constraint.
- FSBooleanConstraintImpl: See interface for documentation.
- FSClassRegistry: There is one **class** instance of this per UIMA core class loader.
- FSClassRegistry.ErrorReport
- FSClassRegistry.JCasClassFeatureInfo: Information about all features this JCas class defines. Used to expand the type system when the JCas defines more features than the type system declares.
- FSClassRegistry.JCasClassInfo: One instance per JCas class defined for it, per class loader - per class loader, because different JCas class definitions for the same name are possible, per class loader. Kept in maps, per class loader.
- FSData: WARNING: Not an Inner Class!
- FSFloatConstraintImpl: Implement the FSFloatConstraint interface.
- FsIndex_annotation<T extends AnnotationFS>: Implementation of annotation indexes.
- FsIndex_bag<T extends FeatureStructure>: Used for UIMA FS Bag Indexes. Uses ObjHashSet to hold instances of FeatureStructures
- FsIndex_flat<T extends FeatureStructure>: Common part of flattened indexes, used for both snapshot iterators and flattened sorted indexes, built from passed in instance of FsIndex_iicp
- FsIndex_iicp<T extends FeatureStructure>: FsIndex_iicp (iicp). A pair of an leaf index and an iterator cache.
- FsIndex_set_sorted<T extends FeatureStructure>: Common index impl for set and sorted indexes.
- FsIndex_singletype<T extends FeatureStructure>: The common (among all index kinds - set, sorted, bag) info for an index over 1 type (excluding subtypes). SubClasses FsIndex_bag, FsIndex_flat, FsIndex_set_sorted, define the actual index repository for each kind.
- FsIndex_snapshot<T extends FeatureStructure>: Implementation of light-weight wrapper of normal indexes, which support special kinds of iterators base on the setting of IteratorExtraFunction
- FSIndexComparatorImpl: Specifies the comparison to be used for an index, in terms of - the keys and the typeorder, in an order - the standard/reverse ordering
- FSIndexRepositoryImpl: There is one instance of this class per CAS View.
- FSIndexRepositoryImpl.IndexesForType: Information about all the indexes for a single type.
- FSIndexRepositoryImpl.ProcessedIndexInfo: For processing index updates in batch mode when deserializing from a remote service; lists of FSs that were added, removed, or reindexed; only used when processing updates in batch mode
- FSIndexRepositoryImpl.SharedIndexInfo: Information about indexes that is shared across all views
- FSIntConstraintImpl: Implement the FSIntConstraint interface.
- FsIterator_aggregation_common<T extends FeatureStructure>: Aggregate several FS iterators.
- FsIterator_backwards<T extends FeatureStructure>: Wraps FSIterator, runs it backwards
- FsIterator_bag<T extends FeatureStructure>
- FsIterator_bag_pear<T extends FeatureStructure>: This version of the FsIterator is used while iterating within a PEAR. Indexes keep references to the base (possibly non-pear) version of FSs.
- FsIterator_limited<T extends FeatureStructure>: Wraps FSIterator, limits results to n gets.
- FsIterator_multiple_indexes<T extends FeatureStructure>: Common code for both aggregation of indexes (e.g.
- FsIterator_set_sorted_pear<T extends FeatureStructure>
- FsIterator_set_sorted2<T extends FeatureStructure>: An iterator for a single type for a set or sorted index. NOTE: This is the version used for set/sorted iterators. It is built directly on top of a CopyOnWrite wrapper for OrderedFsSet_array. It uses the version of OrdereFsSet_array that has no embedded nulls
- FsIterator_singletype<T extends FeatureStructure>
- FsIterator_subtypes_ordered<T extends FeatureStructure>: Performs an ordered iteration among a set of iterators, each one corresponding to the type or subtype of the uppermost type.
- FsIterator_subtypes_snapshot<T extends FeatureStructure>
- FSIteratorImplBase<T extends FeatureStructure>: Version 2 compatibility only, not used internally in version 3. Base class for FSIterator implementations.
- FSsTobeAddedback: Record information on what was removed, from which view, and (optionally) how many times.
- FSsTobeAddedback.FSsTobeAddedbackMultiple: Version of this class used for protect blocks - where multiple FSs may be removed.
- FSsTobeAddedback.FSsTobeAddedbackSingle: Version of this class for recording 1 FS
- FSStringConstraintImpl: Implement the FSStringConstraint interface.
- FSTypeConstraintImpl: An implementation of the type constraint interface.
- Heap: the v2 CAS heap - used in modeling some binary (de)serialization
- Id2FS: A map from ints representing FS id's (or "addresses") to those FSs. There is one map instance per CAS (all views).
- Id2FS.MeasureCaller
- IntConstraint: Implement an embedded int constraint.
- LinearTypeOrderBuilderImpl: Implementation of the LinearTypeOrderBuilder interface.
- LinearTypeOrderBuilderImpl.TotalTypeOrder: An implementation of the LinearTypeOrder interface.
- LLUnambiguousIteratorImpl<T extends FeatureStructure>: Implements a low level ambiguous or unambiguous iterator over some type T which doesn't need to be a subtype of Annotation.
- LongHeap: the v2 CAS long aux heap - used in modeling some binary (de)serialization
- LongSet: Sets of long values, used to support ll_set/getIntValue that manipulate v2 style long data
- LowLevelIterator_empty<T extends FeatureStructure>: An empty Low-level FS iterator
- MarkerImpl: A MarkerImpl holds a high-water "mark" in the CAS, for all views.
- MethodHandlesLookup
- OutOfTypeSystemData: This class is used by the XCASDeserializer to store feature structures that do not fit into the type system of the CAS it is deserializing into.
- PathConstraint: Implements a constraint embedded under a path.
- SelectFSs_impl<T extends FeatureStructure>: Collection of builder style methods to specify selection of FSs from indexes; shift handled in this routine. Comment codes: AI = implies AnnotationIndex. Iterator varieties and impl: bounded? type order not unambig? strict? skipEq Priority? Needed? no coveredBy covering sameas; for not-bounded, - ignore strict and skipEq -- except: preceding implies skipping annotations whose end > positioning begin - order-not-needed only applies if iicp size > 1 - unambig ==> use Subiterator -- subiterator wraps: according to typePriority and order-not-needed - no Type Priority - need to pass in as arg to fsIterator_multiple_indexes == if no type priority, need to prevent rattling off the == type while compare is equal == affects both FsIterator_aggregation_common and FsIterator_subtypes_ordered; for 3 other boundings: - use subiterator, pass in strict and skipeq. finish this javadoc comment edit. T extends FeatureStructure, not TOP, because of ref from FSIndex which uses FeatureStructure for backwards compatibility
- Serialization: This class has no fields or instance methods, but instead has only static methods.
- ShortHeap: the v2 CAS short aux heap - used in modeling some binary (de)serialization
- SlotKinds: NOTE: adding or altering slots breaks backward compatability and the ability do deserialize previously serialized things. This definition shared with BinaryCasSerDes4. Define all the slot kinds.
- StringConstraint: Implement an embedded String constraint.
- StringHeap: Encapsulate string storage for the CAS.
- StringHeapDeserializationHelper: Support for legacy string heap format.
- StringSet: Like string heap, but keeps strings in a hashmap (for quick testing) and an array list.
- Subiterator<T extends AnnotationFS>: Subiterator implementation.
- TypeImpl: The implementation of types in the type system.
- TypeImpl_annot: A version of TypeImpl for Annotations and subtypes of Annotations
- TypeImpl_annotBase: A version of TypeImpl for the AnnotationBase type and its subtypes
- TypeImpl_array
- TypeImpl_list
- TypeImpl_primitive
- TypeImpl_string: String or String Subtype
- TypeImpl_stringSubtype
- TypeNameSpaceImpl
- TypeSystem2Xml: Dumps a Type System object to XML.
- TypeSystemImpl: Type system implementation.
- TypeSystemUtils: Type Utilities - all static, so class is abstract to prevent creation. Used by Feature Path
- TypeSystemUtils.FeatureParse
- TypeSystemUtils.NameSpaceParse
- TypeSystemUtils.ParsingError
- TypeSystemUtils.TypeParse
- TypeSystemUtils.TypeSystemParse
- XCASDeserializer: XCAS Deserializer.
- XCASDeserializer.FSInfo: Feature Structure plus all the indexes it is indexed in: indexRep -> indexMap -> indexRepositories -> indexRepository, or indexRep -> indexRepositories -> indexRepository (2nd if indexMap size == 1)
- XCASSerializer: XCAS serializer.
- XmiCasDeserializer: XMI CAS deserializer.
- XmiCasSerializer: CAS serializer for XMI format; writes a CAS in the XML Metadata Interchange (XMI) format.
- XmiSerializationSharedData: A container for data that is shared between the XmiCasSerializer and the XmiCasDeserializer.
- XmiSerializationSharedData.NameMultiValue
- XmiSerializationSharedData.OotsElementData: Data structure holding all information about an XMI element containing an out-of-typesystem FS.
- XmiSerializationSharedData.XmiArrayElement: Data structure holding the index and the xmi:id of an array or list element that is a reference to an out-of-typesystem FS.
Enum Summary
- AllowPreexistingFS
- BinaryCasSerDes4.Compression
- BinaryCasSerDes4.CompressLevel: Compression alternatives
- BinaryCasSerDes4.CompressStrat
- BinaryCasSerDes6.CompressLevel: Compression alternatives
- BinaryCasSerDes6.CompressStrat
- CasState: states the CAS can be in
- SlotKinds.SlotKind
- Subiterator.BoundsUse
- TypeSystemUtils.PathValid
Exception Summary
- AnnotationImplException: Exception class for package org.apache.uima.cas.impl.
- LowLevelException: Exception class for package org.apache.uima.cas.impl.
- XCASParsingException: Exception class for package org.apache.uima.cas.impl.