Package org.apache.uima.cas.impl
Class CasSerializerSupport.CasDocSerializer
- java.lang.Object
-
- org.apache.uima.cas.impl.CasSerializerSupport.CasDocSerializer
-
- Enclosing class:
- CasSerializerSupport
public class CasSerializerSupport.CasDocSerializer extends java.lang.Object
Use an inner class to hold the data for serializing a CAS. Each call to serialize() creates its own instance. Package private, to allow a test case to access it. Not static, in order to share the logger and the initializing values (this could be changed).
-
-
Field Summary
Fields
Modifier and Type, Field, Description
CASImpl
cas
private CasSerializerSupport.CasSerializerSupportSerialize
csss
private java.util.Set<TOP>
enqueued_multiRef_arrays_or_lists
Set of array or list FSs referenced from features marked as multipleReferencesAllowed, which have previously been serialized "inline" and which now need to be serialized as separate items. Set during enqueue scanning, to handle the case where the "visited_not_yet_written" set may have already recorded that this FS is already processed for enqueueing, but it is an array or list item which was being put "in-line" and no element is being written.
private org.xml.sax.ErrorHandler
errorHandler2
TypeSystemImpl
filterTypeSystem_inner
java.util.List<TOP>[]
indexedFSs
Array of Lists of all FS that are indexed in some view (other than sofas).
boolean
isDelta
Whether the serializer needs to serialize only the deltas, that is, new FSs created after the mark represented by the Marker object, and preexisting FSs and Views that have been modified.
boolean
isDynamicMultiRef
Set to true for the JSON configuration that uses dynamic multi-ref detection for arrays and lists.
boolean
isFiltering
Whether the serializer needs to check for filtered-out types/features.
boolean
isFormattedOutput_inner
MarkerImpl
marker
Used to tell if an FS was created before or after the mark.
java.util.List<TOP>
modifiedEmbeddedValueFSs
java.util.Set<TOP>
multiRefFSs
Set of FSs that have multiple references. Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation. This is for JSON, which computes the multi-refs itself rather than depending on the setting in a feature.
boolean
needNameSpaces
java.util.Set<java.lang.String>
nsPrefixesUsed
The set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (XMI serialization) being merged back in.
java.util.Map<java.lang.String,java.lang.String>
nsUriToPrefixMap
Map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string.
java.util.List<TOP>
previouslySerializedFSs
private java.util.Deque<TOP>
queue
FSs not in an index, but only being serialized because they're referenced.
XmiSerializationSharedData
sharedData
For delta serialization, holds the info gathered from deserialization that is needed for delta serialization, and for handling out-of-type-system data for both plain and delta serialization.
private TypeImpl[]
sortedUsedTypes
java.util.Comparator<TOP>
sortFssByType
Called for JSON serialization. Sorts a view by type, then by begin ascending and end descending for subtypes of Annotation, then by id.
TypeSystemImpl
tsi
XmlElementName[]
typeCode2namespaceNames
private java.util.BitSet
typeUsed
private java.util.Map<java.lang.String,java.lang.String>
uniqueStrings
java.util.Set<TOP>
visited_not_yet_written
set of FSs that have been visited and enqueued to be serialized - exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized.
-
Constructor Summary
Constructors
Constructor, Description
CasDocSerializer(org.xml.sax.ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss)
CasDocSerializer(org.xml.sax.ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss, boolean trackMultiRefs)
-
Method Summary
Modifier and Type, Method, Description
void
encodeFS(TOP fs)
Encode an individual FS.
private void
encodeFSs(java.util.List<TOP> fss)
void
encodeIndexed()
void
encodeQueued()
(package private) int
enqueueCommon(TOP fs)
private int
enqueueCommon(TOP fs, boolean doDeltaAndFilteringCheck)
(package private) int
enqueueCommonWithoutDeltaAndFilteringCheck(TOP fs)
private void
enqueueFeatures(TOP fs)
Enqueue all FSs reachable from features of the given FS.
private void
enqueueFeaturesOfFSs(java.util.List<TOP> fss)
private void
enqueueFeaturesOfIndexed()
Enqueue everything reachable from features of indexed FSs.
private void
enqueueFsAndMaybeFeatures(TOP fs)
Enqueue an FS, and everything reachable from it.
private void
enqueueFSArrayElements(FSArray fsArray)
Enqueues all FS reachable from an FSArray.
private void
enqueueFSListElements(FSList<TOP> node)
Enqueues all Head values of FSList reachable from an FSList.
private void
enqueueIncoming()
Enqueues all FS that are stored in the sharedData's id map.
private void
enqueueIndexed()
Adds the indexed FSs onto indexedFSs, by view.
(package private) void
enqueueIndexedFs_only_not_features(int viewNumber, TOP fs)
private void
enqueueNonsharedMultivaluedFS()
When serializing Delta CAS, enqueue encompassing FS of nonshared multivalued FS that have been modified.
(package private) int
getElementCountForSharedData()
java.lang.String
getNameSpacePrefix(java.lang.String uimaTypeName, java.lang.String nsUri, int lastDotIndex)
Sofa
getSofa(int sofaNum)
TypeImpl[]
getSortedUsedTypes()
java.lang.String
getTypeNameFromXmlElementName(XmlElementName xe)
java.lang.String
getUniqueString(java.lang.String s)
private java.lang.Iterable<TypeImpl>
getUsedTypesIterable()
java.lang.String
getXmiId(TOP fs)
Get the XMI ID to use for an FS.
int
getXmiIdAsInt(TOP fs)
private boolean
isListElementsMultiplyReferenced(TOP listNode)
For lists, checks whether this is a plain list: no loops, and no other refs to list elements from outside the list. If so, returns false. Adds all the elements of the list to visited_not_yet_written, noting if they've already been added; that indicates either a loop or another ref from outside, and in either case the method returns true.
private boolean
isMultiRef_enqueue(FeatureImpl fi, TOP featVal, boolean alreadyVisited, boolean isListNode, boolean isListFeat)
Ordinary FSs referenced as features are not checked by this routine; it is only called for FSLists of various kinds and FS arrays of various kinds. Not all featValues should be enqueued: list or array features which are marked **NOT** multiple-refs-allowed are serialized in-line; for JSON, when using dynamicMultiRef (the default), list / array FSs are serialized by ref (not in-line) if there are multiple refs to them; for XMI and JSON, any FS ref marked as multiple-refs-allowed forces the item onto the ref "queue".
boolean
isStaticMultiRef(FeatureImpl fi)
private void
reportMultiRefWarning(FeatureImpl fi)
void
serialize()
Starts serialization.
void
writeViewsCommons()
-
-
-
Field Detail
-
cas
public final CASImpl cas
-
tsi
public final TypeSystemImpl tsi
-
visited_not_yet_written
public final java.util.Set<TOP> visited_not_yet_written
Set of FSs that have been visited and enqueued to be serialized. Exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized. FSs are added to this set during the "enqueue" phase, prior to encoding.
Uses:
- for Arrays and Lists, to detect multi-refs
- for Lists, to detect loops
- during the enqueuing phase, to prevent multiple enqueuings
- during the encoding phase, to prevent multiple encodings
Public for use by JsonCasSerializer.
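The enqueue-once and multi-ref-detection roles described for this set can be illustrated with a minimal sketch. This is not the UIMA implementation: `FsNode`, `VisitedSketch`, and `enqueueOnce` are hypothetical names, and the use of an identity-based set is an assumption made for the illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical stand-in for a feature structure; not a UIMA class.
final class FsNode {
    final String name;

    FsNode(String name) {
        this.name = name;
    }
}

// Hypothetical sketch of enqueue-once bookkeeping.
final class VisitedSketch {
    // Identity-based set (an assumption of this sketch): membership depends
    // on object identity, not on equals().
    final Set<FsNode> visited =
            Collections.newSetFromMap(new IdentityHashMap<FsNode, Boolean>());
    final Deque<FsNode> queue = new ArrayDeque<>();

    // Returns true if fs was newly enqueued. A false result means fs was
    // reached before, i.e. a second reference (or a loop) was found.
    boolean enqueueOnce(FsNode fs) {
        if (!visited.add(fs)) {
            return false;
        }
        queue.addLast(fs);
        return true;
    }

    public static void main(String[] args) {
        VisitedSketch s = new VisitedSketch();
        FsNode a = new FsNode("a");
        System.out.println("first visit enqueued: " + s.enqueueOnce(a));
        System.out.println("second visit enqueued: " + s.enqueueOnce(a));
        System.out.println("queue size: " + s.queue.size());
    }
}
```

The boolean returned by `Set.add` does double duty here: it guards against a second enqueue and signals that the FS has more than one incoming reference.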
-
enqueued_multiRef_arrays_or_lists
private final java.util.Set<TOP> enqueued_multiRef_arrays_or_lists
Set of array or list FSs referenced from features marked as multipleReferencesAllowed, which have previously been serialized "inline" and which now need to be serialized as separate items. Set during enqueue scanning, to handle the case where the "visited_not_yet_written" set may have already recorded that this FS is already processed for enqueueing, but it is an array or list item which was being put "in-line" and no element is being written. It has array or list elements where the item needs to be enqueued onto the "queue" list.
Use: limits putting an item onto the "queue" list to one time.
-
multiRefFSs
public final java.util.Set<TOP> multiRefFSs
Set of FSs that have multiple references. Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation. This is for JSON, which computes the multi-refs itself rather than depending on the setting in a feature. This is also for XMI, to enable adding to "queue" (once) for each FS of this kind.
Used to:
- limit the number of times an FS is put onto the queue to 1
- skip encoding of items on "queue" if not in this Set (maybe not needed? 8/2017 mis)
- serialize if not in the indexed set, dynamic ref == true, and in this set (otherwise serialize only from ref)
-
isDynamicMultiRef
public final boolean isDynamicMultiRef
Set to true for the JSON configuration that uses dynamic multi-ref detection for arrays and lists.
-
previouslySerializedFSs
public java.util.List<TOP> previouslySerializedFSs
-
modifiedEmbeddedValueFSs
public java.util.List<TOP> modifiedEmbeddedValueFSs
-
indexedFSs
public final java.util.List<TOP>[] indexedFSs
Array of Lists of all FS that are indexed in some view (other than sofas). Array indexed by view.
-
queue
private final java.util.Deque<TOP> queue
FSs not in an index, but only being serialized because they're referenced. Exception: the sofas are here.
-
typeCode2namespaceNames
public XmlElementName[] typeCode2namespaceNames
-
typeUsed
private final java.util.BitSet typeUsed
-
needNameSpaces
public boolean needNameSpaces
-
nsUriToPrefixMap
public final java.util.Map<java.lang.String,java.lang.String> nsUriToPrefixMap
map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string
-
nsPrefixesUsed
public final java.util.Set<java.lang.String> nsPrefixesUsed
the set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (xmi serialization) being merged back in
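How a URI-to-prefix map and a set of used prefixes can cooperate to avoid collisions is sketched below. This is an illustration only: `NsPrefixSketch`, `prefixFor`, the example URIs, and the rule of appending a counter on collision are all assumptions, not taken from the UIMA source.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and the numeric-suffix rule are assumptions.
final class NsPrefixSketch {
    final Map<String, String> nsUriToPrefix = new HashMap<>();
    final Set<String> nsPrefixesUsed = new HashSet<>();

    // Returns the prefix already assigned to nsUri, or assigns a new one.
    // If the preferred prefix is taken, a counter is appended until the
    // candidate is free.
    String prefixFor(String nsUri, String preferred) {
        String existing = nsUriToPrefix.get(nsUri);
        if (existing != null) {
            return existing;
        }
        String candidate = preferred;
        int suffix = 2;
        while (!nsPrefixesUsed.add(candidate)) {
            candidate = preferred + suffix;
            suffix++;
        }
        nsUriToPrefix.put(nsUri, candidate);
        return candidate;
    }

    public static void main(String[] args) {
        NsPrefixSketch ns = new NsPrefixSketch();
        // Hypothetical URIs that would both prefer the prefix "a".
        System.out.println(ns.prefixFor("http:///org/example/a.ecore", "a"));
        System.out.println(ns.prefixFor("http:///org/other/a.ecore", "a"));
    }
}
```

Prefixes found in set-aside data could be added to `nsPrefixesUsed` before any call to `prefixFor`, which keeps newly generated prefixes from clashing with them.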
-
marker
public final MarkerImpl marker
Used to tell if a FS was created before or after mark.
-
sharedData
public final XmiSerializationSharedData sharedData
for Delta serialization, holds the info gathered from deserialization needed for delta serialization and for handling out-of-type-system data for both plain and delta serialization
-
isDelta
public final boolean isDelta
Whether the serializer needs to serialize only the deltas, that is, new FSs created after the mark represented by the Marker object, and preexisting FSs and Views that have been modified. Set to true if the Marker object is not null and the CASImpl object of this serializer matches the CASImpl in the Marker object.
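The condition stated for this flag can be written out as a small sketch. `CasStub`, `MarkerStub`, and `DeltaFlagSketch` are hypothetical stand-ins for illustration; comparing by object identity is an assumption about what "matches" means here.

```java
// Hypothetical stand-ins for CASImpl and MarkerImpl; not the UIMA classes.
final class CasStub {
}

final class MarkerStub {
    final CasStub cas; // the CAS this marker was created on

    MarkerStub(CasStub cas) {
        this.cas = cas;
    }
}

final class DeltaFlagSketch {
    // Delta mode applies only when a marker exists and it belongs to the
    // same CAS instance that is being serialized.
    static boolean isDelta(MarkerStub marker, CasStub cas) {
        return marker != null && marker.cas == cas;
    }

    public static void main(String[] args) {
        CasStub cas = new CasStub();
        System.out.println(isDelta(null, cas));
        System.out.println(isDelta(new MarkerStub(cas), cas));
        System.out.println(isDelta(new MarkerStub(new CasStub()), cas));
    }
}
```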
-
isFiltering
public final boolean isFiltering
Whether the serializer needs to check for filtered-out types/features. Set to true if type system of CAS does not match type system that was passed to constructor of serializer.
-
sortedUsedTypes
private TypeImpl[] sortedUsedTypes
-
errorHandler2
private final org.xml.sax.ErrorHandler errorHandler2
-
filterTypeSystem_inner
public TypeSystemImpl filterTypeSystem_inner
-
uniqueStrings
private final java.util.Map<java.lang.String,java.lang.String> uniqueStrings
-
isFormattedOutput_inner
public final boolean isFormattedOutput_inner
-
csss
private final CasSerializerSupport.CasSerializerSupportSerialize csss
-
sortFssByType
public final java.util.Comparator<TOP> sortFssByType
Called for JSON serialization. Sorts a view by type, then by begin ascending and end descending for subtypes of Annotation, then by id.
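A comparator with this kind of ordering can be sketched with the standard `java.util.Comparator` combinators. The sketch makes several assumptions: `FsStub` is a hypothetical flat record, every element is treated as an annotation with begin and end, types are ordered by an integer type code, and the end offset is compared in descending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical flat record standing in for an annotation-like FS.
final class FsStub {
    final int typeCode;
    final int begin;
    final int end;
    final int id;

    FsStub(int typeCode, int begin, int end, int id) {
        this.typeCode = typeCode;
        this.begin = begin;
        this.end = end;
        this.id = id;
    }
}

final class SortSketch {
    // Type first, then begin ascending, then end descending, then id.
    static final Comparator<FsStub> BY_TYPE_SPAN_ID =
            Comparator.comparingInt((FsStub f) -> f.typeCode)
                    .thenComparingInt(f -> f.begin)
                    .thenComparing(Comparator.comparingInt((FsStub f) -> f.end).reversed())
                    .thenComparingInt(f -> f.id);

    public static void main(String[] args) {
        List<FsStub> fss = new ArrayList<>(Arrays.asList(
                new FsStub(2, 0, 5, 1),
                new FsStub(1, 3, 4, 2),
                new FsStub(1, 0, 2, 3),
                new FsStub(1, 0, 9, 4)));
        fss.sort(BY_TYPE_SPAN_ID);
        for (FsStub f : fss) {
            System.out.println(f.typeCode + " " + f.begin + " " + f.end + " " + f.id);
        }
    }
}
```

Falling back to the id as the last key makes the order total, so the same view always serializes in the same sequence.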
-
-
Constructor Detail
-
CasDocSerializer
public CasDocSerializer(org.xml.sax.ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss)
- Parameters:
ch
cas
sharedData
marker
csss
-
CasDocSerializer
public CasDocSerializer(org.xml.sax.ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss, boolean trackMultiRefs)
-
-
Method Detail
-
reportMultiRefWarning
private void reportMultiRefWarning(FeatureImpl fi) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
serialize
public void serialize() throws java.lang.Exception
Starts serialization.
- Throws:
java.lang.Exception
-
getSofa
public Sofa getSofa(int sofaNum)
- Parameters:
sofaNum
- starts at 1
- Returns:
- the sofa FS, or null
-
writeViewsCommons
public void writeViewsCommons() throws java.lang.Exception
- Throws:
java.lang.Exception
-
getSortedUsedTypes
public TypeImpl[] getSortedUsedTypes()
-
getUsedTypesIterable
private java.lang.Iterable<TypeImpl> getUsedTypesIterable()
-
enqueueIncoming
private void enqueueIncoming()
Enqueues all FS that are stored in the sharedData's id map. This map is populated during the previous deserialization. This method is used to make sure that all incoming FS are echoed in the next serialization. It is required if there are out-of-type FSs that are being merged back into the serialized form; those might reference some of these.
-
enqueueIndexed
private void enqueueIndexed()
Adds the indexed FSs onto indexedFSs, by view. Adds the SofaFSs onto the by-ref queue.
-
enqueueNonsharedMultivaluedFS
private void enqueueNonsharedMultivaluedFS()
When serializing Delta CAS, enqueue encompassing FS of nonshared multivalued FS that have been modified. The embedded nonshared-multivalued item could be a list or an array
-
enqueueFeaturesOfIndexed
private void enqueueFeaturesOfIndexed() throws org.xml.sax.SAXException
Enqueue everything reachable from features of indexed FSs.
- Throws:
org.xml.sax.SAXException
-
enqueueFeaturesOfFSs
private void enqueueFeaturesOfFSs(java.util.List<TOP> fss) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
enqueueCommon
int enqueueCommon(TOP fs)
-
enqueueCommonWithoutDeltaAndFilteringCheck
int enqueueCommonWithoutDeltaAndFilteringCheck(TOP fs)
-
enqueueCommon
private int enqueueCommon(TOP fs, boolean doDeltaAndFilteringCheck)
- Parameters:
fs
doDeltaAndFilteringCheck
- Returns:
- true to have enqueue put onto "queue" and enqueue features
-
enqueueIndexedFs_only_not_features
void enqueueIndexedFs_only_not_features(int viewNumber, TOP fs)
-
enqueueFsAndMaybeFeatures
private void enqueueFsAndMaybeFeatures(TOP fs) throws org.xml.sax.SAXException
Enqueue an FS, and everything reachable from it. This call is mutually recursive with enqueueFeatures, so an arbitrarily long chain of references can cause a stack overflow error. Probably should fix this someday. See https://issues.apache.org/jira/browse/UIMA-106
- Parameters:
fs
- The FS to enqueue.
- Throws:
org.xml.sax.SAXException
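One common way to avoid the stack-depth problem noted for this method is to replace recursion with an explicit work stack. The sketch below shows that general technique only; it is not how this class is implemented, and `RefNode`, `IterativeEnqueueSketch`, `countReachable`, and `chain` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical node with outgoing references; not a UIMA class.
final class RefNode {
    final List<RefNode> refs = new ArrayList<>();
}

final class IterativeEnqueueSketch {
    // Visits everything reachable from root using a heap-allocated stack,
    // so the length of a reference chain does not consume call-stack frames.
    static int countReachable(RefNode root) {
        Set<RefNode> visited =
                Collections.newSetFromMap(new IdentityHashMap<RefNode, Boolean>());
        Deque<RefNode> work = new ArrayDeque<>();
        visited.add(root);
        work.push(root);
        while (!work.isEmpty()) {
            RefNode n = work.pop();
            for (RefNode r : n.refs) {
                if (visited.add(r)) {
                    work.push(r);
                }
            }
        }
        return visited.size();
    }

    // Builds a linear chain of the given length and returns its head.
    static RefNode chain(int length) {
        RefNode head = new RefNode();
        RefNode cur = head;
        for (int i = 1; i < length; i++) {
            RefNode next = new RefNode();
            cur.refs.add(next);
            cur = next;
        }
        return head;
    }

    public static void main(String[] args) {
        System.out.println(countReachable(chain(100_000)));
    }
}
```

The visited set also makes the traversal safe on cyclic graphs, since a node is pushed at most once.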
-
isListElementsMultiplyReferenced
private boolean isListElementsMultiplyReferenced(TOP listNode)
For lists, checks whether this is a plain list: no loops, and no other refs to list elements from outside the list. If so, returns false. Adds all the elements of the list to visited_not_yet_written, noting if they've already been added; that indicates either a loop or another ref from outside, and in either case the method returns true.
- Parameters:
listNode
- Returns:
- false if no list element is multiply-referenced, true if there is a loop or another ref from outside the list, for one or more list element nodes
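The check described here can be sketched as a single walk over the list that records each node in a visited set. The sketch is a simplified illustration under stated assumptions: `ListCell` and `ListRefSketch` are hypothetical, the visited set is identity-based, and the walk stops at the first node that was already recorded.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical singly linked list cell; not a UIMA class.
final class ListCell {
    ListCell tail; // null marks the end of the list
}

final class ListRefSketch {
    static Set<ListCell> newVisitedSet() {
        return Collections.newSetFromMap(new IdentityHashMap<ListCell, Boolean>());
    }

    // Walks the list from head, recording each cell in visited.
    // Returns true as soon as a cell is found that was recorded earlier,
    // which means either the list loops or something else references it.
    static boolean isMultiplyReferenced(ListCell head, Set<ListCell> visited) {
        for (ListCell cell = head; cell != null; cell = cell.tail) {
            if (!visited.add(cell)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ListCell a = new ListCell();
        ListCell b = new ListCell();
        a.tail = b;
        System.out.println(isMultiplyReferenced(a, newVisitedSet()));
        b.tail = a; // introduce a loop
        System.out.println(isMultiplyReferenced(a, newVisitedSet()));
    }
}
```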
-
isMultiRef_enqueue
private boolean isMultiRef_enqueue(FeatureImpl fi, TOP featVal, boolean alreadyVisited, boolean isListNode, boolean isListFeat) throws org.xml.sax.SAXException
Ordinary FSs referenced as features are not checked by this routine; it is only called for FSLists of various kinds and FS arrays of various kinds.
Not all featValues should be enqueued:
- list or array features which are marked **NOT** multiple-refs-allowed are serialized in-line
- for JSON, when using dynamicMultiRef (the default), list / array FSs are serialized by ref (not in-line) if there are multiple refs to them
- for XMI and JSON, any FS ref marked as multiple-refs-allowed forces the item onto the ref "queue"
(Not handled here: ordinary FSs are serialized in-line in JSON with isDynamicMultiRef.)
- Parameters:
fi
- the feature, to look up the multiRefAllowed flag
featVal
- the List or array element
alreadyVisited
- true if visited_not_yet_written contains the featVal
isListNode
isListFeat
- Returns:
- false if should skip enqueue because this array or list is being serialized inline
- Throws:
org.xml.sax.SAXException
-
enqueueFeatures
private void enqueueFeatures(TOP fs) throws org.xml.sax.SAXException
Enqueue all FSs reachable from features of the given FS.
- Parameters:
fs
- the FS whose features are scanned
- Throws:
org.xml.sax.SAXException
-
enqueueFSArrayElements
private void enqueueFSArrayElements(FSArray fsArray) throws org.xml.sax.SAXException
Enqueues all FS reachable from an FSArray.
- Parameters:
fsArray
- the FSArray
- Throws:
org.xml.sax.SAXException
-
enqueueFSListElements
private void enqueueFSListElements(FSList<TOP> node) throws org.xml.sax.SAXException
Enqueues all Head values of FSList reachable from an FSList. This does NOT include the list nodes themselves.
- Parameters:
node
- the FSList
- Throws:
org.xml.sax.SAXException
-
encodeIndexed
public void encodeIndexed() throws java.lang.Exception
- Throws:
java.lang.Exception
-
encodeFSs
private void encodeFSs(java.util.List<TOP> fss) throws java.lang.Exception
- Throws:
java.lang.Exception
-
encodeQueued
public void encodeQueued() throws java.lang.Exception
- Throws:
java.lang.Exception
-
encodeFS
public void encodeFS(TOP fs) throws java.lang.Exception
Encode an individual FS. JSON has 2 encodings.
For type: "typeName" : [ { "@id" : 123, feat : value .... }, { "@id" : 456, feat : value .... }, ... ], ...
For id: "nnnn" : {"@type" : typeName ; feat : value ...}
For cases where the top level type is an array or list, there is a generated feature name, "@collection", whose value is the list or array of values associated with that type.
- Parameters:
fs
- the FS to be encoded.
- Throws:
org.xml.sax.SAXException
- passthru
java.lang.Exception
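The difference between the by-type and by-id shapes can be made concrete by building both as nested maps. This is an illustration of the two shapes only, not the serializer's code: `FsRecord`, `JsonShapeSketch`, `byType`, and `byId` are hypothetical, and the "@collection" case is not covered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record for one FS; not a UIMA class.
final class FsRecord {
    final int id;
    final String typeName;
    final Map<String, Object> feats;

    FsRecord(int id, String typeName, Map<String, Object> feats) {
        this.id = id;
        this.typeName = typeName;
        this.feats = feats;
    }
}

final class JsonShapeSketch {
    // By type: typeName -> [ {"@id": id, feat: value, ...}, ... ]
    static Map<String, List<Map<String, Object>>> byType(List<FsRecord> fss) {
        Map<String, List<Map<String, Object>>> out = new LinkedHashMap<>();
        for (FsRecord fs : fss) {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("@id", fs.id);
            entry.putAll(fs.feats);
            out.computeIfAbsent(fs.typeName, k -> new ArrayList<>()).add(entry);
        }
        return out;
    }

    // By id: "id" -> {"@type": typeName, feat: value, ...}
    static Map<String, Map<String, Object>> byId(List<FsRecord> fss) {
        Map<String, Map<String, Object>> out = new LinkedHashMap<>();
        for (FsRecord fs : fss) {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("@type", fs.typeName);
            entry.putAll(fs.feats);
            out.put(Integer.toString(fs.id), entry);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> feats = new LinkedHashMap<>();
        feats.put("begin", 0);
        List<FsRecord> fss = new ArrayList<>();
        fss.add(new FsRecord(123, "Token", feats));
        fss.add(new FsRecord(456, "Token", feats));
        System.out.println(byType(fss));
        System.out.println(byId(fss));
    }
}
```

In the by-type shape the type name is stated once per group and each entry carries its id; in the by-id shape each entry is addressable by id and carries its own type.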
-
getElementCountForSharedData
int getElementCountForSharedData()
-
getXmiId
public java.lang.String getXmiId(TOP fs)
Get the XMI ID to use for an FS.
- Parameters:
fs
- the FS
- Returns:
- XMI ID or null
-
getXmiIdAsInt
public int getXmiIdAsInt(TOP fs)
-
getNameSpacePrefix
public java.lang.String getNameSpacePrefix(java.lang.String uimaTypeName, java.lang.String nsUri, int lastDotIndex)
-
getUniqueString
public java.lang.String getUniqueString(java.lang.String s)
-
getTypeNameFromXmlElementName
public java.lang.String getTypeNameFromXmlElementName(XmlElementName xe)
-
isStaticMultiRef
public boolean isStaticMultiRef(FeatureImpl fi)
-
-