Package org.apache.uima.cas.impl
Class BinaryCasSerDes4.Serializer
- java.lang.Object
-
- org.apache.uima.cas.impl.BinaryCasSerDes4.Serializer
-
- Enclosing class:
- BinaryCasSerDes4
private class BinaryCasSerDes4.Serializer extends java.lang.Object
Class instantiated once per serialization. Multiple serializations in parallel are supported, with multiple instances of this class.
-
-
Nested Class Summary
Nested Classes
Modifier and Type Class Description
class
BinaryCasSerDes4.Serializer.SerializeModifiedFSs
-
Field Summary
Fields
Modifier and Type Field Description
private java.io.ByteArrayOutputStream[]
baosZipSources
private CASImpl
baseCas
private BinaryCasSerDes
bcsd
private java.io.DataOutputStream
byte_dos
private BinaryCasSerDes4.CompressLevel
compressLevel
private BinaryCasSerDes4.CompressStrat
compressStrategy
private java.io.DataOutputStream
control_dos
private CommonSerDesSequential
csds
private boolean
doMeasurement
private java.io.DataOutputStream[]
dosZipSources
private java.io.DataOutputStream
double_Exponent_dos
private java.io.DataOutputStream
double_Mantissa_Sign_dos
private java.io.DataOutputStream
float_Exponent_dos
private java.io.DataOutputStream
float_Mantissa_Sign_dos
private Obj2IntIdentityHashMap<TOP>
fs2seq
Converts between FSs and "sequential" numbers. This is for compression efficiency and is also needed for backwards compatibility with v2 serialization forms, where index information was written using "sequential" numbers. Note: this may be an identity map, but may not be in the case of V3, where some FSs are GC'd. Contrast with fs2addr and addr2fs in csds, which use the pseudo v2 addresses as the int.
private java.io.DataOutputStream
fsIndexes_dos
private int
heapEnd
end of heap, in v2 pseudo-addr coordinates = addr of last + length of last
private int
heapStart
start of heap, in v2 pseudo-addr coordinates
private boolean
isDelta
private boolean
isTsi
private MarkerImpl
mark
private boolean
only1CommonString
private OptimizeStrings
os
private TOP
prevFs
private TOP[]
prevFsByType
For differencing when reading and writing.
private java.io.DataOutputStream
serializedOut
private SerializationMeasures
sm
private java.io.DataOutputStream
strLength_dos
private java.io.DataOutputStream
strOffset_dos
private java.io.DataOutputStream
strSeg_dos
private java.io.DataOutputStream
typeCode_dos
private PositiveIntSet
uimaSerializableSavedToCas
Set of FSes on which UimaSerializable _save_to_cas_data has already been called.
-
Constructor Summary
Constructors
Modifier Constructor Description
private
Serializer(CASImpl cas, java.io.DataOutputStream serializedOut, MarkerImpl mark, SerializationMeasures sm, BinaryCasSerDes4.CompressLevel compressLevel, BinaryCasSerDes4.CompressStrat compressStrategy, boolean isTsi)
-
Method Summary
Modifier and Type Method Description
private void
collectAndZip()
Method: write with deflation into a single byte array stream; skip if not worth deflating; skip the Slot_Control stream; record in the Slot_Control stream, for each deflated stream: the Slot index, the number of compressed bytes, the number of uncompressed bytes. Add to header: nbr of compressed entries, the Slot_Control stream size, the Slot_Control stream, all the zipped streams.
private int
compressFsxPart(int[] fsIndexes, int fsNdxStart, CommonSerDesSequential csds)
private int
encodeIntSign(int v)
private void
extractStrings(TOP fs)
Adds strings to the OptimizeStrings object. If delta, only processes FSs that are new; modified string values are picked up when scanning FsChange items.
private void
extractStringsFromModifications(CASImpl.FsChange fsChange)
For delta, for each fsChange element, extract any strings.
private int
fs2seq(TOP fs)
private int
getPrevArray0HeapRef()
private int
getPrevArray0Int()
private boolean
isNoPrevArrayValue(CommonArrayFS prevCommonArray)
private void
serialize()
Form 4 serialization is tied to the layout of V2 Feature Structures in heaps.
private void
serializeArray(TOP fs)
private int
serializeArrayLength(TOP fs)
private void
serializeByKind(TOP fs, FeatureImpl feat)
private void
serializeIndexedFeatureStructures(CommonSerDesSequential csds)
private void
writeDiff(int kind, int v, int prev)
Encoding: bit 6 = sign (1 = negative); bit 7 = delta (1 = delta).
private void
writeDouble(long raw)
private void
writeFloat(int raw)
Need to support NAN sets, 0x7fc....
private void
writeFs(TOP fs)
private void
writeLong(long v, long prev)
private void
writeString(java.lang.String s)
String encoding: Length = 0 is used for null, no offset written; Length = 1 is used for "", no offset written; Length > 0 (subtract 1) is used for the actual string length; Length < 0 uses (-length) as the slot index (minimum is 1, slot 0 is NULL). For length > 0, write also the offset.
private void
writeStringInfo()
Write the compressed string table(s)
private void
writeUnsignedByte(java.io.DataOutputStream s, int v)
private void
writeVnumber(int kind, int v)
private void
writeVnumber(int kind, long v)
private void
writeVnumber(java.io.DataOutputStream s, int v)
private void
writeVnumber(java.io.DataOutputStream s, long v)
-
-
-
Field Detail
-
serializedOut
private final java.io.DataOutputStream serializedOut
-
baseCas
private final CASImpl baseCas
-
bcsd
private final BinaryCasSerDes bcsd
-
mark
private final MarkerImpl mark
-
sm
private final SerializationMeasures sm
-
baosZipSources
private final java.io.ByteArrayOutputStream[] baosZipSources
-
dosZipSources
private final java.io.DataOutputStream[] dosZipSources
-
heapStart
private int heapStart
start of heap, in v2 pseudo-addr coordinates
-
heapEnd
private int heapEnd
end of heap, in v2 pseudo-addr coordinates = addr of last + length of last
-
isDelta
private final boolean isDelta
-
isTsi
private final boolean isTsi
-
doMeasurement
private final boolean doMeasurement
-
os
private final OptimizeStrings os
-
compressLevel
private final BinaryCasSerDes4.CompressLevel compressLevel
-
compressStrategy
private final BinaryCasSerDes4.CompressStrat compressStrategy
-
prevFsByType
private final TOP[] prevFsByType
For differencing when reading and writing. Also used for arrays to difference the 0th element.
-
prevFs
private TOP prevFs
-
only1CommonString
private boolean only1CommonString
-
byte_dos
private final java.io.DataOutputStream byte_dos
-
typeCode_dos
private final java.io.DataOutputStream typeCode_dos
-
strOffset_dos
private final java.io.DataOutputStream strOffset_dos
-
strLength_dos
private final java.io.DataOutputStream strLength_dos
-
float_Mantissa_Sign_dos
private final java.io.DataOutputStream float_Mantissa_Sign_dos
-
float_Exponent_dos
private final java.io.DataOutputStream float_Exponent_dos
-
double_Mantissa_Sign_dos
private final java.io.DataOutputStream double_Mantissa_Sign_dos
-
double_Exponent_dos
private final java.io.DataOutputStream double_Exponent_dos
-
fsIndexes_dos
private final java.io.DataOutputStream fsIndexes_dos
-
control_dos
private final java.io.DataOutputStream control_dos
-
strSeg_dos
private final java.io.DataOutputStream strSeg_dos
-
csds
private final CommonSerDesSequential csds
-
fs2seq
private final Obj2IntIdentityHashMap<TOP> fs2seq
Converts between FSs and "sequential" numbers. This is for compression efficiency and is also needed for backwards compatibility with v2 serialization forms, where index information was written using "sequential" numbers. Note: this may be an identity map, but may not be in the case of V3, where some FSs are GC'd. Contrast with fs2addr and addr2fs in csds, which use the pseudo v2 addresses as the int.
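The mapping described above can be illustrated with a small sketch. It is not the UIMA implementation: the `Fs` class, the `Fs2SeqSketch` name, and the choice to reserve 0 for null are assumptions made for illustration, and `java.util.IdentityHashMap` stands in for `Obj2IntIdentityHashMap`. It shows how sequential numbers diverge from ids once an FS has been garbage collected.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for an FS-to-sequential-number map.
final class Fs2SeqSketch {

  // Hypothetical feature structure carrying only an id.
  static final class Fs {
    final int id;

    Fs(int id) {
      this.id = id;
    }
  }

  // Keys are compared by reference, not by equals().
  private final Map<Fs, Integer> fs2seq = new IdentityHashMap<>();

  // Assumption: 0 is reserved for "no FS", so numbering starts at 1.
  private int nextSeq = 1;

  // Assigns the next sequential number to an FS not seen before.
  void add(Fs fs) {
    if (!fs2seq.containsKey(fs)) {
      fs2seq.put(fs, nextSeq++);
    }
  }

  // Returns the sequential number, or 0 for null or unknown FSs.
  int seqOf(Fs fs) {
    if (fs == null) {
      return 0;
    }
    Integer seq = fs2seq.get(fs);
    return seq == null ? 0 : seq;
  }
}
```

If FSs with ids 1, 2 and 4 are added in that order (id 3 having been collected), the FS with id 4 receives sequential number 3, so the map is no longer the identity on ids.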
-
uimaSerializableSavedToCas
private PositiveIntSet uimaSerializableSavedToCas
Set of FSes on which UimaSerializable _save_to_cas_data has already been called.
-
-
Constructor Detail
-
Serializer
private Serializer(CASImpl cas, java.io.DataOutputStream serializedOut, MarkerImpl mark, SerializationMeasures sm, BinaryCasSerDes4.CompressLevel compressLevel, BinaryCasSerDes4.CompressStrat compressStrategy, boolean isTsi)
- Parameters:
cas
-
serializedOut
-
mark
-
sm
-
compressLevel
-
compressStrategy
-
-
-
Method Detail
-
serialize
private void serialize() throws java.io.IOException
Form 4 serialization is tied to the layout of V2 Feature Structures in heaps. It does not walk the indexes to serialize just those FSs that are reachable. For V3, it scans the CASImpl.id2fs information and serializes those (except those which have been GC'd). The seq numbers of the target incrementing sequentially will be different from the source id's if some FSs were GC'd. To determine for delta what new strings and new
- Throws:
java.io.IOException
-
writeStringInfo
private void writeStringInfo() throws java.io.IOException
Write the compressed string table(s)
- Throws:
java.io.IOException
-
writeFs
private void writeFs(TOP fs) throws java.io.IOException
- Throws:
java.io.IOException
-
serializeIndexedFeatureStructures
private void serializeIndexedFeatureStructures(CommonSerDesSequential csds) throws java.io.IOException
- Throws:
java.io.IOException
-
compressFsxPart
private int compressFsxPart(int[] fsIndexes, int fsNdxStart, CommonSerDesSequential csds) throws java.io.IOException
- Throws:
java.io.IOException
-
serializeArray
private void serializeArray(TOP fs) throws java.io.IOException
- Throws:
java.io.IOException
-
getPrevArray0HeapRef
private int getPrevArray0HeapRef()
-
getPrevArray0Int
private int getPrevArray0Int()
-
isNoPrevArrayValue
private boolean isNoPrevArrayValue(CommonArrayFS prevCommonArray)
-
serializeByKind
private void serializeByKind(TOP fs, FeatureImpl feat) throws java.io.IOException
- Throws:
java.io.IOException
-
serializeArrayLength
private int serializeArrayLength(TOP fs) throws java.io.IOException
- Throws:
java.io.IOException
-
collectAndZip
private void collectAndZip() throws java.io.IOException
Method:
  write with deflation into a single byte array stream
    skip if not worth deflating
    skip the Slot_Control stream
    record in the Slot_Control stream, for each deflated stream:
      the Slot index
      the number of compressed bytes
      the number of uncompressed bytes
  add to header:
    nbr of compressed entries
    the Slot_Control stream size
    the Slot_Control stream
    all the zipped streams
- Throws:
java.io.IOException
- passthru
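The steps listed for collectAndZip can be sketched with `java.util.zip.Deflater`. This is a minimal illustration, not the UIMA code: the class and method names are hypothetical, fixed-width `writeInt` fields replace whatever number encoding the real format uses, and "not worth deflating" is assumed to mean "empty stream".

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch of "deflate each stream, describe it in a control stream".
final class CollectAndZipSketch {

  // Compresses one stream's bytes with the default zlib settings.
  static byte[] deflate(byte[] input) {
    Deflater deflater = new Deflater();
    try {
      deflater.setInput(input);
      deflater.finish();
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      while (!deflater.finished()) {
        out.write(buf, 0, deflater.deflate(buf));
      }
      return out.toByteArray();
    } finally {
      deflater.end();
    }
  }

  // Reverses deflate(); the uncompressed length comes from the control stream.
  static byte[] inflate(byte[] compressed, int uncompressedLength) {
    Inflater inflater = new Inflater();
    try {
      inflater.setInput(compressed);
      byte[] out = new byte[uncompressedLength];
      int done = 0;
      while (done < out.length && !inflater.finished()) {
        int n = inflater.inflate(out, done, out.length - done);
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          break;
        }
        done += n;
      }
      if (done != out.length) {
        throw new IllegalStateException("unexpected uncompressed length");
      }
      return out;
    } catch (DataFormatException e) {
      throw new IllegalStateException(e);
    } finally {
      inflater.end();
    }
  }

  // Output layout: entry count, control size, control stream, zipped streams.
  static byte[] collectAndZip(byte[][] slotStreams) {
    try {
      ByteArrayOutputStream controlBytes = new ByteArrayOutputStream();
      DataOutputStream control = new DataOutputStream(controlBytes);
      ByteArrayOutputStream zipped = new ByteArrayOutputStream();
      int entries = 0;
      for (int slot = 0; slot < slotStreams.length; slot++) {
        byte[] raw = slotStreams[slot];
        if (raw.length == 0) {
          continue; // assumption: an empty stream is "not worth deflating"
        }
        byte[] compressed = deflate(raw);
        control.writeInt(slot);              // the slot index
        control.writeInt(compressed.length); // number of compressed bytes
        control.writeInt(raw.length);        // number of uncompressed bytes
        zipped.write(compressed, 0, compressed.length);
        entries++;
      }
      control.flush();
      ByteArrayOutputStream result = new ByteArrayOutputStream();
      DataOutputStream header = new DataOutputStream(result);
      header.writeInt(entries);
      header.writeInt(controlBytes.size());
      controlBytes.writeTo(header); // the control stream itself is not deflated
      zipped.writeTo(header);
      header.flush();
      return result.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Recording both lengths per entry lets a reader slice each compressed region out of the combined output and size the inflate buffer up front.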
-
writeLong
private void writeLong(long v, long prev) throws java.io.IOException
- Throws:
java.io.IOException
-
writeString
private void writeString(java.lang.String s) throws java.io.IOException
String encoding
  Length = 0 - used for null, no offset written
  Length = 1 - used for "", no offset written
  Length > 0 (subtract 1): used for actual string length
  Length < 0 - use (-length) as slot index (minimum is 1, slot 0 is NULL)
For length > 0, write also the offset.
- Throws:
java.io.IOException
- passthru
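The length codes described for writeString can be sketched as a pure function. The class name, the method names, and the `commonSlot` parameter (a slot index for a string stored in a shared table, or 0 if there is none) are hypothetical, and the sketch assumes that an offset accompanies only codes greater than 1, since the description states that neither null nor "" has one.

```java
// Hypothetical sketch of the string length codes; not the UIMA implementation.
final class StringLengthCodeSketch {

  // Returns the value that would go to the string-length stream.
  static int lengthCode(String s, int commonSlot) {
    if (s == null) {
      return 0; // null: no offset written
    }
    if (s.isEmpty()) {
      return 1; // "": no offset written
    }
    if (commonSlot > 0) {
      return -commonSlot; // negative code: (-code) is the slot index, minimum 1
    }
    return s.length() + 1; // actual length is recovered by subtracting 1
  }

  // Assumption: only codes for non-empty, non-slot strings carry an offset.
  static boolean hasOffset(int code) {
    return code > 1;
  }
}
```

Shifting real lengths up by one is what frees 0 and 1 to mark null and the empty string without needing a separate flag.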
-
writeFloat
private void writeFloat(int raw) throws java.io.IOException
Need to support NAN sets:
  0x7fc.... for NAN
  0xff8.... for NAN, negative infinity
  0x7f8 for NAN, positive infinity
Because 0 occurs frequently, we reserve exp of 0 for the value 0.
- Parameters:
raw
- the number to write
- Throws:
java.io.IOException
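The idea of reserving an exponent value of 0 for the float 0 can be sketched as below. This is only an illustration under stated assumptions: the class and method names are hypothetical, the exponent is shifted up by one to free the value 0, and the sign is packed into the low bit of the mantissa value. The real method may arrange the mantissa bits differently.

```java
// Hypothetical sketch: split IEEE 754 float bits into two small values
// that could be written to separate exponent and mantissa/sign streams.
final class FloatSplitSketch {

  // Returns {exponentCode, mantissaAndSign}. exponentCode 0 means "raw == 0",
  // in which case nothing needs to go to the mantissa/sign stream.
  static int[] split(int raw) {
    if (raw == 0) {
      return new int[] {0, 0};
    }
    int exponentCode = ((raw >>> 23) & 0xff) + 1; // 1..256, since 0 is reserved
    int mantissa = raw & 0x007fffff;              // low 23 bits
    int sign = raw >>> 31;                        // 1 for negative
    return new int[] {exponentCode, (mantissa << 1) | sign};
  }

  // Rebuilds the raw bits from the two parts.
  static int join(int exponentCode, int mantissaAndSign) {
    if (exponentCode == 0) {
      return 0;
    }
    int sign = mantissaAndSign & 1;
    int mantissa = mantissaAndSign >>> 1;
    return (sign << 31) | ((exponentCode - 1) << 23) | mantissa;
  }
}
```

Because the split works on raw bits (for example from `Float.floatToRawIntBits`), NaN payloads, infinities, and negative zero all survive the round trip unchanged.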
-
writeVnumber
private void writeVnumber(int kind, int v) throws java.io.IOException
- Throws:
java.io.IOException
-
writeVnumber
private void writeVnumber(int kind, long v) throws java.io.IOException
- Throws:
java.io.IOException
-
writeVnumber
private void writeVnumber(java.io.DataOutputStream s, int v) throws java.io.IOException
- Throws:
java.io.IOException
-
writeVnumber
private void writeVnumber(java.io.DataOutputStream s, long v) throws java.io.IOException
- Throws:
java.io.IOException
-
writeUnsignedByte
private void writeUnsignedByte(java.io.DataOutputStream s, int v) throws java.io.IOException
- Throws:
java.io.IOException
-
writeDouble
private void writeDouble(long raw) throws java.io.IOException
- Throws:
java.io.IOException
-
encodeIntSign
private int encodeIntSign(int v)
-
writeDiff
private void writeDiff(int kind, int v, int prev) throws java.io.IOException
Encoding:
  bit 6 = sign: 1 = negative
  bit 7 = delta: 1 = delta
- Parameters:
kind
- the kind of slot
i
- runs from iHeap + 3 to end of array
- Throws:
java.io.IOException
- passthru
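A sign flag and a delta flag of this kind can be sketched as follows. Several points are assumptions rather than facts from this page: the bits are taken to be numbered from the most significant end of a byte, so "bit 7" is the lowest-order bit and "bit 6" the next one; the delta form is chosen only when the value and the previous value have the same sign and the difference is smaller in magnitude; and all names are hypothetical.

```java
// Hypothetical sketch of a value-or-delta code with two flag bits.
final class DiffEncodingSketch {

  static final long DELTA_FLAG = 1L; // assumed "bit 7": 1 = stored as a delta
  static final long SIGN_FLAG = 2L;  // assumed "bit 6": 1 = negative

  // Encodes v either directly or as a difference from prev, whichever is smaller.
  static long encode(int v, int prev) {
    long absV = Math.abs((long) v);
    long diff = (long) v - prev;
    long absDiff = Math.abs(diff);
    boolean sameSign = (v > 0 && prev > 0) || (v < 0 && prev < 0);
    if (sameSign && absDiff < absV) {
      return (absDiff << 2) | DELTA_FLAG | (diff < 0 ? SIGN_FLAG : 0L);
    }
    return (absV << 2) | (v < 0 ? SIGN_FLAG : 0L);
  }

  // Recovers v from the code and the same prev value used when encoding.
  static int decode(long code, int prev) {
    long magnitude = code >>> 2;
    long signed = (code & SIGN_FLAG) != 0 ? -magnitude : magnitude;
    return (int) ((code & DELTA_FLAG) != 0 ? prev + signed : signed);
  }
}
```

Under these assumptions, encoding 105 against a previous value of 100 stores the magnitude 5 with the delta flag set, which gives a much smaller number for a variable-length writer than 105 itself.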
-
extractStrings
private void extractStrings(TOP fs)
Adds strings to the OptimizeStrings object. If delta, only processes FSs that are new; modified string values are picked up when scanning FsChange items.
- Parameters:
fs
- feature structure
-
extractStringsFromModifications
private void extractStringsFromModifications(CASImpl.FsChange fsChange)
For delta, for each fsChange element, extract any strings.
- Parameters:
fsChange
-
-
fs2seq
private int fs2seq(TOP fs)
-
-