Package org.codehaus.jackson.smile
Class SmileConstants
- java.lang.Object
-
- org.codehaus.jackson.smile.SmileConstants
-
public final class SmileConstants extends java.lang.Object
Constants used by SmileGenerator and SmileParser
- Author:
- tatu
-
-
Field Summary
Fields
Modifier and Type, Field, Description
static byte BYTE_MARKER_END_OF_CONTENT
In addition we can use a marker to allow simple framing; splitting of physical data (like file) into distinct logical sections like JSON documents.
static byte BYTE_MARKER_END_OF_STRING
static int HEADER_BIT_HAS_RAW_BINARY
Indicator bit that indicates whether encoded content may contain raw (unquoted) binary values.
static int HEADER_BIT_HAS_SHARED_NAMES
Indicator bit that indicates whether encoded content may have shared names (back references to recently encoded field names).
static int HEADER_BIT_HAS_SHARED_STRING_VALUES
Indicator bit that indicates whether encoded content may have shared String values (back references to recently encoded 'short' String values, where short is defined as 64 bytes or less).
static byte HEADER_BYTE_1
First byte of data header
static byte HEADER_BYTE_2
Second byte of data header
static byte HEADER_BYTE_3
Third byte of data header
static byte HEADER_BYTE_4
Fourth byte of data header; contains version nibble, may have flags
static int HEADER_VERSION_0
Current version consists of four zero bits (nibble)
static int INT_MARKER_END_OF_STRING
We need a byte marker to denote end of variable-length Strings.
static int MAX_SHARED_NAMES
Longest back reference we use for field names is 10 bits; no point in keeping much more around
static int MAX_SHARED_STRING_LENGTH_BYTES
Also: whereas we can refer to names of any length, we will only consider text values that are considered "tiny" or "short" (ones encoded with length prefix); this value thereby has to be maximum length of Strings that can be encoded as such.
static int MAX_SHARED_STRING_VALUES
Longest back reference we use for short shared String values is 10 bits, so up to (1 << 10) values to keep track of.
static int MAX_SHORT_NAME_ASCII_BYTES
Encoding has special "short" forms for field names that can be represented by 64 bytes of UTF-8 or less.
static int MAX_SHORT_NAME_UNICODE_BYTES
Maximum byte length for short non-ASCII names is slightly less due to having to reserve bytes 0xF8 and above (but we get one more as values 0 and 1 are not valid)
static int MAX_SHORT_VALUE_STRING_BYTES
Encoding has special "short" forms for value Strings that can be represented by 64 bytes of UTF-8 or less.
static int MIN_BUFFER_FOR_POSSIBLE_SHORT_STRING
And to make encoding logic tight and simple, we can always require that output buffer has this amount of space available before encoding possibly short String (3 bytes since longest UTF-8 encoded Java char is 3 bytes).
static int[] sUtf8UnitLengths
Additionally we can combine UTF-8 decoding info into similar data table.
static byte TOKEN_KEY_EMPTY_STRING
Let's use same code for empty key as for empty String value
static byte TOKEN_KEY_LONG_STRING
static byte TOKEN_LITERAL_EMPTY_STRING
static byte TOKEN_LITERAL_END_ARRAY
static byte TOKEN_LITERAL_END_OBJECT
static byte TOKEN_LITERAL_FALSE
static byte TOKEN_LITERAL_NULL
static byte TOKEN_LITERAL_START_ARRAY
static byte TOKEN_LITERAL_START_OBJECT
static byte TOKEN_LITERAL_TRUE
static int TOKEN_MISC_BINARY_7BIT
Type (for misc, other) used for "safe" binary data (encoded using only the 7 LSB of each byte, giving 8/7 expansion ratio).
static int TOKEN_MISC_BINARY_RAW
Raw binary data marker is specifically chosen as separate from other types, since it can have significant impact on framing (or rather fast scanning based on structure and framing markers).
static int TOKEN_MISC_FLOAT_32
Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating 32-bit IEEE single precision floating point number.
static int TOKEN_MISC_FLOAT_64
Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating 64-bit IEEE double precision floating point number.
static int TOKEN_MISC_FLOAT_BIG
Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating BigDecimal type.
static int TOKEN_MISC_FP
Type (for misc, other) used for regular floating-point types (float, double)
static int TOKEN_MISC_INTEGER
Type (for misc, other) used for regular integral types (byte/short/int/long)
static int TOKEN_MISC_INTEGER_32
Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating 32-bit integer (int)
static int TOKEN_MISC_INTEGER_64
Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating 64-bit integer (long)
static int TOKEN_MISC_INTEGER_BIG
Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating BigInteger type.
static int TOKEN_MISC_LONG_TEXT_ASCII
Type (for misc, other) used for variable length UTF-8 encoded text, when it is known to only contain ASCII chars.
static int TOKEN_MISC_LONG_TEXT_UNICODE
Type (for misc, other) used for variable length UTF-8 encoded text, when it is NOT known to only contain ASCII chars (which means it MAY have multi-byte characters). Note: 2 LSB are reserved for future use; must be zeroes for now.
static int TOKEN_MISC_SHARED_STRING_LONG
Type (for misc, other) used for shared String values where index does not fit in "short" reference range (which is 0 - 30).
static int TOKEN_PREFIX_KEY_ASCII
static int TOKEN_PREFIX_KEY_SHARED_LONG
static int TOKEN_PREFIX_KEY_SHARED_SHORT
static int TOKEN_PREFIX_KEY_UNICODE
static int TOKEN_PREFIX_MISC_OTHER
static int TOKEN_PREFIX_SHARED_STRING_SHORT
static int TOKEN_PREFIX_SHORT_UNICODE
static int TOKEN_PREFIX_SMALL_ASCII
static int TOKEN_PREFIX_SMALL_INT
static int TOKEN_PREFIX_TINY_ASCII
static int TOKEN_PREFIX_TINY_UNICODE
-
Constructor Summary
Constructors Constructor Description SmileConstants()
-
-
-
Field Detail
-
MAX_SHORT_VALUE_STRING_BYTES
public static final int MAX_SHORT_VALUE_STRING_BYTES
Encoding has special "short" forms for value Strings that can be represented by 64 bytes of UTF-8 or less.
- See Also:
- Constant Field Values
-
MAX_SHORT_NAME_ASCII_BYTES
public static final int MAX_SHORT_NAME_ASCII_BYTES
Encoding has special "short" forms for field names that can be represented by 64 bytes of UTF-8 or less.
- See Also:
- Constant Field Values
-
MAX_SHORT_NAME_UNICODE_BYTES
public static final int MAX_SHORT_NAME_UNICODE_BYTES
Maximum byte length for short non-ASCII names is slightly less due to having to reserve bytes 0xF8 and above (but we get one more as values 0 and 1 are not valid).
- See Also:
- Constant Field Values
-
MAX_SHARED_NAMES
public static final int MAX_SHARED_NAMES
Longest back reference we use for field names is 10 bits; there is no point in keeping many more names around.
- See Also:
- Constant Field Values
-
MAX_SHARED_STRING_VALUES
public static final int MAX_SHARED_STRING_VALUES
Longest back reference we use for short shared String values is 10 bits, so up to (1 << 10) values to keep track of.
- See Also:
- Constant Field Values
-
MAX_SHARED_STRING_LENGTH_BYTES
public static final int MAX_SHARED_STRING_LENGTH_BYTES
Also: whereas we can refer to names of any length, we will only consider text values that are "tiny" or "short" (ones encoded with a length prefix); this value therefore has to be the maximum length of Strings that can be encoded as such.
- See Also:
- Constant Field Values
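
The limits above (a 10-bit back reference, and only "short" values being eligible) can be illustrated with a minimal sketch of a shared-value table. The class name, method name and local constants are hypothetical and only mirror what this page states (1 << 10 entries, 64-byte limit); clearing the table once it is full is an assumption made for illustration, not behaviour documented here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of org.codehaus.jackson.smile.
final class SharedValueTableSketch {
    // Mirrors the documented limits: 10-bit index, 64-byte "short" values.
    static final int MAX_VALUES = 1 << 10;
    static final int MAX_LENGTH_BYTES = 64;

    private final List<String> seen = new ArrayList<>();

    /**
     * Returns the back-reference index if the value was seen before,
     * otherwise -1 (recording the value when it is short enough).
     */
    int lookupOrAdd(String value) {
        int byteLength = value.getBytes(StandardCharsets.UTF_8).length;
        if (byteLength > MAX_LENGTH_BYTES) {
            return -1; // too long to be shared, never recorded
        }
        int index = seen.indexOf(value);
        if (index >= 0) {
            return index;
        }
        if (seen.size() == MAX_VALUES) {
            seen.clear(); // assumption: start over once the table is full
        }
        seen.add(value);
        return -1;
    }

    public static void main(String[] args) {
        SharedValueTableSketch table = new SharedValueTableSketch();
        System.out.println(table.lookupOrAdd("abc")); // first sighting
        System.out.println(table.lookupOrAdd("abc")); // back reference
    }
}
```

A linear `indexOf` keeps the sketch short; a real encoder would more likely use a hash-based lookup.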
-
MIN_BUFFER_FOR_POSSIBLE_SHORT_STRING
public static final int MIN_BUFFER_FOR_POSSIBLE_SHORT_STRING
And to make encoding logic tight and simple, we can always require that the output buffer has this amount of space available before encoding a possibly short String (3 bytes per char, since the longest UTF-8 encoded Java char is 3 bytes). Two extra bytes need to be reserved as well: the first for the token indicator, and the second for the terminating null byte (in case it's not a short String after all).
- See Also:
- Constant Field Values
-
INT_MARKER_END_OF_STRING
public static final int INT_MARKER_END_OF_STRING
We need a byte marker to denote end of variable-length Strings. Although null byte is commonly used, let's try to avoid using it since it can't be embedded in Web Sockets content (similarly, 0xFF can't). There are multiple candidates for bytes UTF-8 cannot have; 0xFC is chosen to allow reasonable ordering (highest values meaning most significant framing function; 0xFF being end-of-content and so on).
- See Also:
- Constant Field Values
-
BYTE_MARKER_END_OF_STRING
public static final byte BYTE_MARKER_END_OF_STRING
- See Also:
- Constant Field Values
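
The int and byte forms of the end-of-string marker exist because Java bytes are signed: 0xFC as a byte is negative, so comparisons have to be done consistently. The sketch below shows both comparisons on a locally defined copy of the 0xFC value mentioned above; the class and helper names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; constants are local copies of the 0xFC marker described above.
final class EndOfStringSketch {
    static final int INT_MARKER_END_OF_STRING = 0xFC;
    static final byte BYTE_MARKER_END_OF_STRING = (byte) INT_MARKER_END_OF_STRING;

    /** Appends the end marker to the UTF-8 bytes of the given text. */
    static byte[] frame(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] framed = Arrays.copyOf(utf8, utf8.length + 1);
        framed[utf8.length] = BYTE_MARKER_END_OF_STRING;
        return framed;
    }

    /** Returns the index of the first end marker at or after 'from', or -1. */
    static int indexOfEndMarker(byte[] buffer, int from) {
        for (int i = from; i < buffer.length; i++) {
            // Byte-to-byte comparison; the unsigned int form would be
            // (buffer[i] & 0xFF) == INT_MARKER_END_OF_STRING
            if (buffer[i] == BYTE_MARKER_END_OF_STRING) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] framed = frame("h\u00e9llo");
        System.out.println(indexOfEndMarker(framed, 0));
    }
}
```

Because 0xFC never occurs in well-formed UTF-8, the scan does not need to decode the text to find where it ends.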
-
BYTE_MARKER_END_OF_CONTENT
public static final byte BYTE_MARKER_END_OF_CONTENT
In addition we can use a marker to allow simple framing; splitting of physical data (like file) into distinct logical sections like JSON documents. 0xFF makes sense here since it is also used as end marker for Web Sockets.
- See Also:
- Constant Field Values
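
The framing idea described above can be sketched as a simple split on the 0xFF marker. This is an illustration only: the class and method names are hypothetical, and it assumes the content holds no raw binary values, since 0xFF may legitimately appear inside those (see HEADER_BIT_HAS_RAW_BINARY).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of splitting physical data into logical sections.
final class EndOfContentFramingSketch {
    // Local copy of the 0xFF end-of-content marker described above.
    static final byte BYTE_MARKER_END_OF_CONTENT = (byte) 0xFF;

    /** Splits data at each end-of-content marker; markers are not included. */
    static List<byte[]> split(byte[] data) {
        List<byte[]> sections = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == BYTE_MARKER_END_OF_CONTENT) {
                sections.add(Arrays.copyOfRange(data, start, i));
                start = i + 1;
            }
        }
        if (start < data.length) {
            // Trailing section without a closing marker
            sections.add(Arrays.copyOfRange(data, start, data.length));
        }
        return sections;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, (byte) 0xFF, 3, (byte) 0xFF, 4};
        System.out.println(split(data).size());
    }
}
```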
-
HEADER_BYTE_1
public static final byte HEADER_BYTE_1
First byte of data header
- See Also:
- Constant Field Values
-
HEADER_BYTE_2
public static final byte HEADER_BYTE_2
Second byte of data header
- See Also:
- Constant Field Values
-
HEADER_BYTE_3
public static final byte HEADER_BYTE_3
Third byte of data header
- See Also:
- Constant Field Values
-
HEADER_VERSION_0
public static final int HEADER_VERSION_0
Current version consists of four zero bits (nibble)
- See Also:
- Constant Field Values
-
HEADER_BYTE_4
public static final byte HEADER_BYTE_4
Fourth byte of data header; contains version nibble, may have flags
- See Also:
- Constant Field Values
-
HEADER_BIT_HAS_SHARED_NAMES
public static final int HEADER_BIT_HAS_SHARED_NAMES
Indicator bit that indicates whether encoded content may have shared names (back references to recently encoded field names). If no header is available, content must be processed as if this was set to true. Only if the header exists and the bit value is 0 can the parser omit storing seen names, as it is then guaranteed that no back references exist.
- See Also:
- Constant Field Values
-
HEADER_BIT_HAS_SHARED_STRING_VALUES
public static final int HEADER_BIT_HAS_SHARED_STRING_VALUES
Indicator bit that indicates whether encoded content may have shared String values (back references to recently encoded 'short' String values, where short is defined as 64 bytes or less). If no header is available, it can be assumed to be 0 (false). If the header exists and the bit value is 1, the parser has to store up to 1024 most recently seen distinct short String values.
- See Also:
- Constant Field Values
-
HEADER_BIT_HAS_RAW_BINARY
public static final int HEADER_BIT_HAS_RAW_BINARY
Indicator bit that indicates whether encoded content may contain raw (unquoted) binary values. If no header is available, it can be assumed to be 0 (false). If the header exists and the bit value is 1, the parser cannot assume that specific byte values always have their default meaning (specifically, the content end marker 0xFF and the header signature can be contained in binary values).
Note that this bit being true does not automatically mean that such raw binary content indeed exists; just that it may exist. This is because the header is written before any binary data may be written.
- See Also:
- Constant Field Values
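
The header rules above (a version nibble plus flag bits in the fourth byte, and the defaults that apply when no header is present) can be sketched as follows. The numeric values are not listed on this page; the ones used here (":)" and a newline as signature, flag bits 0x01, 0x02 and 0x04, version in the high nibble) are assumptions based on the published Smile format, and the class and method names are hypothetical.

```java
// Hypothetical sketch; constant values are assumptions, see the lead-in.
final class SmileHeaderSketch {
    static final byte HEADER_BYTE_1 = (byte) ':';
    static final byte HEADER_BYTE_2 = (byte) ')';
    static final byte HEADER_BYTE_3 = (byte) '\n';
    static final int HEADER_VERSION_0 = 0x0;
    static final int HEADER_BIT_HAS_SHARED_NAMES = 0x01;
    static final int HEADER_BIT_HAS_SHARED_STRING_VALUES = 0x02;
    static final int HEADER_BIT_HAS_RAW_BINARY = 0x04;

    /** Builds a 4-byte header: signature plus version nibble and flags. */
    static byte[] buildHeader(boolean sharedNames, boolean sharedValues, boolean rawBinary) {
        int fourth = HEADER_VERSION_0 << 4; // version in the high nibble
        if (sharedNames) fourth |= HEADER_BIT_HAS_SHARED_NAMES;
        if (sharedValues) fourth |= HEADER_BIT_HAS_SHARED_STRING_VALUES;
        if (rawBinary) fourth |= HEADER_BIT_HAS_RAW_BINARY;
        return new byte[] {HEADER_BYTE_1, HEADER_BYTE_2, HEADER_BYTE_3, (byte) fourth};
    }

    static boolean hasHeader(byte[] data) {
        return data.length >= 4
                && data[0] == HEADER_BYTE_1
                && data[1] == HEADER_BYTE_2
                && data[2] == HEADER_BYTE_3;
    }

    /** No header: must be treated as if shared names may be present. */
    static boolean mayHaveSharedNames(byte[] data) {
        return !hasHeader(data) || (data[3] & HEADER_BIT_HAS_SHARED_NAMES) != 0;
    }

    /** No header: shared String values can be assumed absent. */
    static boolean mayHaveSharedStringValues(byte[] data) {
        return hasHeader(data) && (data[3] & HEADER_BIT_HAS_SHARED_STRING_VALUES) != 0;
    }

    /** No header: raw binary can be assumed absent. */
    static boolean mayHaveRawBinary(byte[] data) {
        return hasHeader(data) && (data[3] & HEADER_BIT_HAS_RAW_BINARY) != 0;
    }

    public static void main(String[] args) {
        byte[] header = buildHeader(true, false, true);
        System.out.println(mayHaveSharedNames(header) + " " + mayHaveRawBinary(header));
    }
}
```

Note the asymmetry in the defaults: a missing header means shared names must be assumed possible, while shared String values and raw binary can be assumed absent.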
-
TOKEN_PREFIX_SHARED_STRING_SHORT
public static final int TOKEN_PREFIX_SHARED_STRING_SHORT
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_TINY_ASCII
public static final int TOKEN_PREFIX_TINY_ASCII
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_SMALL_ASCII
public static final int TOKEN_PREFIX_SMALL_ASCII
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_TINY_UNICODE
public static final int TOKEN_PREFIX_TINY_UNICODE
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_SHORT_UNICODE
public static final int TOKEN_PREFIX_SHORT_UNICODE
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_SMALL_INT
public static final int TOKEN_PREFIX_SMALL_INT
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_MISC_OTHER
public static final int TOKEN_PREFIX_MISC_OTHER
- See Also:
- Constant Field Values
-
TOKEN_LITERAL_EMPTY_STRING
public static final byte TOKEN_LITERAL_EMPTY_STRING
- See Also:
- Constant Field Values
-
TOKEN_LITERAL_NULL
public static final byte TOKEN_LITERAL_NULL
- See Also:
- Constant Field Values
-
TOKEN_LITERAL_FALSE
public static final byte TOKEN_LITERAL_FALSE
- See Also:
- Constant Field Values
-
TOKEN_LITERAL_TRUE
public static final byte TOKEN_LITERAL_TRUE
- See Also:
- Constant Field Values
-
TOKEN_LITERAL_START_ARRAY
public static final byte TOKEN_LITERAL_START_ARRAY
- See Also:
- Constant Field Values
-
TOKEN_LITERAL_END_ARRAY
public static final byte TOKEN_LITERAL_END_ARRAY
- See Also:
- Constant Field Values
-
TOKEN_LITERAL_START_OBJECT
public static final byte TOKEN_LITERAL_START_OBJECT
- See Also:
- Constant Field Values
-
TOKEN_LITERAL_END_OBJECT
public static final byte TOKEN_LITERAL_END_OBJECT
- See Also:
- Constant Field Values
-
TOKEN_MISC_INTEGER
public static final int TOKEN_MISC_INTEGER
Type (for misc, other) used for regular integral types (byte/short/int/long)
- See Also:
- Constant Field Values
-
TOKEN_MISC_FP
public static final int TOKEN_MISC_FP
Type (for misc, other) used for regular floating-point types (float, double)
- See Also:
- Constant Field Values
-
TOKEN_MISC_LONG_TEXT_ASCII
public static final int TOKEN_MISC_LONG_TEXT_ASCII
Type (for misc, other) used for variable length UTF-8 encoded text, when it is known to only contain ASCII chars. Note: 2 LSB are reserved for future use; must be zeroes for now.
- See Also:
- Constant Field Values
-
TOKEN_MISC_LONG_TEXT_UNICODE
public static final int TOKEN_MISC_LONG_TEXT_UNICODE
Type (for misc, other) used for variable length UTF-8 encoded text, when it is NOT known to only contain ASCII chars (which means it MAY have multi-byte characters). Note: 2 LSB are reserved for future use; must be zeroes for now.
- See Also:
- Constant Field Values
-
TOKEN_MISC_BINARY_7BIT
public static final int TOKEN_MISC_BINARY_7BIT
Type (for misc, other) used for "safe" binary data (encoded using only the 7 LSB of each byte, giving 8/7 expansion ratio). This is usually done to ensure that certain bytes (like 0xFF) are never included in encoded data. Note: 2 LSB are reserved for future use; must be zeroes for now.
- See Also:
- Constant Field Values
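
To make the 8/7 expansion ratio concrete, here is a minimal sketch that repacks 8-bit bytes into 7-bit groups so that no output byte has its high bit set. It is a generic bit-packing illustration with hypothetical names; it is not claimed to match the exact bit layout or padding rules the Smile encoder uses.

```java
// Hypothetical sketch of "7 LSB only" packing; layout details are assumptions.
final class SevenBitPackingSketch {

    /** Packs input bits into bytes that each carry 7 payload bits (MSB is 0). */
    static byte[] encode7Bit(byte[] input) {
        int outputLength = (input.length * 8 + 6) / 7; // ceil(8n / 7)
        byte[] output = new byte[outputLength];
        int accumulator = 0; // holds bits not yet written
        int bitCount = 0;    // number of valid bits in accumulator
        int out = 0;
        for (byte b : input) {
            accumulator = (accumulator << 8) | (b & 0xFF);
            bitCount += 8;
            while (bitCount >= 7) {
                bitCount -= 7;
                output[out++] = (byte) ((accumulator >> bitCount) & 0x7F);
            }
            accumulator &= (1 << bitCount) - 1; // keep only unwritten bits
        }
        if (bitCount > 0) {
            // Left-align the leftover bits in a final 7-bit group
            output[out] = (byte) ((accumulator << (7 - bitCount)) & 0x7F);
        }
        return output;
    }

    public static void main(String[] args) {
        byte[] raw = new byte[7];
        java.util.Arrays.fill(raw, (byte) 0xFF);
        System.out.println(encode7Bit(raw).length); // 7 input bytes become 8
    }
}
```

Since every output byte stays at or below 0x7F, markers such as 0xFF or 0xFC cannot appear inside data encoded this way.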
-
TOKEN_MISC_SHARED_STRING_LONG
public static final int TOKEN_MISC_SHARED_STRING_LONG
Type (for misc, other) used for shared String values where index does not fit in "short" reference range (which is 0 - 30). If so, 2 LSB from here and full following byte are used to get 10-bit index. Values
- See Also:
- Constant Field Values
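
The 10-bit index described above (2 LSB of the token byte plus the full following byte) could be assembled as in the sketch below. Treating the token's 2 LSB as the high bits of the index is an assumption, as is the example token value 0xEC; the class and method names are hypothetical.

```java
// Hypothetical sketch; bit order and the 0xEC example value are assumptions.
final class SharedStringLongRefSketch {

    /** Combines 2 LSB of the token byte with the next byte into a 10-bit index. */
    static int decodeIndex(int tokenByte, int nextByte) {
        int highBits = tokenByte & 0x03; // 2 LSB of the token
        int lowBits = nextByte & 0xFF;   // full following byte
        return (highBits << 8) | lowBits;
    }

    public static void main(String[] args) {
        // First index that no longer fits the "short" range of 0 - 30
        System.out.println(decodeIndex(0xEC, 31));
    }
}
```

With 2 + 8 bits the largest index is 1023, which matches the (1 << 10) limit on tracked shared String values.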
-
TOKEN_MISC_BINARY_RAW
public static final int TOKEN_MISC_BINARY_RAW
Raw binary data marker is specifically chosen as separate from other types, since it can have significant impact on framing (or rather fast scanning based on structure and framing markers).
- See Also:
- Constant Field Values
-
TOKEN_MISC_INTEGER_32
public static final int TOKEN_MISC_INTEGER_32
Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating 32-bit integer (int)
- See Also:
- Constant Field Values
-
TOKEN_MISC_INTEGER_64
public static final int TOKEN_MISC_INTEGER_64
Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating 64-bit integer (long)
- See Also:
- Constant Field Values
-
TOKEN_MISC_INTEGER_BIG
public static final int TOKEN_MISC_INTEGER_BIG
Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating BigInteger type.
- See Also:
- Constant Field Values
-
TOKEN_MISC_FLOAT_32
public static final int TOKEN_MISC_FLOAT_32
Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating 32-bit IEEE single precision floating point number.
- See Also:
- Constant Field Values
-
TOKEN_MISC_FLOAT_64
public static final int TOKEN_MISC_FLOAT_64
Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating 64-bit IEEE double precision floating point number.
- See Also:
- Constant Field Values
-
TOKEN_MISC_FLOAT_BIG
public static final int TOKEN_MISC_FLOAT_BIG
Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating BigDecimal type.
- See Also:
- Constant Field Values
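
The numeric subtypes above occupy the 2 LSB of a token whose upper bits select the integer or floating-point type. The sketch below shows how such a token could be split and classified. The numeric values (0x24 and 0x28 for the types, 0 to 2 for the subtypes) are not given on this page and are assumptions based on the published Smile format; the class and method names are hypothetical.

```java
// Hypothetical sketch; constant values are assumptions, see the lead-in.
final class NumericSubtypeSketch {
    static final int TOKEN_MISC_INTEGER = 0x24;
    static final int TOKEN_MISC_FP = 0x28;
    static final int TOKEN_MISC_INTEGER_32 = 0x00;
    static final int TOKEN_MISC_INTEGER_64 = 0x01;
    static final int TOKEN_MISC_INTEGER_BIG = 0x02;
    static final int TOKEN_MISC_FLOAT_32 = 0x00;
    static final int TOKEN_MISC_FLOAT_64 = 0x01;
    static final int TOKEN_MISC_FLOAT_BIG = 0x02;

    /** Names the Java type implied by a numeric token, or "unknown". */
    static String describe(int token) {
        int type = token & 0xFC;    // everything except the 2 LSB
        int subtype = token & 0x03; // the 2 LSB
        if (type == TOKEN_MISC_INTEGER) {
            switch (subtype) {
                case TOKEN_MISC_INTEGER_32: return "int";
                case TOKEN_MISC_INTEGER_64: return "long";
                case TOKEN_MISC_INTEGER_BIG: return "BigInteger";
                default: return "unknown";
            }
        }
        if (type == TOKEN_MISC_FP) {
            switch (subtype) {
                case TOKEN_MISC_FLOAT_32: return "float";
                case TOKEN_MISC_FLOAT_64: return "double";
                case TOKEN_MISC_FLOAT_BIG: return "BigDecimal";
                default: return "unknown";
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(describe(TOKEN_MISC_INTEGER | TOKEN_MISC_INTEGER_64));
        System.out.println(describe(TOKEN_MISC_FP | TOKEN_MISC_FLOAT_BIG));
    }
}
```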
-
TOKEN_KEY_EMPTY_STRING
public static final byte TOKEN_KEY_EMPTY_STRING
Let's use same code for empty key as for empty String value
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_KEY_SHARED_LONG
public static final int TOKEN_PREFIX_KEY_SHARED_LONG
- See Also:
- Constant Field Values
-
TOKEN_KEY_LONG_STRING
public static final byte TOKEN_KEY_LONG_STRING
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_KEY_SHARED_SHORT
public static final int TOKEN_PREFIX_KEY_SHARED_SHORT
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_KEY_ASCII
public static final int TOKEN_PREFIX_KEY_ASCII
- See Also:
- Constant Field Values
-
TOKEN_PREFIX_KEY_UNICODE
public static final int TOKEN_PREFIX_KEY_UNICODE
- See Also:
- Constant Field Values
-
sUtf8UnitLengths
public static final int[] sUtf8UnitLengths
Additionally we can combine UTF-8 decoding info into similar data table. Values indicate "byte length - 1"; meaning -1 is used for invalid bytes, 0 for single-byte codes, 1 for 2-byte codes and 2 for 3-byte codes.
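
A table with the "byte length - 1" convention described above could be built as in the following sketch. It classifies lead bytes purely by bit pattern and does not reject overlong forms; assigning 3 to 4-byte lead bytes is an assumption that extends the stated rule, since this page only lists values up to 3-byte codes. The class and method names are hypothetical.

```java
// Hypothetical sketch of a lookup table holding "byte length - 1" per lead byte.
final class Utf8UnitLengthSketch {

    /** Builds a 256-entry table indexed by unsigned byte value. */
    static int[] buildTable() {
        int[] table = new int[256];
        for (int c = 0; c < 256; c++) {
            int code;
            if (c < 0x80) {
                code = 0;  // single-byte (ASCII)
            } else if ((c & 0xE0) == 0xC0) {
                code = 1;  // lead byte of a 2-byte sequence
            } else if ((c & 0xF0) == 0xE0) {
                code = 2;  // lead byte of a 3-byte sequence
            } else if ((c & 0xF8) == 0xF0) {
                code = 3;  // assumption: 4-byte lead bytes follow the same rule
            } else {
                code = -1; // continuation bytes and 0xF8 and above
            }
            table[c] = code;
        }
        return table;
    }

    public static void main(String[] args) {
        int[] table = buildTable();
        byte lead = (byte) 0xE2; // lead byte of a 3-byte sequence
        System.out.println(table[lead & 0xFF] + 1); // total bytes in the sequence
    }
}
```

Indexing with `b & 0xFF` matters because Java bytes are signed; a raw negative byte would otherwise be an invalid array index.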
-
-