Class UnitFormat
- All Implemented Interfaces:
Serializable, Cloneable, javax.measure.format.UnitFormat, Localized
This class combines the API from java.text and the API from javax.measure.format.
In addition to the symbols of the Système international (SI), this class can also handle
some symbols found in Well Known Text (WKT) definitions or in XML files.
Parsing authority codes
If a character sequence given to the parse(CharSequence) method is of the form "EPSG:####", "urn:ogc:def:uom:EPSG::####" or "http://www.opengis.net/def/uom/EPSG/0/####" (ignoring case and whitespace around path separators), then "####" is parsed as an integer and forwarded to the Units.valueOfEPSG(int) method.
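The prefix matching described above can be illustrated with a small standalone sketch. It is not the Apache SIS implementation: the class name EpsgCodeSketch and the method extractCode are hypothetical, and the whitespace rule is approximated by removing spaces next to ':' and '/' only. In the real class, the extracted integer would then be forwarded to Units.valueOfEPSG(int).

```java
import java.util.OptionalInt;

/** Hypothetical helper (not part of Apache SIS) that extracts the numeric code from an authority code. */
final class EpsgCodeSketch {
    /** The three forms listed in the text, without the trailing "####" part. */
    private static final String[] PREFIXES = {
        "EPSG:",
        "urn:ogc:def:uom:EPSG::",
        "http://www.opengis.net/def/uom/EPSG/0/"
    };

    private EpsgCodeSketch() {
    }

    /** Returns the integer code, or an empty value if the text is not one of the three forms. */
    static OptionalInt extractCode(String text) {
        // Assumption: "whitespace around path separators" means spaces next to ':' or '/'.
        String s = text.trim().replaceAll("\\s*([:/])\\s*", "$1");
        for (String prefix : PREFIXES) {
            // Case-insensitive comparison of the beginning of the text with the prefix.
            if (s.regionMatches(true, 0, prefix, 0, prefix.length())) {
                String code = s.substring(prefix.length());
                if (!code.isEmpty() && code.chars().allMatch(c -> c >= '0' && c <= '9')) {
                    try {
                        return OptionalInt.of(Integer.parseInt(code));
                    } catch (NumberFormatException e) {
                        return OptionalInt.empty();     // Too many digits for an int.
                    }
                }
            }
        }
        return OptionalInt.empty();
    }
}
```

An empty result means that the text should be handled by the ordinary symbol parser instead.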
Note on netCDF unit symbols
In netCDF files, values of the "unit" attribute are concatenations of an angular unit with an axis direction, as in "degrees_east" or "degrees_north". This class ignores those suffixes and unconditionally returns Units.DEGREE for all axis directions.
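As an illustration of the suffix handling described above, the following standalone sketch drops a direction suffix from a netCDF-style symbol. The class and method names are hypothetical, and the list of suffixes is an assumption: the text only cites "degrees_east" and "degrees_north", and the actual set recognized by Apache SIS may differ.

```java
/** Hypothetical helper (not part of Apache SIS) showing how a direction suffix could be ignored. */
final class NetcdfDegreesSketch {
    /** Assumed suffixes. Only "_east" and "_north" are cited in the documentation. */
    private static final String[] DIRECTIONS = {"_east", "_north", "_west", "_south"};

    private NetcdfDegreesSketch() {
    }

    /** Returns the symbol without its axis direction, or the symbol unchanged if there is none. */
    static String stripDirection(String symbol) {
        if (symbol.regionMatches(true, 0, "degree", 0, 6)) {
            for (String direction : DIRECTIONS) {
                int start = symbol.length() - direction.length();
                // regionMatches returns false if 'start' is negative.
                if (symbol.regionMatches(true, start, direction, 0, direction.length())) {
                    return symbol.substring(0, start);
                }
            }
        }
        return symbol;
    }
}
```

With this sketch, both "degrees_east" and "degrees_north" reduce to "degrees", which is consistent with the same unit being returned for all axis directions.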
Multi-threading
UnitFormat is generally not thread-safe. If units need to be parsed or formatted in different threads, each thread should have its own UnitFormat instance.
- Since:
- 0.8
- Version:
- 1.3
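The one-instance-per-thread recommendation in the multi-threading note can be sketched with a ThreadLocal. The holder class PerThreadFormat below is hypothetical and works with any java.text.Format subclass. With Apache SIS on the classpath, the factory would presumably be something like () -> new UnitFormat(Locale.US), using the constructor documented on this page. That call is not compiled here.

```java
import java.text.Format;
import java.util.function.Supplier;

/** Hypothetical holder giving each thread its own instance of a format that is not thread-safe. */
final class PerThreadFormat<F extends Format> {
    private final ThreadLocal<F> instances;

    /** The factory is invoked once per thread, the first time that thread calls get(). */
    PerThreadFormat(Supplier<? extends F> factory) {
        instances = ThreadLocal.withInitial(factory);
    }

    /** Returns the instance reserved for the calling thread. */
    F get() {
        return instances.get();
    }
}
```

A thread-local holder avoids synchronization, at the cost of one format instance per thread.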
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
private static final class | UnitFormat.Operation | Represents an operation to be applied between two terms parsed by parseTerm(CharSequence, int, int, Operation).
private static final class | | Parse position when the text to be parsed is expected to contain nothing else than a unit symbol.
static enum | UnitFormat.Style | Identifies whether unit formatting uses ASCII symbols, Unicode symbols or full localized names.
Nested classes/interfaces inherited from class java.text.Format
Format.Field
-
Field Summary
Fields
Modifier and Type | Field | Description
private static final String | DEGREES | The unit name for degrees (not necessarily angular), to be handled in a special way.
(package private) static final UnitFormat | INSTANCE | The default instance used by Units.valueOf(String) for parsing units of measurement.
 | labelToUnit | Units associated to a given label (in addition to the system-wide UnitRegistry).
private Locale | locale | The locale specified at construction time or modified by setLocale(Locale).
 | nameToUnit | Mapping from long localized and unlocalized names to unit instances.
private static final boolean | PARSE_AUTHORITY_CODES | Whether the parsing of authority codes such as "EPSG:9001" is allowed.
private static final long | serialVersionUID | For cross-version compatibility.
private static final WeakValueHashMap<Locale, Map<String, javax.measure.Unit<?>>> | SHARED | Cached values of nameToUnit, to avoid loading the same information many times and to save memory if the user creates many UnitFormat instances.
private UnitFormat.Style | style | Whether this UnitFormat should format long names like "metre" or use unit symbols.
private ResourceBundle | symbolToName | The mapping from unit symbols to long localized names.
 | unitToLabel | Symbols or names to use for formatting units in replacement to the default unit symbols or names.
private static final String | UNITY | The unit name for dimensionless unit.
-
Constructor Summary
Constructors
Modifier | Constructor | Description
private | UnitFormat() | Creates the unique INSTANCE.
 | UnitFormat(Locale locale) | Creates a new format for the given locale.
-
Method Summary
Modifier and Type | Method | Description
 | clone() | Returns a clone of this unit format.
private static Object | clone | Clones the given map, which can be either a HashMap or the instance returned by Collections.emptyMap().
private static void | copy(Locale locale, ResourceBundle symbolToName, Map<String, javax.measure.Unit<?>> nameToUnit) | Copies all entries from the given "symbols to names" mapping to the given "names to units" mapping.
private static int | exponentOperator(CharSequence symbols, int i, int length) | Returns 0 or 1 if the '*' character at the given index stands for exponentiation instead of multiplication, or a negative value if the character stands for multiplication.
private static void | finish(ParsePosition pos) | Reports that the parsing is finished and no more content should be parsed.
 | format(Object unit, StringBuffer toAppendTo, FieldPosition pos) | Formats the specified unit in the given buffer.
 | format(javax.measure.Unit<?> unit) | Formats the given unit.
 | format(javax.measure.Unit<?> unit, Appendable toAppendTo) | Formats the specified unit.
private static void | formatComponent(Map.Entry<?, ? extends Number> entry, boolean inverse, UnitFormat.Style style, Appendable toAppendTo) | Formats a single unit or dimension raised to the given power.
(package private) static void | formatComponents(Map<?, ? extends Number> components, UnitFormat.Style style, Appendable toAppendTo) | Creates a new symbol (e.g. "m/s") from the given symbols and factors.
private static void | formatSymbol(Object base, UnitFormat.Style style, Appendable toAppendTo) | Appends the symbol for the given base unit or base dimension, or "?" if no symbol was found.
private javax.measure.Unit<?> | fromName | Returns the unit instance for the given long (un)localized name.
(package private) static ResourceBundle | getBundle | Loads the UnitNames resource bundle for the given locale.
 | getLocale | Returns the locale used by this UnitFormat.
 | getStyle() | Returns whether unit formatting uses ASCII symbols, Unicode symbols or full localized names.
private static boolean | hasDigit(CharSequence symbol, int lower, int upper) | Returns true if the given character sequence contains at least one digit.
private static boolean | isDecimalSeparator(CharSequence symbols, int i, int length) | Returns true if the '.' character at the given index is surrounded by digits or is at the beginning or the end of the character sequence.
private static boolean | isDigit(int c) | Returns true if the given character is a digit in the sense of the UnitFormat parser.
private static boolean | isDivisor(int c) | Returns true if the given character is the sign of a division operator.
boolean | isLocaleSensitive() | Returns whether this UnitFormat depends on the Locale given at construction time for performing its tasks.
private static boolean | isSign(int c) | Returns true if the given character is the sign of a number according to the UnitFormat parser.
void | label | Attaches a label to the specified unit.
javax.measure.Unit<?> | parse(CharSequence symbols) | Parses the given text as an instance of Unit.
javax.measure.Unit<?> | parse(CharSequence symbols, ParsePosition position) | Parses a portion of the given text as an instance of Unit.
private static double | parseMultiplicationFactor | Parses a multiplication factor, which may be a single number or a base raised to an exponent.
 | parseObject(String source) | Parses text from a string to produce a unit.
 | parseObject(String source, ParsePosition pos) | Parses text from a string to produce a unit, or returns null if the parsing failed.
private javax.measure.Unit<?> | parseTerm(CharSequence symbols, int lower, int upper, UnitFormat.Operation operation) | Parses a single unit symbol with its exponent.
void | setLocale | Sets the locale that this UnitFormat will use for long names.
void | setStyle(UnitFormat.Style style) | Sets whether unit formatting should use ASCII symbols, Unicode symbols or full localized names.
private ResourceBundle | symbolToName() | Returns the mapping from unit symbols to long localized names.
Methods inherited from class java.text.Format
format, formatToCharacterIterator
-
Field Details
-
serialVersionUID
private static final long serialVersionUID
For cross-version compatibility.
-
PARSE_AUTHORITY_CODES
private static final boolean PARSE_AUTHORITY_CODES
Whether the parsing of authority codes such as "EPSG:9001" is allowed.
-
DEGREES
The unit name for degrees (not necessarily angular), to be handled in a special way. Must contain only ASCII lower case letters ([a … z]).
-
UNITY
The unit name for dimensionless unit.
-
INSTANCE
The default instance used by Units.valueOf(String) for parsing units of measurement. While UnitFormat is generally not thread-safe, this particular instance is safe if we never invoke any setter method and we do not format with UnitFormat.Style.NAME.
-
locale
The locale specified at construction time or modified by setLocale(Locale).
-
style
Whether this UnitFormat should format long names like "metre" or use unit symbols.
-
unitToLabel
Symbols or names to use for formatting units in replacement to the default unit symbols or names. The Unit instances are the ones specified by the user in calls to label(Unit, String).
-
labelToUnit
Units associated to a given label (in addition to the system-wide UnitRegistry). This map is the converse of unitToLabel. The Unit instances may differ from the ones specified by the user since AbstractUnit.symbol may have been set to the label specified by the user. The labels may contain some characters normally not allowed in unit symbols, like white spaces.
-
symbolToName
The mapping from unit symbols to long localized names. Those resources are locale-dependent and loaded when first needed.
-
nameToUnit
Mapping from long localized and unlocalized names to unit instances. This map is used only for parsing and created when first needed.
-
SHARED
Cached values of nameToUnit, to avoid loading the same information many times and to save memory if the user creates many UnitFormat instances. Note that we do not cache symbolToName because ResourceBundle already provides its own caching mechanism.
-
-
Constructor Details
-
UnitFormat
private UnitFormat()
Creates the unique INSTANCE.
-
UnitFormat
Creates a new format for the given locale.
- Parameters:
locale - the locale to use for parsing and formatting units.
-
-
Method Details
-
getLocale
Returns the locale used by this UnitFormat.
-
setLocale
Sets the locale that this UnitFormat will use for long names. For example, a call to setLocale(Locale.US) instructs this formatter to use the “meter” spelling instead of “metre”.
- Parameters:
locale - the new locale for this UnitFormat.
-
isLocaleSensitive
public boolean isLocaleSensitive()
Returns whether this UnitFormat depends on the Locale given at construction time for performing its tasks. This method returns true if formatting long names (e.g. “metre” or “meter”) and false if formatting only the unit symbol (e.g. “m”).
- Specified by:
isLocaleSensitive in interface javax.measure.format.UnitFormat
- Returns:
true if formatting depends on the locale.
-
getStyle
Returns whether unit formatting uses ASCII symbols, Unicode symbols or full localized names.
- Returns:
the style of units formatted by this UnitFormat instance.
-
setStyle
Sets whether unit formatting should use ASCII symbols, Unicode symbols or full localized names.
- Parameters:
style - the desired style of units.
-
label
Attaches a label to the specified unit. A label can be a substitute to either the unit symbol or the unit name, depending on the format style. If the specified label is already associated to another unit, then the previous association is discarded.
Restriction on character set
Current implementation accepts only letters, subscripts, spaces (including non-breaking spaces but not CR/LF characters), the degree sign (°) and a few other characters like underscore. The set of legal characters may be expanded in future Apache SIS versions, but the following restrictions are likely to remain:
- The following characters are reserved since they have special meaning in UCUM format, in URI or in Apache SIS parser: " # ( ) * + - . / : = ? [ ] { } ^ ⋅ ∕
- The symbol cannot begin or end with digits, since such digits would be confused with unit power.
- Specified by:
label in interface javax.measure.format.UnitFormat
- Parameters:
unit - the unit being labeled.
label - the new label for the given unit.
- Throws:
IllegalArgumentException - if the given label is not a valid unit name.
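The two restrictions that are "likely to remain" can be checked with a few lines of code. The sketch below is a hypothetical pre-check, not the validation performed by Apache SIS: it tests only the reserved characters and the leading or trailing digits, and does not verify that the remaining characters belong to the currently accepted set. Rejecting an empty label is an assumption.

```java
/** Hypothetical pre-check (not part of Apache SIS) for the two documented label restrictions. */
final class LabelRulesSketch {
    /** Reserved characters listed in the text. The last two are U+22C5 (dot operator) and U+2215 (division slash). */
    private static final String RESERVED = "\"#()*+-./:=?[]{}^\u22C5\u2215";

    private LabelRulesSketch() {
    }

    /** Returns true if the label contains no reserved character and neither begins nor ends with a digit. */
    static boolean respectsRestrictions(String label) {
        if (label.isEmpty()) {
            return false;                       // Assumption: an empty label is rejected.
        }
        // Character.isDigit covers decimal digits only, not superscripts or subscripts.
        if (Character.isDigit(label.codePointAt(0)) ||
            Character.isDigit(label.codePointBefore(label.length())))
        {
            return false;
        }
        for (int i = 0; i < label.length(); i++) {
            if (RESERVED.indexOf(label.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }
}
```

A label passing this check may still be rejected by label(Unit, String) with an IllegalArgumentException, since the actual rules are stricter.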
-
getBundle
Loads the UnitNames resource bundle for the given locale.
-
symbolToName
Returns the mapping from unit symbols to long localized names. This mapping is loaded when first needed and memorized as long as the locale does not change.
fromName
Returns the unit instance for the given long (un)localized name. This method is somewhat the converse of symbolToName(), but also recognizes international and American spellings of unit names in addition to localized names. The intent is to recognize "meter" as well as "metre".
While we said that UnitFormat is not thread-safe, we make an exception for this method to allow the singleton INSTANCE to parse symbols in a multi-threaded environment.
- Parameters:
uom - the unit symbol, without leading or trailing spaces.
- Returns:
the unit for the given name, or null if unknown.
-
copy
private static void copy(Locale locale, ResourceBundle symbolToName, Map<String, javax.measure.Unit<?>> nameToUnit)
Copies all entries from the given "symbols to names" mapping to the given "names to units" mapping. During this copy, keys are converted from symbols to names and values are converted from symbols to Unit instances. We use Unit values instead of their symbols because all Unit instances are created at Units class initialization anyway (so we do not create new instances here), and it avoids retaining references to the String instances loaded by the resource bundle.
format
Formats the specified unit. This method performs the first of the following actions that can be done.
- If a label has been specified for the given unit, then that label is appended unconditionally.
- Otherwise if the formatting style is UnitFormat.Style.NAME and the Unit.getName() method returns a non-null value, then that value is appended. Unit instances implemented by Apache SIS are handled in a special way for localizing the name according to the locale specified to this format.
- Otherwise if the Unit.getSymbol() method returns a non-null value, then that value is appended.
- Otherwise a default symbol is created from the entries returned by Unit.getBaseUnits().
- Specified by:
format in interface javax.measure.format.UnitFormat
- Parameters:
unit - the unit to format.
toAppendTo - where to format the unit.
- Returns:
the given toAppendTo argument, for method calls chaining.
- Throws:
IOException - if an error occurred while writing to the destination.
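The "first action that can be done" rule above is a plain fallback chain. The sketch below restates it with strings standing in for the values that the real method would obtain from the label map, Unit.getName(), Unit.getSymbol() and Unit.getBaseUnits(). The class, method and parameter names are hypothetical, and the localization of names for Apache SIS units is not modeled.

```java
import java.util.function.Supplier;

/** Hypothetical restatement (not Apache SIS code) of the order in which the text to format is chosen. */
final class FormatChainSketch {
    private FormatChainSketch() {
    }

    /**
     * @param label          the user-specified label, or null if none.
     * @param nameStyle      true if the formatting style is the long name style.
     * @param name           the unit name, or null if none.
     * @param symbol         the unit symbol, or null if none.
     * @param fromBaseUnits  builds a default symbol from base units. Invoked only as a last resort.
     */
    static String choose(String label, boolean nameStyle, String name, String symbol,
                         Supplier<String> fromBaseUnits)
    {
        if (label != null) {
            return label;                   // 1. A label always wins.
        }
        if (nameStyle && name != null) {
            return name;                    // 2. Long name, only in the name style.
        }
        if (symbol != null) {
            return symbol;                  // 3. Unit symbol.
        }
        return fromBaseUnits.get();         // 4. Symbol created from the base units.
    }
}
```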
-
formatComponents
static void formatComponents(Map<?, ? extends Number> components, UnitFormat.Style style, Appendable toAppendTo) throws IOException
Creates a new symbol (e.g. "m/s") from the given symbols and factors. Keys in the given map can be either Unit or Dimension instances. Values in the given map are either Integer or Fraction instances.
- Parameters:
components - the components of the symbol to format.
style - whether to allow Unicode characters.
toAppendTo - where to write the symbol.
- Throws:
IOException
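To make the idea of building a symbol such as "m/s" from components and powers concrete, here is a simplified standalone sketch. It is not the Apache SIS algorithm: keys are plain strings instead of Unit or Dimension instances, powers are integers only (no Fraction), and the output conventions ('.' for products, "1" for an empty numerator, parentheses around a multi-term denominator) are assumptions chosen for this example.

```java
import java.util.Map;

/** Hypothetical ASCII-only sketch (not Apache SIS code) of assembling a symbol from base symbols and powers. */
final class ComponentsSketch {
    private ComponentsSketch() {
    }

    /** Formats the given (symbol, power) pairs, in iteration order of the map. */
    static String format(Map<String, Integer> components) {
        StringBuilder numerator = new StringBuilder();
        StringBuilder denominator = new StringBuilder();
        int denominatorCount = 0;
        for (Map.Entry<String, Integer> entry : components.entrySet()) {
            int power = entry.getValue();
            if (power == 0) {
                continue;                               // A zero power contributes nothing.
            }
            StringBuilder target = (power > 0) ? numerator : denominator;
            if (power < 0) {
                denominatorCount++;
            }
            if (target.length() != 0) {
                target.append('.');                     // Assumed ASCII product operator.
            }
            target.append(entry.getKey());
            int magnitude = Math.abs(power);
            if (magnitude != 1) {
                target.append(magnitude);               // Exponent written as digits, as in "m2".
            }
        }
        if (numerator.length() == 0) {
            numerator.append('1');                      // Assumption: "1/s" when there is no numerator.
        }
        if (denominatorCount == 0) {
            return numerator.toString();
        }
        if (denominatorCount > 1) {
            denominator.insert(0, '(').append(')');     // Assumption: group a multi-term denominator.
        }
        return numerator.append('/').append(denominator).toString();
    }
}
```

For example, a LinkedHashMap containing m=1 and s=-1 in that order produces "m/s".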
-
formatComponent
private static void formatComponent(Map.Entry<?, ? extends Number> entry, boolean inverse, UnitFormat.Style style, Appendable toAppendTo) throws IOException
Formats a single unit or dimension raised to the given power.
- Parameters:
entry - the base unit or base dimension to format, together with its power.
inverse - true for inverting the power sign.
style - whether to allow Unicode characters.
- Throws:
IOException
-
formatSymbol
private static void formatSymbol(Object base, UnitFormat.Style style, Appendable toAppendTo) throws IOException
Appends the symbol for the given base unit or base dimension, or "?" if no symbol was found. If the given object is a unit, then it should be an instance of SystemUnit.
- Parameters:
base - the base unit or base dimension to format.
style - whether to allow Unicode characters.
toAppendTo - where to append the symbol.
- Throws:
IOException
-
format
Formats the specified unit in the given buffer. This method delegates to format(Unit, Appendable).
-
format
Formats the given unit. This method delegates to format(Unit, Appendable).
- Specified by:
format in interface javax.measure.format.UnitFormat
- Parameters:
unit - the unit to format.
- Returns:
the formatted unit.
-
exponentOperator
Returns 0 or 1 if the '*' character at the given index stands for exponentiation instead of multiplication, or a negative value if the character stands for multiplication. This check is used for heuristic rules at parsing time. Current implementation applies the following rules:
- The operation is presumed an exponentiation if the '*' symbol is doubled, as in "m**s-1".
- The operation is presumed an exponentiation if it is surrounded by digits or a sign on its right side. Example: "10*-6", which means 1E-6 in UCUM syntax.
- All other cases are currently presumed multiplication. Example: "m*s".
- Returns:
-1 for parsing as a multiplication, or a non-negative value for exponentiation. If non-negative, this is the number of characters in the exponent symbol minus 1.
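The three rules above can be written down directly. The following sketch is a reconstruction from the documented behavior, not the Apache SIS source: it assumes that i is the index of the first '*' character, that "digit" means an ASCII digit, and that '+' and '-' are the only signs. The actual parser may accept other digit or sign characters.

```java
/** Hypothetical reconstruction (not Apache SIS code) of the documented '*' heuristic. */
final class ExponentOperatorSketch {
    private ExponentOperatorSketch() {
    }

    /**
     * @param symbols  the text being parsed.
     * @param i        index of a '*' character in the text.
     * @param length   index after the last character to consider.
     * @return 1 for "**", 0 for a single '*' used as exponentiation, -1 for multiplication.
     */
    static int exponentOperator(CharSequence symbols, int i, int length) {
        if (i >= 0 && i + 1 < length) {
            char next = symbols.charAt(i + 1);
            if (next == '*') {
                return 1;               // "**": one extra character to skip after the first '*'.
            }
            boolean digitBefore = (i > 0) && isAsciiDigit(symbols.charAt(i - 1));
            if (digitBefore && (isAsciiDigit(next) || next == '+' || next == '-')) {
                return 0;               // As in "10*-6": no extra character to skip.
            }
        }
        return -1;                      // All other cases: multiplication, as in "m*s".
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```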
-
isDecimalSeparator
Returns true if the '.' character at the given index is surrounded by digits or is at the beginning or the end of the character sequence. This check is used for heuristic rules.
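One possible reading of this rule is that each side of the '.' must be either a digit or a boundary of the text. The sketch below implements that reading with ASCII digits only. Both the interpretation and the class name are assumptions, and the Apache SIS implementation may combine the conditions differently.

```java
/** Hypothetical reading (not Apache SIS code) of the rule distinguishing "1.5" from "m.s". */
final class DecimalSeparatorSketch {
    private DecimalSeparatorSketch() {
    }

    /**
     * @param symbols  the text being parsed.
     * @param i        index of a '.' character in the text.
     * @param length   index after the last character to consider.
     */
    static boolean isDecimalSeparator(CharSequence symbols, int i, int length) {
        boolean leftOk  = (i == 0)          || isAsciiDigit(symbols.charAt(i - 1));
        boolean rightOk = (i + 1 >= length) || isAsciiDigit(symbols.charAt(i + 1));
        return leftOk && rightOk;
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```

Under this reading, the '.' in "1.5" is a decimal separator while the '.' in "m.s" remains a product operator.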
-
isDigit
private static boolean isDigit(int c)
Returns true if the given character is a digit in the sense of the UnitFormat parser. Note that "digit" is taken here in a much more restrictive way than Character.isDigit(int).
A return value of true guarantees that the given character is in the Basic Multilingual Plane (BMP). Consequently, the c argument value does not need to be the result of String.codePointAt(int); the result of String.charAt(int) is sufficient. We nevertheless use the int type to avoid the need to cast if the caller uses code points for another reason.
-
isSign
private static boolean isSign(int c)
Returns true if the given character is the sign of a number according to the UnitFormat parser. A return value of true guarantees that the given character is in the Basic Multilingual Plane (BMP). Consequently, the c argument value does not need to be the result of String.codePointAt(int).
-
isDivisor
private static boolean isDivisor(int c)
Returns true if the given character is the sign of a division operator. A return value of true guarantees that the given character is in the Basic Multilingual Plane (BMP). Consequently, the c argument value does not need to be the result of String.codePointAt(int).
hasDigit
Returns true if the given character sequence contains at least one digit. This is a hack for recognizing units like "100 feet" (in principle not legal, but seen in practice). This verification has some value if digits are not allowed as unit label or symbol.
finish
Reports that the parsing is finished and no more content should be parsed. This method is invoked when the last parsed term is possibly one or more words instead of unit symbols. The intent is to avoid trying to parse "degree minute" as "degree × minute". By contrast, this method is not invoked if the string to parse is "m kg**-2" because it can be interpreted as "m × kg**-2". -
parse
public javax.measure.Unit<?> parse(CharSequence symbols) throws javax.measure.format.ParserException
Parses the given text as an instance of Unit. If the parse completes without reading the entire length of the text, an exception is thrown.
The parsing is lenient: symbols can be products or quotients of units like “m∕s”, words like “meters per second”, or authority codes like "urn:ogc:def:uom:EPSG::1026". The product operator can be either the '.' (ASCII) or the '⋅' (Unicode) character. The exponent after a symbol can be decimal digits as in “m2” or a superscript as in “m²”.
This method differs from parse(CharSequence, ParsePosition) in the treatment of white spaces: that method with a ParsePosition argument stops parsing at the first white space, while this parse(…) method treats white spaces as multiplications. The reason for this difference is that white space is normally not a valid multiplication symbol; it could be followed by a text which is not part of the unit symbol. But in the case of this parse(CharSequence) method, the whole CharSequence shall be a unit symbol. In such case, white spaces are less ambiguous.
The default implementation delegates to parse(symbols, new ParsePosition(0)) and verifies that all non-white characters have been parsed. Units separated by spaces are multiplied; for example "kg m**-2" is parsed as kg/m².
- Specified by:
parse in interface javax.measure.format.UnitFormat
- Parameters:
symbols - the unit symbols or URI to parse.
- Returns:
the unit parsed from the specified symbols.
- Throws:
javax.measure.format.ParserException - if a problem occurred while parsing the given symbols.
-
parse
public javax.measure.Unit<?> parse(CharSequence symbols, ParsePosition position) throws javax.measure.format.ParserException
Parses a portion of the given text as an instance of Unit. Parsing begins at the index given by ParsePosition.getIndex(). After parsing, that index is updated to the first unparsed character.
The parsing is lenient: symbols can be products or quotients of units like “m∕s”, words like “meters per second”, or authority codes like "urn:ogc:def:uom:EPSG::1026". The product operator can be either the '.' (ASCII) or the '⋅' (Unicode) character. The exponent after a symbol can be decimal digits as in “m2” or a superscript as in “m²”.
Note that contrary to parseObject(String, ParsePosition), this method never returns null. If an error occurs at parsing time, an unchecked ParserException is thrown.
- Parameters:
symbols - the unit symbols to parse.
position - on input, index of the first character to parse. On output, index after the last parsed character.
- Returns:
the unit parsed from the specified symbols.
- Throws:
javax.measure.format.ParserException - if a problem occurred while parsing the given symbols.
-
parseTerm
private javax.measure.Unit<?> parseTerm(CharSequence symbols, int lower, int upper, UnitFormat.Operation operation) throws javax.measure.format.ParserException
Parses a single unit symbol with its exponent. The given symbol shall not contain a multiplication or division operator except in the exponent. Parsing of a fractional exponent as in "m2/3" is supported; other operations in the exponent will cause an exception to be thrown.
- Parameters:
symbols - the complete string specified by the user.
lower - index where to begin parsing in the symbols string.
upper - index after the last character to parse in the symbols string.
operation - the operation to be applied (e.g. the term to be parsed is a multiplier or divisor of another unit).
- Returns:
the parsed unit symbol (never null).
- Throws:
javax.measure.format.ParserException - if a problem occurred while parsing the given symbols.
-
parseMultiplicationFactor
Parses a multiplication factor, which may be a single number or a base raised to an exponent. For example, all the following strings are equivalent: "1000", "1000.0", "1E3", "10*3", "10^3", "10³".
- Throws:
NumberFormatException
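The equivalence of the six notations listed above can be checked with a small standalone parser. This is a sketch under stated assumptions, not the Apache SIS method: the class name is hypothetical, exponents are restricted to integers, the "**" operator is not handled, and the only superscripts recognized are the digits plus the superscript plus and minus signs.

```java
/** Hypothetical sketch (not Apache SIS code) accepting "1000", "1E3", "10*3", "10^3" and "10³". */
final class MultiplicationFactorSketch {
    private MultiplicationFactorSketch() {
    }

    /** Parses the given term, throwing NumberFormatException if it cannot be understood. */
    static double parse(String term) {
        int operator = Math.max(term.indexOf('*'), term.indexOf('^'));
        if (operator >= 0) {
            // Base and exponent separated by '*' or '^', as in "10*3" or "10^3".
            double base = Double.parseDouble(term.substring(0, operator));
            return pow(base, Integer.parseInt(term.substring(operator + 1)));
        }
        int split = term.length();
        while (split > 0 && toAscii(term.charAt(split - 1)) != 0) {
            split--;                                    // Skip trailing superscript characters.
        }
        if (split < term.length()) {
            StringBuilder exponent = new StringBuilder();
            for (int i = split; i < term.length(); i++) {
                exponent.append(toAscii(term.charAt(i)));
            }
            double base = Double.parseDouble(term.substring(0, split));
            return pow(base, Integer.parseInt(exponent.toString()));
        }
        return Double.parseDouble(term);                // Plain number such as "1000" or "1E3".
    }

    /** Integer power by repeated multiplication, so that 10 raised to 3 is exactly 1000. */
    private static double pow(double base, int exponent) {
        double result = 1;
        for (int i = Math.abs(exponent); i > 0; i--) {
            result *= base;
        }
        return (exponent < 0) ? 1 / result : result;
    }

    /** Returns the ASCII equivalent of a superscript digit or sign, or 0 if the character is not one. */
    private static char toAscii(char c) {
        switch (c) {
            case '\u2070': return '0';
            case '\u00B9': return '1';
            case '\u00B2': return '2';
            case '\u00B3': return '3';
            case '\u207A': return '+';
            case '\u207B': return '-';
            default: return (c >= '\u2074' && c <= '\u2079') ? (char) ('4' + (c - '\u2074')) : 0;
        }
    }
}
```

Both Double.parseDouble and Integer.parseInt throw NumberFormatException on malformed input, which matches the exception documented for this method.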
-
parseObject
Parses text from a string to produce a unit. The default implementation delegates to parse(CharSequence) and wraps the ParserException into a ParseException for compatibility with the java.text API.
- Overrides:
parseObject in class Format
- Parameters:
source - the text, part of which should be parsed.
- Returns:
a unit parsed from the string.
- Throws:
ParseException - if the given string cannot be fully parsed.
-
parseObject
Parses text from a string to produce a unit, or returns null if the parsing failed. The default implementation delegates to parse(CharSequence, ParsePosition) and catches the ParserException.
- Specified by:
parseObject in class Format
- Parameters:
source - the text, part of which should be parsed.
pos - index and error index information as described above.
- Returns:
a unit parsed from the string, or null in case of error.
-
clone
-
clone
Clones the given map, which can be either a HashMap or the instance returned by Collections.emptyMap().
-