Class CsvSchema

  • All Implemented Interfaces:
    com.fasterxml.jackson.core.FormatSchema, java.io.Serializable, java.lang.Iterable<CsvSchema.Column>

    public class CsvSchema
    extends java.lang.Object
    implements com.fasterxml.jackson.core.FormatSchema, java.lang.Iterable<CsvSchema.Column>, java.io.Serializable
    Simple FormatSchema sub-type that defines properties of a CSV document to read or write. Properties supported currently are:
    • columns (List of ColumnDef) [default: empty List]: Ordered list of columns (which may be empty, see below). Each column has a name (mandatory) as well as a type (optional; if not defined, defaults to "String").
    • useHeader (boolean) [default: false]: whether the first line of physical document defines column names (true) or not (false): if enabled, parser will take first-line values to define column names; and generator will output column names as the first line
    • quoteChar (char) [default: double-quote ('"')]: character used for quoting values that contain quote characters or linefeeds.
    • columnSeparator (char) [default: comma (',')]: character used to separate values. Other commonly used values include tab ('\t') and pipe ('|')
    • arrayElementSeparator (String) [default: semicolon (";")]: string used to separate array elements.
    • lineSeparator (String) [default: "\n"]: string used to separate data rows. Only used by generator; parser accepts three standard linefeeds ("\r", "\r\n", "\n").
    • escapeChar (int) [default: -1 meaning "none"]: character, if any, used to escape values. Most commonly defined as backslash ('\'). Only used by parser; generator only uses quoting, including doubling up of quotes to indicate quote char itself.
    • skipFirstDataRow (boolean) [default: false]: whether the first data line (either first line of the document, if useHeader=false, or second, if useHeader=true) should be completely ignored by parser. Needed to support CSV-like file formats that include additional non-data content before real data begins (specifically some database dumps do this)
    • nullValue (String) [default: "" (empty String)]: When asked to write Java `null`, this String value will be used instead.
      Since 2.6, this value is also recognized when reading values.
    • strictHeaders (boolean) [default: false] (added in Jackson 2.7): whether names of columns defined in the schema MUST match the actual declaration from the header row (if header row handling is enabled): if true, they must match, and an exception is thrown if the order differs; if false, no verification is performed.
    • allowComments (boolean) [default: false]: whether lines that start with character "#" are processed as comment lines and skipped/ignored.
    • anyProperty (String) [default: none]: if "any properties" (properties for 'extra' columns; ones not specified in schema) are enabled, they are mapped to this name: leaving it as null disables use of "any properties" (and they are either ignored, or an exception is thrown, depending on other settings); setting it to a non-null String value will expose all extra properties under one specified name. Most often used with Jackson @JsonAnySetter annotation.

      Note that schemas without any columns are legal, but if no columns are added, behavior of parser/generator is usually different and content will be exposed as logical Arrays instead of Objects.
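The interplay of useHeader, skipFirstDataRow and allowComments described in the property list can be sketched with plain Java collections. This is not the parser's implementation, only a minimal model of which physical lines end up as data rows. The class and method names are hypothetical, and the sketch assumes that comment lines are dropped before the header line is looked for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RowSelectionSketch {
    /** Returns the lines that would be treated as data rows under the three flags. */
    static List<String> dataRows(List<String> lines, boolean useHeader,
                                 boolean skipFirstDataRow, boolean allowComments) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            // allowComments: lines starting with '#' are skipped entirely
            if (allowComments && line.startsWith("#")) {
                continue;
            }
            kept.add(line);
        }
        int start = 0;
        // useHeader: the first remaining line names the columns, it is not data
        if (useHeader && start < kept.size()) {
            start++;
        }
        // skipFirstDataRow: the next line (first or second) is ignored as well
        if (skipFirstDataRow && start < kept.size()) {
            start++;
        }
        return new ArrayList<>(kept.subList(start, kept.size()));
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList(
                "# exported by a tool", "id,name", "---,---", "1,Ann", "2,Bob");
        System.out.println(dataRows(doc, true, true, true));
    }
}
```

With all three flags enabled, the comment line, the header line and the separator line are consumed, and only the last two lines remain as data.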

      There are 4 ways to create CsvSchema instances:

    See Also:
    Serialized Form
    • Field Detail

      • ENCODING_FEATURE_USE_HEADER

        protected static final int ENCODING_FEATURE_USE_HEADER
        See Also:
        Constant Field Values
      • ENCODING_FEATURE_SKIP_FIRST_DATA_ROW

        protected static final int ENCODING_FEATURE_SKIP_FIRST_DATA_ROW
        See Also:
        Constant Field Values
      • ENCODING_FEATURE_ALLOW_COMMENTS

        protected static final int ENCODING_FEATURE_ALLOW_COMMENTS
        See Also:
        Constant Field Values
      • ENCODING_FEATURE_REORDER_COLUMNS

        protected static final int ENCODING_FEATURE_REORDER_COLUMNS
        See Also:
        Constant Field Values
      • ENCODING_FEATURE_STRICT_HEADERS

        protected static final int ENCODING_FEATURE_STRICT_HEADERS
        See Also:
        Constant Field Values
      • DEFAULT_ENCODING_FEATURES

        protected static final int DEFAULT_ENCODING_FEATURES
        See Also:
        Constant Field Values
      • NO_CHARS

        protected static final char[] NO_CHARS
      • DEFAULT_COLUMN_SEPARATOR

        public static final char DEFAULT_COLUMN_SEPARATOR
        Default separator for column values is comma (hence "Comma-Separated Values")
        See Also:
        Constant Field Values
      • DEFAULT_ARRAY_ELEMENT_SEPARATOR

        public static final java.lang.String DEFAULT_ARRAY_ELEMENT_SEPARATOR
        Default separator for array elements within a column value is semicolon.
        See Also:
        Constant Field Values
      • NO_ARRAY_ELEMENT_SEPARATOR

        public static final java.lang.String NO_ARRAY_ELEMENT_SEPARATOR
        Marker for the case where no array element separator is used
        See Also:
        Constant Field Values
      • DEFAULT_ANY_PROPERTY_NAME

        public static final java.lang.String DEFAULT_ANY_PROPERTY_NAME
        By default no "any properties" (properties for 'extra' columns; ones not specified in schema) are used, so null is used as marker.
        Since:
        2.7
      • DEFAULT_NULL_VALUE

        public static final char[] DEFAULT_NULL_VALUE
        By default, nulls are written as empty Strings (""); and no coercion is performed from any String (higher level databind may, however, coerce Strings into Java nulls). To use automatic coercion on reading, null value must be set explicitly to empty String ("").

        NOTE: before 2.6, this value defaulted to an empty char[]; it was changed to Java null in 2.6.
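The null-value rules above (nulls written as "" by default, and no coercion on reading unless a null value is set explicitly) can be modelled in a few lines. This is a hedged sketch of the described behavior, not code from the library; `NullValueSketch` and its methods are hypothetical names.

```java
final class NullValueSketch {
    // Java null here means "no null value configured" (the 2.6+ default)
    private final String nullValue;

    NullValueSketch(String nullValue) {
        this.nullValue = nullValue;
    }

    /** Writing: a Java null becomes the configured marker, or "" if none is set. */
    String write(String value) {
        if (value != null) {
            return value;
        }
        return nullValue == null ? "" : nullValue;
    }

    /** Reading: a cell is coerced to null only if a marker is configured and matches. */
    String read(String cell) {
        return (nullValue != null && nullValue.equals(cell)) ? null : cell;
    }

    public static void main(String[] args) {
        NullValueSketch unset = new NullValueSketch(null);
        NullValueSketch empty = new NullValueSketch("");
        System.out.println("unset reads \"\" as null: " + (unset.read("") == null));
        System.out.println("empty reads \"\" as null: " + (empty.read("") == null));
    }
}
```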

      • DEFAULT_ESCAPE_CHAR

        public static final int DEFAULT_ESCAPE_CHAR
        By default, no escape character is used -- this is denoted by int value that does not map to a valid character
        See Also:
        Constant Field Values
      • DEFAULT_LINEFEED

        public static final char[] DEFAULT_LINEFEED
      • _columns

        protected final CsvSchema.Column[] _columns
        Column definitions, needed for optional header and/or mapping of field names to column positions.
      • _columnsByName

        protected final java.util.Map<java.lang.String,CsvSchema.Column> _columnsByName
      • _features

        protected int _features
        Bitflag for general-purpose on/off features.
        Since:
        2.5
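A bitflag field like `_features` is typically read and updated with bitwise operations. The sketch below illustrates that general technique only. The constant values are hypothetical (the actual values are listed under "Constant Field Values" and are not reproduced here), and `withFeature` is an illustrative helper, not the library's `_withFeature`.

```java
final class FeatureFlagsSketch {
    // Hypothetical bit values, chosen for illustration only
    static final int USE_HEADER = 0x01;
    static final int SKIP_FIRST_DATA_ROW = 0x02;
    static final int ALLOW_COMMENTS = 0x04;

    /** Returns a new flag word with the given feature switched on or off. */
    static int withFeature(int features, int feature, boolean state) {
        return state ? (features | feature) : (features & ~feature);
    }

    /** Tests whether the given feature bit is set. */
    static boolean isEnabled(int features, int feature) {
        return (features & feature) != 0;
    }

    public static void main(String[] args) {
        int features = 0;
        features = withFeature(features, USE_HEADER, true);
        features = withFeature(features, ALLOW_COMMENTS, true);
        features = withFeature(features, USE_HEADER, false);
        System.out.println(isEnabled(features, USE_HEADER));
        System.out.println(isEnabled(features, ALLOW_COMMENTS));
    }
}
```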
      • _columnSeparator

        protected final char _columnSeparator
      • _arrayElementSeparator

        protected final java.lang.String _arrayElementSeparator
      • _quoteChar

        protected final int _quoteChar
      • _escapeChar

        protected final int _escapeChar
      • _lineSeparator

        protected final char[] _lineSeparator
      • _nullValue

        protected final char[] _nullValue
        Since:
        2.5
      • _nullValueAsString

        protected transient java.lang.String _nullValueAsString
      • _anyPropertyName

        protected final java.lang.String _anyPropertyName
        If "any properties" (properties for 'extra' columns; ones not specified in schema) are enabled, they are mapped to this name: leaving it as null disables use of "any properties" (and they are either ignored, or an exception is thrown, depending on other settings); setting it to a non-null String value will expose all extra properties under one specified name.
        Since:
        2.7
    • Constructor Detail

      • CsvSchema

        public CsvSchema(CsvSchema.Column[] columns,
                         int features,
                         char columnSeparator,
                         int quoteChar,
                         int escapeChar,
                         char[] lineSeparator,
                         java.lang.String arrayElementSeparator,
                         char[] nullValue,
                         java.lang.String anyPropertyName)
        Since:
        2.7
      • CsvSchema

        protected CsvSchema(CsvSchema.Column[] columns,
                            int features,
                            char columnSeparator,
                            int quoteChar,
                            int escapeChar,
                            char[] lineSeparator,
                            java.lang.String arrayElementSeparator,
                            char[] nullValue,
                            java.util.Map<java.lang.String,CsvSchema.Column> columnsByName,
                            java.lang.String anyPropertyName)
        Copy constructor used for creating variants using withXxx() methods.
      • CsvSchema

        protected CsvSchema(CsvSchema base,
                            CsvSchema.Column[] columns)
        Copy constructor used for creating variants using sortedBy() methods.
      • CsvSchema

        protected CsvSchema(CsvSchema base,
                            int features)
        Copy constructor used for creating variants for on/off features
        Since:
        2.5
    • Method Detail

      • _link

        private static CsvSchema.Column[] _link(CsvSchema.Column[] orig)
        Helper method used for chaining columns together using next-linkage, as well as ensuring that indexes are correct.
      • emptySchema

        public static CsvSchema emptySchema()
        Accessor for creating a "default" CSV schema instance, with the following settings:
        • Does NOT use header line
        • Uses double quotes ('"') for quoting of field values (if necessary)
        • Uses comma (',') as the field separator
        • Uses Unix linefeed ('\n') as row separator
        • Does NOT use any escape characters
        • Does NOT have any columns defined
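To make these defaults concrete, here is a small stand-alone sketch of writing one row with a comma separator, double-quote quoting and a "\n" row terminator. It does not call the library. The rule for when a value needs quoting (it contains the separator, a quote or a linefeed) is an assumption of the sketch, and the class and method names are hypothetical.

```java
final class DefaultRowWriterSketch {
    static final char SEPARATOR = ',';
    static final char QUOTE = '"';
    static final String LINEFEED = "\n";

    /** Quotes a value only if needed, doubling any embedded quote characters. */
    static String quoteIfNeeded(String value) {
        boolean needsQuoting = value.indexOf(SEPARATOR) >= 0
                || value.indexOf(QUOTE) >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return value;
        }
        // No escape character is used: a quote inside a value is written twice
        String doubled = value.replace("\"", "\"\"");
        return QUOTE + doubled + QUOTE;
    }

    /** Joins the values with the separator and terminates the row with a linefeed. */
    static String writeRow(String... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(SEPARATOR);
            }
            sb.append(quoteIfNeeded(values[i]));
        }
        return sb.append(LINEFEED).toString();
    }

    public static void main(String[] args) {
        System.out.print(writeRow("a", "b,c", "say \"hi\""));
    }
}
```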
      • rebuild

        public CsvSchema.Builder rebuild()
        Helper method for constructing Builder that can be used to create modified schema.
      • withUseHeader

        public CsvSchema withUseHeader(boolean state)
      • withColumnReordering

        public CsvSchema withColumnReordering(boolean state)
        Returns a copy of this instance with the column reordering flag set to the given state.
        Parameters:
        state - New value for setting
        Returns:
        A copy of this schema with the column reordering feature set as requested.
        Since:
        2.7
      • withStrictHeaders

        public CsvSchema withStrictHeaders(boolean state)
        Returns a copy of this instance with the strict headers flag set to the given state.
        Parameters:
        state - New value for setting
        Returns:
        A copy of this schema with the strict headers feature set as requested.
        Since:
        2.7
      • withHeader

        public CsvSchema withHeader()
        Helper method for constructing and returning schema instance that is similar to this one, except that it will be using header line.
      • withoutHeader

        public CsvSchema withoutHeader()
        Helper method for constructing and returning schema instance that is similar to this one, except that it will not be using header line.
      • withSkipFirstDataRow

        public CsvSchema withSkipFirstDataRow(boolean state)
      • withAllowComments

        public CsvSchema withAllowComments(boolean state)
        Method to indicate whether "hash comments" are allowed for document described by this schema.
        Since:
        2.5
      • withComments

        public CsvSchema withComments()
        Method to indicate that "hash comments" ARE allowed for document described by this schema.
        Since:
        2.5
      • withoutComments

        public CsvSchema withoutComments()
        Method to indicate that "hash comments" are NOT allowed for document described by this schema.
        Since:
        2.5
      • _withFeature

        protected CsvSchema _withFeature(int feature,
                                         boolean state)
      • withColumnSeparator

        public CsvSchema withColumnSeparator(char sep)
      • withQuoteChar

        public CsvSchema withQuoteChar(char c)
      • withoutQuoteChar

        public CsvSchema withoutQuoteChar()
      • withEscapeChar

        public CsvSchema withEscapeChar(char c)
      • withoutEscapeChar

        public CsvSchema withoutEscapeChar()
      • withArrayElementSeparator

        public CsvSchema withArrayElementSeparator(java.lang.String separator)
        Since:
        2.7
      • withoutArrayElementSeparator

        public CsvSchema withoutArrayElementSeparator()
        Since:
        2.5
      • withLineSeparator

        public CsvSchema withLineSeparator(java.lang.String sep)
      • withNullValue

        public CsvSchema withNullValue(java.lang.String nvl)
        Since:
        2.5
      • withoutColumns

        public CsvSchema withoutColumns()
      • withColumnsFrom

        public CsvSchema withColumnsFrom(CsvSchema toAppend)
        Mutant factory method that will try to combine columns of this schema with those from `toAppend`, starting with columns of this instance, and ignoring duplicates (if any) from argument `toAppend`. All settings aside from column sets are copied from `this` instance.

        As with all `withXxx()` methods this method never modifies `this` but either returns it unmodified (if no new columns found from `toAppend`), or constructs a new instance and returns that.

        Returns:
        Either this schema (if nothing changed), or newly constructed CsvSchema with appended columns.
        Since:
        2.9
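The merge rule for withColumnsFrom (own columns first, duplicates from `toAppend` ignored, same instance returned when nothing is new) can be sketched over plain lists of column names. This models the described contract only and is not the library's code; `ColumnMergeSketch` is a hypothetical name, and real columns carry more than a name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ColumnMergeSketch {
    /** Appends names from toAppend that are not already present; never mutates its inputs. */
    static List<String> withColumnsFrom(List<String> own, List<String> toAppend) {
        List<String> merged = new ArrayList<>(own);
        for (String name : toAppend) {
            if (!merged.contains(name)) {
                merged.add(name);
            }
        }
        // Nothing new was found: hand back the original instance unchanged
        if (merged.size() == own.size()) {
            return own;
        }
        return Collections.unmodifiableList(merged);
    }

    public static void main(String[] args) {
        List<String> own = Arrays.asList("id", "name");
        System.out.println(withColumnsFrom(own, Arrays.asList("name", "age")));
        System.out.println(withColumnsFrom(own, Arrays.asList("id")) == own);
    }
}
```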
      • withColumn

        public CsvSchema withColumn(java.lang.String columnName,
                                    java.util.function.UnaryOperator<CsvSchema.Column> transformer)
        Mutant factory method that will try to replace specified column with changed definition (but same name), leaving other columns as-is.

        As with all `withXxx()` methods this method never modifies `this` but either returns it unmodified (if no change to column), or constructs a new schema instance and returns that.

        Parameters:
        columnName - Name of column to replace
        transformer - Transformation to apply to the column
        Returns:
        Either this schema (if column did not change), or newly constructed CsvSchema with changed column
        Since:
        2.18
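The copy-on-change contract described above can be illustrated with a small stand-in type. `Col` below is a hypothetical substitute for a column definition, not `CsvSchema.Column`, and the sketch is not the library's implementation. The source does not say what happens when the name is unknown; throwing `IllegalArgumentException` is an assumption of this sketch.

```java
import java.util.function.UnaryOperator;

final class ColumnReplaceSketch {
    /** Hypothetical stand-in for a column definition: a name plus a type label. */
    static final class Col {
        final String name;
        final String type;

        Col(String name, String type) {
            this.name = name;
            this.type = type;
        }

        Col withType(String newType) {
            return newType.equals(type) ? this : new Col(name, newType);
        }
    }

    /** Returns the same array if the transformer changed nothing, else a modified copy. */
    static Col[] withColumn(Col[] columns, String columnName, UnaryOperator<Col> transformer) {
        for (int i = 0; i < columns.length; i++) {
            if (columns[i].name.equals(columnName)) {
                Col changed = transformer.apply(columns[i]);
                if (changed == columns[i]) {
                    return columns; // no change: reuse the original
                }
                Col[] copy = columns.clone();
                copy[i] = changed;
                return copy;
            }
        }
        // Assumption of this sketch: an unknown column name is reported as an error
        throw new IllegalArgumentException("No column named '" + columnName + "'");
    }

    public static void main(String[] args) {
        Col[] cols = { new Col("id", "STRING"), new Col("age", "STRING") };
        Col[] updated = withColumn(cols, "age", c -> c.withType("NUMBER"));
        System.out.println(updated[1].type + " / original still " + cols[1].type);
    }
}
```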
      • withColumn

        public CsvSchema withColumn(int columnIndex,
                                    java.util.function.UnaryOperator<CsvSchema.Column> transformer)
        Mutant factory method that will try to replace specified column with changed definition (but same name), leaving other columns as-is.

        As with all `withXxx()` methods this method never modifies `this` but either returns it unmodified (if no change to column), or constructs a new schema instance and returns that.

        Parameters:
        columnIndex - Index of column to replace
        transformer - Transformation to apply to the column
        Returns:
        Either this schema (if column did not change), or newly constructed CsvSchema with changed column
        Since:
        2.18
      • withAnyPropertyName

        public CsvSchema withAnyPropertyName(java.lang.String name)
        Since:
        2.7
      • sortedBy

        public CsvSchema sortedBy(java.lang.String... columnNames)
        Mutant factory method that will construct a new instance in which columns are sorted based on names given as argument. Columns not listed in argument will be sorted after those within list, using existing ordering.

        For example, schema that has columns:

        "a", "d", "c", "b"
        
        ordered with schema.sortedBy("a", "b") would result in an instance that has its columns in this order:
        "a", "b", "d", "c"
        
        Since:
        2.4
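The ordering rule in this example (listed names first, remaining columns afterwards in their existing order) can be reproduced with a `LinkedHashSet`. This is a sketch of the rule over plain column names, not the library's implementation. That names missing from the schema are silently ignored is an assumption of the sketch, and `SortedBySketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

final class SortedBySketch {
    /** Puts the named columns first (in argument order), then the rest in existing order. */
    static List<String> sortedBy(List<String> columns, String... columnNames) {
        LinkedHashSet<String> ordered = new LinkedHashSet<>();
        for (String name : columnNames) {
            // Assumption: names that are not columns of the schema are skipped
            if (columns.contains(name)) {
                ordered.add(name);
            }
        }
        // Adding everything keeps first-insertion order, so listed names stay in front
        ordered.addAll(columns);
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        System.out.println(sortedBy(Arrays.asList("a", "d", "c", "b"), "a", "b"));
    }
}
```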
      • sortedBy

        public CsvSchema sortedBy(java.util.Comparator<java.lang.String> cmp)
        Mutant factory method that will construct a new instance in which columns are sorted using given Comparator over column names.
        Since:
        2.4
      • getSchemaType

        public java.lang.String getSchemaType()
        Specified by:
        getSchemaType in interface com.fasterxml.jackson.core.FormatSchema
      • usesHeader

        public boolean usesHeader()
      • reordersColumns

        public boolean reordersColumns()
      • skipsFirstDataRow

        public boolean skipsFirstDataRow()
      • allowsComments

        public boolean allowsComments()
      • strictHeaders

        public boolean strictHeaders()
      • getColumnSeparator

        public char getColumnSeparator()
      • getArrayElementSeparator

        public java.lang.String getArrayElementSeparator()
      • getQuoteChar

        public int getQuoteChar()
      • getEscapeChar

        public int getEscapeChar()
      • getLineSeparator

        public char[] getLineSeparator()
      • getNullValue

        public char[] getNullValue()
        Returns:
        Null value defined, as char array, if one is defined to be recognized; Java null if not.
        Since:
        2.5
      • getNullValueOrEmpty

        public char[] getNullValueOrEmpty()
        Same as getNullValue() except that undefined null value (one that remains as null, or explicitly set as such) will be returned as empty char[]
        Since:
        2.6
      • getNullValueString

        public java.lang.String getNullValueString()
        Since:
        2.6
      • usesQuoteChar

        public boolean usesQuoteChar()
      • usesEscapeChar

        public boolean usesEscapeChar()
      • hasArrayElementSeparator

        public boolean hasArrayElementSeparator()
        Since:
        2.5
      • getAnyPropertyName

        public java.lang.String getAnyPropertyName()
        Since:
        2.7
      • size

        public int size()
        Accessor for finding out how many columns this schema defines.
        Returns:
        Number of columns this schema defines
      • column

        public CsvSchema.Column column(int index)
        Accessor for column at specified index (0-based); the index must satisfy
            0 <= index < size()
        
      • columnIndex

        public int columnIndex(java.lang.String name)
        Method for finding index of a named column within this schema.
        Parameters:
        name - Name of column to find
        Returns:
        Index of the specified column, if one exists; -1 if not
        Since:
        2.18
      • columnName

        public java.lang.String columnName(int index)
        Since:
        2.6
      • column

        public CsvSchema.Column column(java.lang.String name,
                                       int probableIndex)
        Optimized variant where a hint is given as to likely index of the column name.
        Since:
        2.6
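One plausible way such a hinted lookup can work is to test the hinted position first and fall back to a by-name map only on a miss. The sketch below shows that idea under this assumption. It is not taken from the library, it returns an index rather than a column object for brevity, and `ColumnLookupSketch` and `indexOf` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class ColumnLookupSketch {
    private final String[] names;
    private final Map<String, Integer> indexByName = new HashMap<>();

    ColumnLookupSketch(String... names) {
        this.names = names.clone();
        for (int i = 0; i < names.length; i++) {
            indexByName.put(names[i], i);
        }
    }

    /** Returns the index of the named column, or -1 if there is none. */
    int indexOf(String name, int probableIndex) {
        // Fast path: the hint is in range and points at the requested column
        if (probableIndex >= 0 && probableIndex < names.length
                && names[probableIndex].equals(name)) {
            return probableIndex;
        }
        // Slow path: the hint was wrong, fall back to the name map
        Integer idx = indexByName.get(name);
        return idx == null ? -1 : idx;
    }

    public static void main(String[] args) {
        ColumnLookupSketch schema = new ColumnLookupSketch("id", "name", "age");
        System.out.println(schema.indexOf("name", 1));
        System.out.println(schema.indexOf("age", 0));
    }
}
```

The fast path pays off when rows are read in schema order, since the hint is then usually correct and no hashing is needed.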
      • getColumnNames

        public java.util.List<java.lang.String> getColumnNames()
        Accessor for getting names of included columns, in the order they are included in the schema.
        Since:
        2.14
      • getColumnNames

        public java.util.Collection<java.lang.String> getColumnNames(java.util.Collection<java.lang.String> names)
        Accessor for getting names of included columns, added to the given Collection.
        Since:
        2.14
      • getColumnDesc

        public java.lang.String getColumnDesc()
        Method for getting description of column definitions in developer-readable form
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • _validArrayElementSeparator

        protected static java.lang.String _validArrayElementSeparator(java.lang.String sep)