Class CsvSchema

java.lang.Object
com.fasterxml.jackson.dataformat.csv.CsvSchema
All Implemented Interfaces:
com.fasterxml.jackson.core.FormatSchema, Serializable, Iterable<CsvSchema.Column>

public class CsvSchema extends Object implements com.fasterxml.jackson.core.FormatSchema, Iterable<CsvSchema.Column>, Serializable
Simple FormatSchema sub-type that defines properties of a CSV document to read or write. Properties supported currently are:
  • columns (List of ColumnDef) [default: empty List]: Ordered list of columns (which may be empty, see below). Each column has name (mandatory) as well as type (optional; if not defined, defaults to "String").
  • useHeader (boolean) [default: false]: whether the first line of physical document defines column names (true) or not (false): if enabled, parser will take first-line values to define column names; and generator will output column names as the first line
  • quoteChar (char) [default: double-quote ('"')]: character used for quoting values that contain quote characters or linefeeds.
  • columnSeparator (char) [default: comma (',')]: character used to separate values. Other commonly used values include tab ('\t') and pipe ('|')
  • arrayElementSeparator (String) [default: semicolon (";")]: string used to separate array elements.
  • lineSeparator (String) [default: "\n"]: string used to separate data rows. Only used by generator; parser accepts three standard linefeeds ("\r", "\r\n", "\n").
  • escapeChar (int) [default: -1 meaning "none"]: character, if any, used to escape values. Most commonly defined as backslash ('\'). Only used by parser; generator only uses quoting, including doubling up of quotes to indicate quote char itself.
  • skipFirstDataRow (boolean) [default: false]: whether the first data line (either first line of the document, if useHeader=false, or second, if useHeader=true) should be completely ignored by parser. Needed to support CSV-like file formats that include additional non-data content before real data begins (specifically some database dumps do this)
  • nullValue (String) [default: "" (empty String)]: When asked to write Java `null`, this String value will be used instead.
    With 2.6, value will also be recognized during value reads.
  • strictHeaders (boolean) [default: false] (added in Jackson 2.7): whether names of columns defined in the schema MUST match the actual declaration from the header row (if header row handling is enabled): if true, they must match and an exception is thrown if the order differs; if false, no verification is performed.
  • allowComments (boolean) [default: false]: whether lines that start with character "#" are processed as comment lines and skipped/ignored.
  • anyProperty (String) [default: none]: if "any properties" (properties for 'extra' columns; ones not specified in schema) are enabled, they are mapped to this name: leaving it as null disables use of "any properties" (and they are either ignored, or an exception is thrown, depending on other settings); setting it to a non-null String value will expose all extra properties under one specified name. Most often used with Jackson @JsonAnySetter annotation.

    Note that schemas without any columns are legal, but if no columns are added, behavior of parser/generator is usually different and content will be exposed as logical Arrays instead of Objects.

    There are 4 ways to create CsvSchema instances:
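    Two of these routes, building with CsvSchema.Builder and deriving variants through withXxx() or rebuild(), can be sketched as follows. This is a minimal illustration, not a canonical recipe: addColumn(...) and setUseHeader(...) are CsvSchema.Builder methods that are not documented in this section, and the class, method, and column names are made up for the example.

```java
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class SchemaCreationSketch {
    // Build a schema from scratch with CsvSchema.Builder.
    static CsvSchema viaBuilder() {
        return CsvSchema.builder()
                .addColumn("id", CsvSchema.ColumnType.NUMBER)
                .addColumn("name") // no type given, so the column is a String column
                .setUseHeader(true)
                .build();
    }

    // Derive a variant with withXxx() methods; the base schema is not modified.
    static CsvSchema viaWithMethods(CsvSchema base) {
        return base.withColumnSeparator('|').withoutHeader();
    }

    // rebuild() returns a Builder pre-populated from an existing schema.
    static CsvSchema viaRebuild(CsvSchema base) {
        return base.rebuild().addColumn("email").build();
    }

    public static void main(String[] args) {
        CsvSchema base = viaBuilder();
        System.out.println(base.size() + " columns, header=" + base.usesHeader());
        System.out.println("separator=" + viaWithMethods(base).getColumnSeparator());
        System.out.println(viaRebuild(base).size() + " columns after rebuild");
    }
}
```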

  • Field Details

    • serialVersionUID

      private static final long serialVersionUID
    • ENCODING_FEATURE_USE_HEADER

      protected static final int ENCODING_FEATURE_USE_HEADER
    • ENCODING_FEATURE_SKIP_FIRST_DATA_ROW

      protected static final int ENCODING_FEATURE_SKIP_FIRST_DATA_ROW
    • ENCODING_FEATURE_ALLOW_COMMENTS

      protected static final int ENCODING_FEATURE_ALLOW_COMMENTS
    • ENCODING_FEATURE_REORDER_COLUMNS

      protected static final int ENCODING_FEATURE_REORDER_COLUMNS
    • ENCODING_FEATURE_STRICT_HEADERS

      protected static final int ENCODING_FEATURE_STRICT_HEADERS
    • DEFAULT_ENCODING_FEATURES

      protected static final int DEFAULT_ENCODING_FEATURES
    • NO_CHARS

      protected static final char[] NO_CHARS
    • DEFAULT_COLUMN_SEPARATOR

      public static final char DEFAULT_COLUMN_SEPARATOR
      Default separator for column values is comma (hence "Comma-Separated Values")
    • DEFAULT_ARRAY_ELEMENT_SEPARATOR

      public static final String DEFAULT_ARRAY_ELEMENT_SEPARATOR
      Default separator for array elements within a column value is semicolon.
    • NO_ARRAY_ELEMENT_SEPARATOR

      public static final String NO_ARRAY_ELEMENT_SEPARATOR
      Marker for the case where no array element separator is used
    • DEFAULT_ANY_PROPERTY_NAME

      public static final String DEFAULT_ANY_PROPERTY_NAME
      By default no "any properties" (properties for 'extra' columns; ones not specified in schema) are used, so null is used as marker.
      Since:
      2.7
    • DEFAULT_QUOTE_CHAR

      public static final char DEFAULT_QUOTE_CHAR
    • DEFAULT_NULL_VALUE

      public static final char[] DEFAULT_NULL_VALUE
      By default, nulls are written as empty Strings (""); and no coercion is performed from any String (higher level databind may, however, coerce Strings into Java nulls). To use automatic coercion on reading, null value must be set explicitly to empty String ("").

      NOTE: before 2.6, this value defaulted to an empty char[]; it was changed to Java null in 2.6.

    • DEFAULT_ESCAPE_CHAR

      public static final int DEFAULT_ESCAPE_CHAR
      By default, no escape character is used -- this is denoted by int value that does not map to a valid character
    • DEFAULT_LINEFEED

      public static final char[] DEFAULT_LINEFEED
    • NO_COLUMNS

      protected static final CsvSchema.Column[] NO_COLUMNS
    • _columns

      protected final CsvSchema.Column[] _columns
      Column definitions, needed for optional header and/or mapping of field names to column positions.
    • _columnsByName

      protected final Map<String,CsvSchema.Column> _columnsByName
    • _features

      protected int _features
      Bitflag for general-purpose on/off features.
      Since:
      2.5
    • _columnSeparator

      protected final char _columnSeparator
    • _arrayElementSeparator

      protected final String _arrayElementSeparator
    • _quoteChar

      protected final int _quoteChar
    • _escapeChar

      protected final int _escapeChar
    • _lineSeparator

      protected final char[] _lineSeparator
    • _nullValue

      protected final char[] _nullValue
      Since:
      2.5
    • _nullValueAsString

      protected transient String _nullValueAsString
    • _anyPropertyName

      protected final String _anyPropertyName
      If "any properties" (properties for 'extra' columns; ones not specified in schema) are enabled, they are mapped to this name: leaving it as null disables use of "any properties" (and they are either ignored, or an exception is thrown, depending on other settings); setting it to a non-null String value will expose all extra properties under one specified name.
      Since:
      2.7
  • Constructor Details

    • CsvSchema

      public CsvSchema(CsvSchema.Column[] columns, int features, char columnSeparator, int quoteChar, int escapeChar, char[] lineSeparator, String arrayElementSeparator, char[] nullValue, String anyPropertyName)
      Since:
      2.7
    • CsvSchema

      protected CsvSchema(CsvSchema.Column[] columns, int features, char columnSeparator, int quoteChar, int escapeChar, char[] lineSeparator, String arrayElementSeparator, char[] nullValue, Map<String,CsvSchema.Column> columnsByName, String anyPropertyName)
      Copy constructor used for creating variants using withXxx() methods.
    • CsvSchema

      protected CsvSchema(CsvSchema base, CsvSchema.Column[] columns)
      Copy constructor used for creating variants using sortedBy() methods.
    • CsvSchema

      protected CsvSchema(CsvSchema base, int features)
      Copy constructor used for creating variants for on/off features
      Since:
      2.5
  • Method Details

    • _link

      private static CsvSchema.Column[] _link(CsvSchema.Column[] orig)
      Helper method used for chaining columns together using next-linkage, as well as ensuring that indexes are correct.
    • builder

      public static CsvSchema.Builder builder()
    • emptySchema

      public static CsvSchema emptySchema()
      Accessor for creating a "default" CSV schema instance, with following settings:
      • Does NOT use header line
      • Uses double quotes ('"') for quoting of field values (if necessary)
      • Uses comma (',') as the field separator
      • Uses Unix linefeed ('\n') as row separator
      • Does NOT use any escape characters
      • Does NOT have any columns defined
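      The defaults listed above can be checked through the accessors documented further down this page. A small sketch (the class and method names are illustrative):

```java
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class EmptySchemaDefaults {
    // Collects the default settings of emptySchema() into one readable line.
    static String describe() {
        CsvSchema s = CsvSchema.emptySchema();
        return "header=" + s.usesHeader()
                + " quote=" + (char) s.getQuoteChar()
                + " sep=" + s.getColumnSeparator()
                + " escape=" + s.usesEscapeChar()
                + " columns=" + s.size()
                + " unixLf=" + "\n".equals(new String(s.getLineSeparator()));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```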
    • rebuild

      public CsvSchema.Builder rebuild()
      Helper method for constructing Builder that can be used to create modified schema.
    • withUseHeader

      public CsvSchema withUseHeader(boolean state)
    • withColumnReordering

      public CsvSchema withColumnReordering(boolean state)
      Returns a copy of this instance with the column reordering flag set to the given state.
      Parameters:
      state - New value for setting
      Returns:
      A copy of this schema with the column reordering feature set to the given state.
      Since:
      2.7
    • withStrictHeaders

      public CsvSchema withStrictHeaders(boolean state)
      Returns a copy of this instance with the strict headers flag set to the given state.
      Parameters:
      state - New value for setting
      Returns:
      A copy of this schema with the strict headers feature set to the given state.
      Since:
      2.7
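      A hedged sketch of how strict header checking might be exercised: the schema declares two columns and enables header handling, and a document whose header row names a different column is expected to be rejected with an exception. CsvMapper, MappingIterator, and the Builder methods are real Jackson types that are not documented in this section; the class name, column names, and sample data are illustrative.

```java
import java.io.IOException;
import java.util.Map;

import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class StrictHeadersSketch {
    // Returns true if the document's header row is accepted by the strict schema.
    static boolean headerAccepted(String csv) {
        CsvSchema schema = CsvSchema.builder()
                .addColumn("id")
                .addColumn("name")
                .setUseHeader(true)
                .build()
                .withStrictHeaders(true);
        try (MappingIterator<Map<String, String>> it =
                     new CsvMapper().readerFor(Map.class).with(schema).readValues(csv)) {
            it.readAll();
            return true;
        } catch (IOException e) {
            // A header mismatch surfaces as an IOException subtype
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(headerAccepted("id,name\n1,Bob\n"));
        System.out.println(headerAccepted("id,title\n1,Bob\n"));
    }
}
```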
    • withHeader

      public CsvSchema withHeader()
      Helper method for constructing and returning schema instance that is similar to this one, except that it will be using header line.
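      A common use of withHeader() is to let the parser take column names from the first line of the document. The sketch below reads each row into a Map keyed by those names; CsvMapper and MappingIterator are real Jackson types not covered in this section, and the class name and sample data are illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Map;

import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class HeaderReadSketch {
    // Reads all rows; column names come from the first line of the document.
    static List<Map<String, String>> read(String csv) {
        CsvSchema schema = CsvSchema.emptySchema().withHeader();
        try (MappingIterator<Map<String, String>> it =
                     new CsvMapper().readerFor(Map.class).with(schema).readValues(csv)) {
            return it.readAll();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (Map<String, String> row : read("id,name\n1,Bob\n2,Eve\n")) {
            System.out.println(row.get("id") + " -> " + row.get("name"));
        }
    }
}
```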
    • withoutHeader

      public CsvSchema withoutHeader()
      Helper method for constructing and returning schema instance that is similar to this one, except that it will not be using header line.
    • withSkipFirstDataRow

      public CsvSchema withSkipFirstDataRow(boolean state)
    • withAllowComments

      public CsvSchema withAllowComments(boolean state)
      Method to indicate whether "hash comments" are allowed for document described by this schema.
      Since:
      2.5
    • withComments

      public CsvSchema withComments()
      Method to indicate that "hash comments" ARE allowed for document described by this schema.
      Since:
      2.5
    • withoutComments

      public CsvSchema withoutComments()
      Method to indicate that "hash comments" are NOT allowed for document described by this schema.
      Since:
      2.5
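      To illustrate the effect of the comment setting, the sketch below reads the same document twice, once with "hash comments" allowed and once without, and counts the data rows. With comments allowed, the line starting with '#' should be skipped; otherwise it is treated as an ordinary row. The class name and sample data are illustrative, and CsvMapper and MappingIterator are not documented in this section.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;

import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class CommentsSketch {
    static final String CSV = "id,name\n# not data\n1,Bob\n";

    // Counts data rows, with or without comment-line skipping.
    static int rowCount(boolean allowComments) {
        CsvSchema schema = CsvSchema.emptySchema()
                .withHeader()
                .withAllowComments(allowComments);
        try (MappingIterator<Map<String, String>> it =
                     new CsvMapper().readerFor(Map.class).with(schema).readValues(CSV)) {
            return it.readAll().size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with comments allowed: " + rowCount(true));
        System.out.println("without: " + rowCount(false));
    }
}
```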
    • _withFeature

      protected CsvSchema _withFeature(int feature, boolean state)
    • withColumnSeparator

      public CsvSchema withColumnSeparator(char sep)
    • withQuoteChar

      public CsvSchema withQuoteChar(char c)
    • withoutQuoteChar

      public CsvSchema withoutQuoteChar()
    • withEscapeChar

      public CsvSchema withEscapeChar(char c)
    • withoutEscapeChar

      public CsvSchema withoutEscapeChar()
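      The separator, quote, and escape methods above can be chained, since each returns a schema. As a sketch, a tab-separated variant with quoting disabled and backslash escaping might be configured like this (the class and method names are illustrative):

```java
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class SeparatorSketch {
    // Tab-separated values, no quote character, backslash as escape character.
    static CsvSchema tabSeparated() {
        return CsvSchema.emptySchema()
                .withColumnSeparator('\t')
                .withoutQuoteChar()
                .withEscapeChar('\\');
    }

    public static void main(String[] args) {
        CsvSchema tsv = tabSeparated();
        System.out.println("tab separator: " + (tsv.getColumnSeparator() == '\t'));
        System.out.println("uses quote char: " + tsv.usesQuoteChar());
        System.out.println("uses escape char: " + tsv.usesEscapeChar());
    }
}
```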
    • withArrayElementSeparator

      public CsvSchema withArrayElementSeparator(String separator)
      Since:
      2.7
    • withoutArrayElementSeparator

      public CsvSchema withoutArrayElementSeparator()
      Since:
      2.5
    • withLineSeparator

      public CsvSchema withLineSeparator(String sep)
    • withNullValue

      public CsvSchema withNullValue(String nvl)
      Since:
      2.5
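      As a sketch of the nullValue property in use, the example below configures "NULL" as the marker and writes a row that contains a Java null. The use of a Map as the row, the CsvMapper writer call, and the class and column names are assumptions made for illustration; they are not part of this section.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class NullValueSketch {
    static CsvSchema schema() {
        return CsvSchema.builder()
                .addColumn("id")
                .addColumn("name")
                .build()
                .withNullValue("NULL");
    }

    // Writes one row whose "name" value is a Java null.
    static String write() {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", null);
        try {
            return new CsvMapper().writer(schema()).writeValueAsString(row);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(write());
    }
}
```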
    • withoutColumns

      public CsvSchema withoutColumns()
    • withColumnsFrom

      public CsvSchema withColumnsFrom(CsvSchema toAppend)
      Mutant factory method that will try to combine columns of this schema with those from `toAppend`, starting with columns of this instance, and ignoring duplicates (if any) from argument `toAppend`. All settings aside from column sets are copied from `this` instance.

      As with all `withXxx()` methods this method never modifies `this` but either returns it unmodified (if no new columns found from `toAppend`), or constructs a new instance and returns that.

      Returns:
      Either this schema (if nothing changed), or newly constructed CsvSchema with appended columns.
      Since:
      2.9
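      A minimal sketch of the merge described above: columns of the receiver come first, and a column of the argument is appended only if its name is not already present. The class and column names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class ColumnsFromSketch {
    // Merges columns "a", "b" with columns "b", "c" and lists the resulting names.
    static List<String> mergedNames() {
        CsvSchema left = CsvSchema.builder().addColumn("a").addColumn("b").build();
        CsvSchema right = CsvSchema.builder().addColumn("b").addColumn("c").build();
        CsvSchema merged = left.withColumnsFrom(right);

        List<String> names = new ArrayList<>();
        for (CsvSchema.Column col : merged) { // CsvSchema is Iterable<Column>
            names.add(col.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(mergedNames());
    }
}
```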
    • withColumn

      public CsvSchema withColumn(String columnName, UnaryOperator<CsvSchema.Column> transformer)
      Mutant factory method that will try to replace specified column with changed definition (but same name), leaving other columns as-is.

      As with all `withXxx()` methods this method never modifies `this` but either returns it unmodified (if no change to column), or constructs a new schema instance and returns that.

      Parameters:
      columnName - Name of column to replace
      transformer - Transformation to apply to the column
      Returns:
      Either this schema (if column did not change), or newly constructed CsvSchema with changed column
      Since:
      2.18
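      As an illustration, the transformer can be a lambda that returns a modified copy of the column. The sketch below changes the type of one column and leaves the original schema untouched; it assumes Jackson 2.18 or later, uses Column.withType(...) (a Column method not documented in this section), and the class and column names are made up.

```java
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class WithColumnSketch {
    static CsvSchema original() {
        return CsvSchema.builder().addColumn("name").addColumn("age").build();
    }

    // Replaces the "age" column with a copy typed as NUMBER.
    static CsvSchema retyped() {
        return original().withColumn("age",
                col -> col.withType(CsvSchema.ColumnType.NUMBER));
    }

    public static void main(String[] args) {
        System.out.println(original().column("age").getType());
        System.out.println(retyped().column("age").getType());
    }
}
```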
    • withColumn

      public CsvSchema withColumn(int columnIndex, UnaryOperator<CsvSchema.Column> transformer)
      Mutant factory method that will try to replace specified column with changed definition (but same name), leaving other columns as-is.

      As with all `withXxx()` methods this method never modifies `this` but either returns it unmodified (if no change to column), or constructs a new schema instance and returns that.

      Parameters:
      columnIndex - Index of column to replace
      transformer - Transformation to apply to the column
      Returns:
      Either this schema (if column did not change), or newly constructed CsvSchema with changed column
      Since:
      2.18
    • _withColumn

      protected CsvSchema _withColumn(int ix, CsvSchema.Column toReplace)
      Since:
      2.18
    • withAnyPropertyName

      public CsvSchema withAnyPropertyName(String name)
      Since:
      2.7
    • sortedBy

      public CsvSchema sortedBy(String... columnNames)
      Mutant factory method that will construct a new instance in which columns are sorted based on names given as argument. Columns not listed in argument will be sorted after those within list, using existing ordering.

      For example, schema that has columns:

      "a", "d", "c", "b"
      
      ordered with schema.sortedBy("a", "b"); would result in an instance that has columns in order:
      "a", "b", "d", "c"
      
      Since:
      2.4
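      The example above can be reproduced in code. This sketch builds the four-column schema with CsvSchema.Builder (not documented in this section) and reads the names back with columnName(int); the class name is illustrative.

```java
import java.util.ArrayList;
import java.util.List;

import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class SortedBySketch {
    // Sorts columns "a", "d", "c", "b" with sortedBy("a", "b") and lists the result.
    static List<String> sortedNames() {
        CsvSchema schema = CsvSchema.builder()
                .addColumn("a").addColumn("d").addColumn("c").addColumn("b")
                .build();
        CsvSchema sorted = schema.sortedBy("a", "b");

        List<String> names = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            names.add(sorted.columnName(i));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames()); // [a, b, d, c]
    }
}
```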
    • sortedBy

      public CsvSchema sortedBy(Comparator<String> cmp)
      Mutant factory method that will construct a new instance in which columns are sorted using given Comparator over column names.
      Since:
      2.4
    • getSchemaType

      public String getSchemaType()
      Specified by:
      getSchemaType in interface com.fasterxml.jackson.core.FormatSchema
    • usesHeader

      public boolean usesHeader()
    • reordersColumns

      public boolean reordersColumns()
    • skipsFirstDataRow

      public boolean skipsFirstDataRow()
    • allowsComments

      public boolean allowsComments()
    • strictHeaders

      public boolean strictHeaders()
    • getColumnSeparator

      public char getColumnSeparator()
    • getArrayElementSeparator

      public String getArrayElementSeparator()
    • getQuoteChar

      public int getQuoteChar()
    • getEscapeChar

      public int getEscapeChar()
    • getLineSeparator

      public char[] getLineSeparator()
    • getNullValue

      public char[] getNullValue()
      Returns:
      Null value defined, as char array, if one is defined to be recognized; Java null if not.
      Since:
      2.5
    • getNullValueOrEmpty

      public char[] getNullValueOrEmpty()
      Same as getNullValue() except that undefined null value (one that remains as null, or explicitly set as such) will be returned as empty char[]
      Since:
      2.6
    • getNullValueString

      public String getNullValueString()
      Since:
      2.6
    • usesQuoteChar

      public boolean usesQuoteChar()
    • usesEscapeChar

      public boolean usesEscapeChar()
    • hasArrayElementSeparator

      public boolean hasArrayElementSeparator()
      Since:
      2.5
    • getAnyPropertyName

      public String getAnyPropertyName()
      Since:
      2.7
    • iterator

      public Iterator<CsvSchema.Column> iterator()
      Specified by:
      iterator in interface Iterable<CsvSchema.Column>
    • size

      public int size()
      Accessor for finding out how many columns this schema defines.
      Returns:
      Number of columns this schema defines
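      Because CsvSchema implements Iterable<CsvSchema.Column>, columns can be walked with a for-each loop, and individual columns can be looked up by name. A small sketch, using Column accessors getIndex(), getName(), and getType() that are not documented in this section, with illustrative class and column names:

```java
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

class ColumnAccessSketch {
    static CsvSchema schema() {
        return CsvSchema.builder()
                .addColumn("id", CsvSchema.ColumnType.NUMBER)
                .addColumn("name")
                .build();
    }

    // Lists each column as index:name:type, in schema order.
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (CsvSchema.Column col : schema()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(col.getIndex()).append(':')
              .append(col.getName()).append(':')
              .append(col.getType());
        }
        return sb.toString();
    }

    // Lookup by name yields null when the schema has no such column.
    static boolean hasColumn(String name) {
        return schema().column(name) != null;
    }

    public static void main(String[] args) {
        System.out.println(summary());
        System.out.println(hasColumn("name") + " " + hasColumn("missing"));
    }
}
```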
    • column

      public CsvSchema.Column column(int index)
      Accessor for column at specified index (0-based); index having to be within
          0 <= index < size()
      
    • columnIndex

      public int columnIndex(String name)
      Method for finding index of a named column within this schema.
      Parameters:
      name - Name of column to find
      Returns:
      Index of the specified column, if one exists; -1 if not
      Since:
      2.18
    • columnName

      public String columnName(int index)
      Since:
      2.6
    • column

      public CsvSchema.Column column(String name)
    • column

      public CsvSchema.Column column(String name, int probableIndex)
      Optimized variant where a hint is given as to likely index of the column name.
      Since:
      2.6
    • getColumnNames

      public List<String> getColumnNames()
      Accessor for getting names of included columns, in the order they are included in the schema.
      Since:
      2.14
    • getColumnNames

      public Collection<String> getColumnNames(Collection<String> names)
      Accessor for getting names of included columns, added in given Collection.
      Since:
      2.14
    • getColumnDesc

      public String getColumnDesc()
      Method for getting description of column definitions in developer-readable form
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • _validArrayElementSeparator

      protected static String _validArrayElementSeparator(String sep)