Package com.opencsv

Class CSVParser

All Implemented Interfaces:
ICSVParser

public class CSVParser extends AbstractCSVParser

A very simple CSV parser released under a commercial-friendly license. This just implements splitting a single line into fields.

The purpose of the CSVParser is to take a single string and parse it into its elements based on the delimiter, quote and escape characters.

The CSVParser has grown organically based on user requests and does not truly match any current requirements (though it can be configured to match or come close). There are no plans to change this, as doing so would break existing requirements. Consider using the RFC4180Parser for less configurability but a closer match to the RFC4180 requirements.
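To make "splitting a single line into fields" concrete, here is a minimal, self-contained sketch of separator, quote, and escape handling. It is an illustration only and not the opencsv implementation; the class `LineSplitSketch` and its `split` method are hypothetical names, and the real parser has more options (strict quotes, leading white space, null handling) that are ignored here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the opencsv implementation.
class LineSplitSketch {

    // Splits one line into fields using the given separator, quote, and escape characters.
    static List<String> split(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                // An escape makes the next character literal.
                current.append(line.charAt(++i));
            } else if (c == quote) {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    // A doubled quote inside a quoted field is one literal quote.
                    current.append(quote);
                    i++;
                } else {
                    // Otherwise the quote opens or closes a quoted section.
                    inQuotes = !inQuotes;
                }
            } else if (c == separator && !inQuotes) {
                // An unquoted separator ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // The input line is: a,"b,c",d\,e
        System.out.println(split("a,\"b,c\",d\\,e", ',', '"', '\\'));
    }
}
```

In this sketch the quoted separator and the escaped separator both end up inside a field, so the sample line yields three fields rather than five.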

  • Field Details

    • BEGINNING_OF_LINE

      private static final int BEGINNING_OF_LINE
    • escape

      private final char escape
      This is the character that the CSVParser will treat as the escape character.
    • escapeAsString

      private final String escapeAsString
      The escape character as a String; an optimization for replaceAll.
    • escapeDoubleAsString

      private final String escapeDoubleAsString
      The String escapeAsString + escapeAsString (the doubled escape); an optimization for replaceAll.
    • strictQuotes

      private final boolean strictQuotes
      Determines if the field is between quotes (true) or between separators (false).
    • ignoreLeadingWhiteSpace

      private final boolean ignoreLeadingWhiteSpace
      Ignore any leading white space at the start of the field.
    • ignoreQuotations

      private final boolean ignoreQuotations
      Skip over quotation characters when parsing.
    • tokensOnLastCompleteLine

      private int tokensOnLastCompleteLine
    • inField

      private boolean inField
    • errorLocale

      private Locale errorLocale
      Locale for all translations.
  • Constructor Details

    • CSVParser

      public CSVParser()
      Constructs CSVParser using default values for everything.
    • CSVParser

      CSVParser(char separator, char quotechar, char escape, boolean strictQuotes, boolean ignoreLeadingWhiteSpace, boolean ignoreQuotations, CSVReaderNullFieldIndicator nullFieldIndicator, Locale errorLocale)
      Constructs CSVParser.

      This constructor sets all necessary parameters for CSVParser, and intentionally has package access so only the builder can use it.

      Parameters:
      separator - The delimiter to use for separating entries
      quotechar - The character to use for quoted elements
      escape - The character to use for escaping a separator or quote
      strictQuotes - If true, characters outside the quotes are ignored
      ignoreLeadingWhiteSpace - If true, white space in front of a quote in a field is ignored
      ignoreQuotations - If true, treat quotations like any other character.
      nullFieldIndicator - Which field content will be returned as null: EMPTY_SEPARATORS, EMPTY_QUOTES, BOTH, NEITHER (default)
      errorLocale - Locale for error messages.
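The four nullFieldIndicator values can be read as a small decision table over "was the empty field quoted or not". The sketch below assumes that EMPTY_SEPARATORS refers to an empty unquoted field and EMPTY_QUOTES to an empty quoted field. The enum `Indicator` is a hypothetical stand-in for CSVReaderNullFieldIndicator, and the two methods only mirror the names of the private helpers listed on this page; their bodies are an assumption, not the library's code.

```java
// Hypothetical stand-in for CSVReaderNullFieldIndicator; not the opencsv source.
class NullFieldSketch {

    enum Indicator { EMPTY_SEPARATORS, EMPTY_QUOTES, BOTH, NEITHER }

    // Decides whether an empty field should be reported as null.
    static boolean shouldConvertEmptyToNull(Indicator indicator, boolean fromQuotedField) {
        switch (indicator) {
            case BOTH:
                return true;
            case EMPTY_SEPARATORS:
                // Empty field with no quotes, e.g. the middle field of: a,,c
                return !fromQuotedField;
            case EMPTY_QUOTES:
                // Empty quoted field, e.g. the middle field of: a,"",c
                return fromQuotedField;
            default:
                // NEITHER: empty fields stay empty strings.
                return false;
        }
    }

    // Returns null for an empty field when the indicator asks for it, else the field itself.
    static String convertEmptyToNullIfNeeded(String s, Indicator indicator, boolean fromQuotedField) {
        if (s.isEmpty() && shouldConvertEmptyToNull(indicator, fromQuotedField)) {
            return null;
        }
        return s;
    }
}
```

Non-empty fields are never converted in this sketch, regardless of the indicator.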
  • Method Details

    • getEscape

      public char getEscape()
      Returns:
      The default escape character for this parser.
    • isStrictQuotes

      public boolean isStrictQuotes()
      Returns:
      The default strictQuotes setting for this parser.
    • isIgnoreLeadingWhiteSpace

      public boolean isIgnoreLeadingWhiteSpace()
      Returns:
      The default ignoreLeadingWhiteSpace setting for this parser.
    • isIgnoreQuotations

      public boolean isIgnoreQuotations()
      Returns:
      The default ignoreQuotations setting for this parser.
    • anyCharactersAreTheSame

      private boolean anyCharactersAreTheSame(char separator, char quotechar, char escape)
      Checks to see if any two of the three characters are the same. This is because in opencsv the separator, quote, and escape characters must be different.
      Parameters:
      separator - The defined separator character
      quotechar - The defined quotation character
      escape - The defined escape character
      Returns:
      True if any two of the three are the same.
    • isSameCharacter

      private boolean isSameCharacter(char c1, char c2)
      Checks that the two characters are the same and are not the defined NULL_CHARACTER.
      Parameters:
      c1 - First character
      c2 - Second character
      Returns:
      True if both characters are the same and are not the defined NULL_CHARACTER
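The two checks described above (isSameCharacter and anyCharactersAreTheSame) can be sketched as follows. This is an assumption-based illustration, not the library source: in particular, the value `'\0'` for NULL_CHARACTER is assumed here, and the class name `SameCharacterSketch` is hypothetical.

```java
// Hypothetical illustration of the duplicate-character check; not the opencsv source.
class SameCharacterSketch {

    // Assumed value of the "no character" marker.
    static final char NULL_CHARACTER = '\0';

    // True only if both characters are equal and are not NULL_CHARACTER.
    static boolean isSameCharacter(char c1, char c2) {
        return c1 != NULL_CHARACTER && c1 == c2;
    }

    // True if any two of the three configured characters collide.
    static boolean anyCharactersAreTheSame(char separator, char quotechar, char escape) {
        return isSameCharacter(separator, quotechar)
                || isSameCharacter(separator, escape)
                || isSameCharacter(quotechar, escape);
    }
}
```

Excluding NULL_CHARACTER means that two "unset" characters do not count as a collision, so, for example, a parser with neither a quote nor an escape character would still pass the check in this sketch.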
    • convertToCsvValue

      protected String convertToCsvValue(String value, boolean applyQuotestoAll)
      Description copied from class: AbstractCSVParser
      Used when reverse parsing an array of strings to a single string. Handles the application of quotes around the string and handling any quotes within the string.
      Specified by:
      convertToCsvValue in class AbstractCSVParser
      Parameters:
      value - String to be converted
      applyQuotestoAll - All values should be surrounded with quotes
      Returns:
      String that will go into the CSV string
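As a rough picture of what "application of quotes around the string and handling any quotes within the string" can look like, the sketch below doubles embedded quotes and wraps the value in quotes when asked to or when the value contains a separator, a quote, or a newline. That quoting rule is the sketch's own choice for illustration; the conditions used by the real method may differ. The class name and the extra `separator` and `quote` parameters are hypothetical.

```java
// Hypothetical illustration of reverse parsing a single value; not the opencsv source.
class CsvValueSketch {

    static String convertToCsvValue(String value, boolean applyQuotesToAll,
                                    char separator, char quote) {
        // Treat a missing value as an empty string for this sketch.
        String text = value == null ? "" : value;
        String q = String.valueOf(quote);

        // Assumed rule: quote when forced, or when the value would otherwise be ambiguous.
        boolean needsQuotes = applyQuotesToAll
                || text.indexOf(separator) >= 0
                || text.indexOf(quote) >= 0
                || text.contains("\n");

        // Embedded quotes are doubled so they survive a later parse.
        String escaped = text.replace(q, q + q);
        return needsQuotes ? q + escaped + q : escaped;
    }
}
```

With a comma separator and a double-quote character, the value `a,b` would come back wrapped in quotes, while `plain` is returned unchanged unless applyQuotesToAll is true.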
    • parseLine

      protected String[] parseLine(String nextLine, boolean multi) throws IOException
      Description copied from class: AbstractCSVParser
      Parses an incoming String and returns an array of elements.
      Specified by:
      parseLine in class AbstractCSVParser
      Parameters:
      nextLine - The string to parse
      multi - Whether it takes multiple lines to form a single record
      Returns:
      The list of elements, or null if nextLine is null
      Throws:
      IOException - If an error occurs while parsing the line
    • handleQuoteCharButNotStrictQuotes

      private void handleQuoteCharButNotStrictQuotes(String nextLine, CSVParser.StringFragmentCopier sfc)
    • handleEscapeCharacter

      private void handleEscapeCharacter(String nextLine, CSVParser.StringFragmentCopier sfc, boolean inQuotes)
    • convertEmptyToNullIfNeeded

      private String convertEmptyToNullIfNeeded(String s, boolean fromQuotedField)
    • shouldConvertEmptyToNull

      private boolean shouldConvertEmptyToNull(boolean fromQuotedField)
    • inQuotes

      private boolean inQuotes(boolean inQuotes)
      Determines if we can process as if we were in quotes.
      Parameters:
      inQuotes - Are we currently in quotes?
      Returns:
      True if we should process as if we are inside quotes.
    • isNextCharacterEscapedQuote

      private boolean isNextCharacterEscapedQuote(String nextLine, boolean inQuotes, int i)
      Checks to see if the character after the index is a quotation character. Precondition: the current character is a quote or an escape.
      Parameters:
      nextLine - The current line
      inQuotes - True if the current context is quoted
      i - Current index in line
      Returns:
      True if the following character is a quote
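A lookahead of this kind might be written as below. This is a sketch under assumptions: the requirement that inQuotes be true and the bounds check are inferred from the parameter list, and the `quotechar` parameter and class name are hypothetical additions so the example is self-contained.

```java
// Hypothetical illustration of the escaped-quote lookahead; not the opencsv source.
class EscapedQuoteSketch {

    static boolean isNextCharacterEscapedQuote(String nextLine, boolean inQuotes,
                                               int i, char quotechar) {
        return inQuotes                               // only meaningful inside quotes (assumed)
                && nextLine.length() > i + 1          // there is a next character
                && nextLine.charAt(i + 1) == quotechar;
    }
}
```

The bounds check comes before `charAt`, so calling the method with `i` at the last index returns false instead of throwing.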
    • isCharacterQuoteCharacter

      private boolean isCharacterQuoteCharacter(char c)
      Checks to see if the passed in character is the defined quotation character.
      Parameters:
      c - Source character
      Returns:
      True if c is the defined quotation character
    • isCharacterEscapeCharacter

      private boolean isCharacterEscapeCharacter(char c)
      Checks to see if the character is the defined escape character.
      Parameters:
      c - Source character
      Returns:
      True if the character is the defined escape character
    • isCharacterSeparator

      private boolean isCharacterSeparator(char c)
      Checks to see if the character is the defined separator.
      Parameters:
      c - Source character
      Returns:
      True if the character is the defined separator
    • isCharacterEscapable

      private boolean isCharacterEscapable(char c)
      Checks to see if the character passed in could be escapable. Escapable characters for opencsv are the quotation character, the escape character, and the separator.
      Parameters:
      c - Source character
      Returns:
      True if the character could be escapable.
    • isNextCharacterEscapable

      protected boolean isNextCharacterEscapable(String nextLine, boolean inQuotes, int i)
      Checks to see if the character after the current index in a String is an escapable character.

      That is, the next character is the quotation character, the escape character, or the separator, and the current position is inside quotes.

      "Inside quotes" in this context is interpreted liberally. For instance, if quotes are not expected but we are inside a field, that still counts for the purposes of this method as being "in quotes".

      Precondition: the current character is an escape.
      Parameters:
      nextLine - The current line
      inQuotes - True if the current context is quoted
      i - Current index in line
      Returns:
      True if the following character is escapable
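The relationship between isCharacterEscapable and isNextCharacterEscapable can be sketched as follows. The field names follow the constructor parameters documented on this page, but the class `EscapableSketch` and the method bodies are assumptions for illustration, not the library source.

```java
// Hypothetical illustration of the escapable-character checks; not the opencsv source.
class EscapableSketch {
    private final char separator;
    private final char quotechar;
    private final char escape;

    EscapableSketch(char separator, char quotechar, char escape) {
        this.separator = separator;
        this.quotechar = quotechar;
        this.escape = escape;
    }

    // Escapable characters: the quotation character, the escape character, and the separator.
    boolean isCharacterEscapable(char c) {
        return c == quotechar || c == escape || c == separator;
    }

    // Precondition (not checked here): nextLine.charAt(i) is the escape character.
    boolean isNextCharacterEscapable(String nextLine, boolean inQuotes, int i) {
        return inQuotes
                && nextLine.length() > i + 1
                && isCharacterEscapable(nextLine.charAt(i + 1));
    }
}
```

Because "inside quotes" is interpreted liberally, a caller would be expected to pass true for inQuotes whenever it is inside a field, not only after an opening quote.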
    • setErrorLocale

      public void setErrorLocale(Locale errorLocale)
      Description copied from interface: ICSVParser
      Sets the locale for all error messages.
      Parameters:
      errorLocale - Locale for error messages. If null, the default locale is used.
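The null handling described for errorLocale amounts to a fallback to the JVM default. A minimal sketch, assuming nothing beyond the parameter description above (the class `ErrorLocaleSketch` and its getter are hypothetical):

```java
import java.util.Locale;

// Hypothetical illustration of the null-to-default fallback; not the opencsv source.
class ErrorLocaleSketch {
    private Locale errorLocale = Locale.getDefault();

    void setErrorLocale(Locale errorLocale) {
        // A null argument falls back to the default locale.
        this.errorLocale = errorLocale != null ? errorLocale : Locale.getDefault();
    }

    Locale getErrorLocale() {
        return errorLocale;
    }
}
```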