Package com.opencsv

Class CSVParser

  • All Implemented Interfaces:
    ICSVParser

    public class CSVParser
    extends AbstractCSVParser

    A very simple CSV parser released under a commercial-friendly license. This just implements splitting a single line into fields.

    The purpose of the CSVParser is to take a single string and parse it into its elements based on the delimiter, quote and escape characters.

    The CSVParser has grown organically based on user requests and does not strictly match any one set of requirements (though it can be configured to match or come close). There are no plans to change this, because doing so would break existing behavior that users rely on. Consider using the RFC4180Parser, which is less configurable but matches the RFC 4180 requirements more closely.
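
    As an illustration of what splitting a single line on the separator, quote, and escape characters involves, here is a minimal sketch. SimpleLineSplitter is a hypothetical class written for this example and is not opencsv code. It deliberately ignores strictQuotes, leading white space handling, doubled quotes, null field indicators, and multi-line records, and it treats any character that follows the escape as literal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the opencsv implementation.
final class SimpleLineSplitter {
    private final char separator;
    private final char quote;
    private final char escape;

    SimpleLineSplitter(char separator, char quote, char escape) {
        this.separator = separator;
        this.quote = quote;
        this.escape = escape;
    }

    /** Splits one line into fields; returns null for null input. */
    String[] split(String line) {
        if (line == null) {
            return null;
        }
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                // Copy the escaped character literally and skip past it.
                current.append(line.charAt(++i));
            } else if (c == quote) {
                // Toggle the quoted state; the quote itself is not kept.
                inQuotes = !inQuotes;
            } else if (c == separator && !inQuotes) {
                // An unquoted separator ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields.toArray(new String[0]);
    }
}
```

    With separator ',', quote '"', and escape '\', the input a,"b,c",d\,e yields the three fields a, b,c, and d,e under this sketch.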

    • Field Detail

      • escape

        private final char escape
        This is the character that the CSVParser will treat as the escape character.
      • escapeAsString

        private final java.lang.String escapeAsString
        The escape character as a String, cached as an optimization for replaceAll.
      • escapeDoubleAsString

        private final java.lang.String escapeDoubleAsString
        The doubled escape character (escapeAsString + escapeAsString) as a String, cached as an optimization for replaceAll.
      • strictQuotes

        private final boolean strictQuotes
        Determines if the field is between quotes (true) or between separators (false).
      • ignoreLeadingWhiteSpace

        private final boolean ignoreLeadingWhiteSpace
        Ignore any leading white space at the start of the field.
      • ignoreQuotations

        private final boolean ignoreQuotations
        Skip over quotation characters when parsing.
      • tokensOnLastCompleteLine

        private int tokensOnLastCompleteLine
      • inField

        private boolean inField
      • errorLocale

        private java.util.Locale errorLocale
        Locale for all translations.
    • Constructor Detail

      • CSVParser

        public CSVParser()
        Constructs CSVParser using default values for everything.
      • CSVParser

        CSVParser(char separator,
                  char quotechar,
                  char escape,
                  boolean strictQuotes,
                  boolean ignoreLeadingWhiteSpace,
                  boolean ignoreQuotations,
                  CSVReaderNullFieldIndicator nullFieldIndicator,
                  java.util.Locale errorLocale)
        Constructs CSVParser.

        This constructor sets all necessary parameters for CSVParser, and intentionally has package access so only the builder can use it.

        Parameters:
        separator - The delimiter to use for separating entries
        quotechar - The character to use for quoted elements
        escape - The character to use for escaping a separator or quote
        strictQuotes - If true, characters outside the quotes are ignored
        ignoreLeadingWhiteSpace - If true, white space in front of a quote in a field is ignored
        ignoreQuotations - If true, treat quotations like any other character.
        nullFieldIndicator - Which field content will be returned as null: EMPTY_SEPARATORS, EMPTY_QUOTES, BOTH, NEITHER (default)
        errorLocale - Locale for error messages.
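
        The four nullFieldIndicator values can be read as a decision about whether an empty field becomes null. The sketch below assumes that EMPTY_SEPARATORS applies to empty unquoted fields and EMPTY_QUOTES to empty quoted fields. NullIndicator and NullFieldSketch are hypothetical stand-ins written for this example, not the library's classes, and the method names only mirror the private helpers listed under Method Detail.

```java
// Hypothetical stand-in for CSVReaderNullFieldIndicator.
enum NullIndicator { EMPTY_SEPARATORS, EMPTY_QUOTES, BOTH, NEITHER }

// Hypothetical illustration only; this is not the opencsv implementation.
final class NullFieldSketch {
    private final NullIndicator indicator;

    NullFieldSketch(NullIndicator indicator) {
        this.indicator = indicator;
    }

    /** Decides whether an empty field from the given context maps to null. */
    boolean shouldConvertEmptyToNull(boolean fromQuotedField) {
        switch (indicator) {
            case BOTH:
                return true;
            case EMPTY_SEPARATORS:
                // Assumed: only empty unquoted fields (two adjacent separators).
                return !fromQuotedField;
            case EMPTY_QUOTES:
                // Assumed: only empty quoted fields (two adjacent quotes).
                return fromQuotedField;
            default:
                // NEITHER: empty fields stay empty strings.
                return false;
        }
    }

    /** Returns null for an empty field when the indicator asks for it. */
    String convertEmptyToNullIfNeeded(String s, boolean fromQuotedField) {
        return (s.isEmpty() && shouldConvertEmptyToNull(fromQuotedField)) ? null : s;
    }
}
```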
    • Method Detail

      • getEscape

        public char getEscape()
        Returns:
        The default escape character for this parser.
      • isStrictQuotes

        public boolean isStrictQuotes()
        Returns:
        The default strictQuotes setting for this parser.
      • isIgnoreLeadingWhiteSpace

        public boolean isIgnoreLeadingWhiteSpace()
        Returns:
        The default ignoreLeadingWhiteSpace setting for this parser.
      • isIgnoreQuotations

        public boolean isIgnoreQuotations()
        Returns:
        The default ignoreQuotations setting for this parser.
      • anyCharactersAreTheSame

        private boolean anyCharactersAreTheSame(char separator,
                                                char quotechar,
                                                char escape)
        Checks to see if any two of the three characters are the same. This is because in opencsv the separator, quote, and escape characters must be different.
        Parameters:
        separator - The defined separator character
        quotechar - The defined quotation character
        escape - The defined escape character
        Returns:
        True if any two of the three are the same.
      • isSameCharacter

        private boolean isSameCharacter(char c1,
                                        char c2)
        Checks that the two characters are the same and are not the defined NULL_CHARACTER.
        Parameters:
        c1 - First character
        c2 - Second character
        Returns:
        True if both characters are the same and are not the defined NULL_CHARACTER
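
        The two checks described for anyCharactersAreTheSame and isSameCharacter can be sketched as follows. DelimiterCheck is a hypothetical class for this example, and the value '\0' for NULL_CHARACTER is an assumption meaning "no character configured".

```java
// Hypothetical illustration only; this is not the opencsv implementation.
final class DelimiterCheck {
    // Assumed sentinel for "no character configured".
    static final char NULL_CHARACTER = '\0';

    /** True if both characters are equal and neither is the sentinel. */
    static boolean isSameCharacter(char c1, char c2) {
        return c1 != NULL_CHARACTER && c1 == c2;
    }

    /** True if any two of the three configured characters collide. */
    static boolean anyCharactersAreTheSame(char separator, char quotechar, char escape) {
        return isSameCharacter(separator, quotechar)
                || isSameCharacter(separator, escape)
                || isSameCharacter(quotechar, escape);
    }
}
```

        Under this sketch, leaving both the quote and escape characters unset does not count as a collision, because the sentinel is excluded from the comparison.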
      • convertToCsvValue

        protected java.lang.String convertToCsvValue(java.lang.String value,
                                                     boolean applyQuotestoAll)
        Description copied from class: AbstractCSVParser
        Used when reverse parsing an array of strings to a single string. Handles the application of quotes around the string and handling any quotes within the string.
        Specified by:
        convertToCsvValue in class AbstractCSVParser
        Parameters:
        value - String to be converted
        applyQuotestoAll - All values should be surrounded with quotes
        Returns:
        String that will go into the CSV string
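
        A minimal sketch of this kind of reverse conversion follows. It assumes that quotes inside the value are doubled, that escape characters are doubled, and that a value is surrounded with quotes when it contains a separator, a quote, or a newline. CsvValueWriterSketch and these rules are assumptions made for the example; the actual method may handle nulls and quoting decisions differently.

```java
// Hypothetical illustration only; this is not the opencsv implementation.
final class CsvValueWriterSketch {
    private final char separator = ',';
    private final char quotechar = '"';
    private final char escape = '\\';

    // String forms are built once so each call can reuse them.
    private final String quoteAsString = String.valueOf(quotechar);
    private final String quoteDoubledAsString = quoteAsString + quoteAsString;
    private final String escapeAsString = String.valueOf(escape);
    private final String escapeDoubleAsString = escapeAsString + escapeAsString;

    /** Converts one non-null field value into its CSV text form. */
    String convertToCsvValue(String value, boolean applyQuotesToAll) {
        boolean needsQuotes = applyQuotesToAll
                || value.indexOf(separator) >= 0
                || value.indexOf(quotechar) >= 0
                || value.indexOf('\n') >= 0;
        // Double escapes first, then quotes, so neither is misread on parsing.
        String converted = value
                .replace(escapeAsString, escapeDoubleAsString)
                .replace(quoteAsString, quoteDoubledAsString);
        return needsQuotes ? quotechar + converted + quotechar : converted;
    }
}
```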
      • parseLine

        protected java.lang.String[] parseLine(java.lang.String nextLine,
                                               boolean multi)
                                        throws java.io.IOException
        Description copied from class: AbstractCSVParser
        Parses an incoming String and returns an array of elements.
        Specified by:
        parseLine in class AbstractCSVParser
        Parameters:
        nextLine - The string to parse
        multi - Whether it takes multiple lines to form a single record
        Returns:
        The list of elements, or null if nextLine is null
        Throws:
        java.io.IOException - If an error occurs while parsing the line
      • handleQuoteCharButNotStrictQuotes

        private void handleQuoteCharButNotStrictQuotes(java.lang.String nextLine,
                                                       CSVParser.StringFragmentCopier sfc)
      • convertEmptyToNullIfNeeded

        private java.lang.String convertEmptyToNullIfNeeded(java.lang.String s,
                                                            boolean fromQuotedField)
      • shouldConvertEmptyToNull

        private boolean shouldConvertEmptyToNull(boolean fromQuotedField)
      • inQuotes

        private boolean inQuotes(boolean inQuotes)
        Determines if we can process as if we were in quotes.
        Parameters:
        inQuotes - Are we currently in quotes?
        Returns:
        True if we should process as if we are inside quotes.
      • isNextCharacterEscapedQuote

        private boolean isNextCharacterEscapedQuote(java.lang.String nextLine,
                                                    boolean inQuotes,
                                                    int i)
        Checks to see if the character after the index is a quotation character. Precondition: the current character is a quote or an escape.
        Parameters:
        nextLine - The current line
        inQuotes - True if the current context is quoted
        i - Current index in line
        Returns:
        True if the following character is a quote
      • isCharacterQuoteCharacter

        private boolean isCharacterQuoteCharacter(char c)
        Checks to see if the passed in character is the defined quotation character.
        Parameters:
        c - Source character
        Returns:
        True if c is the defined quotation character
      • isCharacterEscapeCharacter

        private boolean isCharacterEscapeCharacter(char c)
        Checks to see if the character is the defined escape character.
        Parameters:
        c - Source character
        Returns:
        True if the character is the defined escape character
      • isCharacterSeparator

        private boolean isCharacterSeparator(char c)
        Checks to see if the character is the defined separator.
        Parameters:
        c - Source character
        Returns:
        True if the character is the defined separator
      • isCharacterEscapable

        private boolean isCharacterEscapable(char c)
        Checks to see if the character passed in could be escapable. Escapable characters for opencsv are the quotation character, the escape character, and the separator.
        Parameters:
        c - Source character
        Returns:
        True if the character could be escapable.
      • isNextCharacterEscapable

        protected boolean isNextCharacterEscapable(java.lang.String nextLine,
                                                   boolean inQuotes,
                                                   int i)
        Checks to see if the character after the current index in a String is an escapable character.

        That is, the next character is a quotation character, the escape character, or the separator, and the parser is inside quotes.

        "Inside quotes" in this context is interpreted liberally. For instance, if quotes are not expected but we are inside a field, that still counts for the purposes of this method as being "in quotes".

        Precondition: the current character is an escape.
        Parameters:
        nextLine - The current line
        inQuotes - True if the current context is quoted
        i - Current index in line
        Returns:
        True if the following character is escapable
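
        The lookahead described for isNextCharacterEscapable can be sketched as below. EscapeLookahead is a hypothetical class for this example, and the sketch assumes the caller has already applied the liberal interpretation of "inside quotes" before passing inQuotes.

```java
// Hypothetical illustration only; this is not the opencsv implementation.
final class EscapeLookahead {
    private final char separator;
    private final char quotechar;
    private final char escape;

    EscapeLookahead(char separator, char quotechar, char escape) {
        this.separator = separator;
        this.quotechar = quotechar;
        this.escape = escape;
    }

    /** True for the quote, escape, and separator characters. */
    boolean isCharacterEscapable(char c) {
        return c == quotechar || c == escape || c == separator;
    }

    /**
     * True if a character exists after index i and it is escapable.
     * Precondition: nextLine.charAt(i) is the escape character.
     */
    boolean isNextCharacterEscapable(String nextLine, boolean inQuotes, int i) {
        return inQuotes
                && nextLine.length() > i + 1
                && isCharacterEscapable(nextLine.charAt(i + 1));
    }
}
```

        The length check comes before the charAt call, so an escape character at the very end of the line returns false instead of throwing.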
      • setErrorLocale

        public void setErrorLocale(java.util.Locale errorLocale)
        Description copied from interface: ICSVParser
        Sets the locale for all error messages.
        Parameters:
        errorLocale - Locale for error messages. If null, the default locale is used.