Class CSVParser
All Implemented Interfaces: ICSVParser
A very simple CSV parser released under a commercial-friendly license. This just implements splitting a single line into fields.
The purpose of the CSVParser is to take a single string and parse it into its elements based on the delimiter, quote and escape characters.
The CSVParser has grown organically based on user requests and does not truly match any current requirements (though it can be configured to match or come close). There are no plans to change this, as doing so would break existing requirements. Consider using the RFC4180Parser, which is less configurable but a closer match to the RFC 4180 requirements.
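To make the idea of splitting one line on separator, quote, and escape characters concrete, here is a minimal sketch. It is not opencsv's implementation: the class name `SimpleLineSplitter` is hypothetical, and the choices to treat a doubled quote inside a quoted field as a literal quote and to let the escape character apply to any following character are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a tiny line splitter, not opencsv's CSVParser. */
final class SimpleLineSplitter {
    private final char separator;
    private final char quotechar;
    private final char escape;

    SimpleLineSplitter(char separator, char quotechar, char escape) {
        this.separator = separator;
        this.quotechar = quotechar;
        this.escape = escape;
    }

    /** Splits a single line into fields, honoring quotes and escapes. */
    List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                // Assumption: the escape character makes the next character literal.
                current.append(line.charAt(++i));
            } else if (c == quotechar) {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == quotechar) {
                    // Assumption: a doubled quote inside quotes is one literal quote.
                    current.append(quotechar);
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == separator && !inQuotes) {
                // An unquoted separator ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        SimpleLineSplitter splitter = new SimpleLineSplitter(',', '"', '\\');
        // Yields three fields: a | b,c | d,e
        System.out.println(splitter.split("a,\"b,c\",d\\,e"));
    }
}
```

The real parser layers further options on top of this basic loop, such as strictQuotes, ignoreLeadingWhiteSpace, and ignoreQuotations.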
-
Nested Class Summary
Nested Classes
private static class CSVParser.StringFragmentCopier: This class serves to optimize AbstractCSVParser.parseLine(java.lang.String), which is the hot inner loop of opencsv.
-
Field Summary
Fields
private static final int BEGINNING_OF_LINE
private Locale errorLocale: Locale for all translations.
private final char escape: This is the character that the CSVParser will treat as the escape character.
private final String escapeAsString: String of escape character - optimization for replaceAll
private final String escapeDoubleAsString: String escapeAsString+escapeAsString - optimization for replaceAll
private final boolean ignoreLeadingWhiteSpace: Ignore any leading white space at the start of the field.
private final boolean ignoreQuotations: Skip over quotation characters when parsing.
private boolean inField
private final boolean strictQuotes: Determines if the field is between quotes (true) or between separators (false).
private int tokensOnLastCompleteLine
Fields inherited from class com.opencsv.AbstractCSVParser
EMPTY_STRINGBUILDER, nullFieldIndicator, pending, quotechar, quotecharAsString, quoteDoubledAsString, quoteMatcherPattern, separator, separatorAsString, SPECIAL_REGEX_CHARS
Fields inherited from interface com.opencsv.ICSVParser
DEFAULT_BUNDLE_NAME, DEFAULT_ESCAPE_CHARACTER, DEFAULT_IGNORE_LEADING_WHITESPACE, DEFAULT_IGNORE_QUOTATIONS, DEFAULT_NULL_FIELD_INDICATOR, DEFAULT_QUOTE_CHARACTER, DEFAULT_SEPARATOR, DEFAULT_STRICT_QUOTES, INITIAL_READ_SIZE, MAX_SIZE_FOR_EMPTY_FIELD, NEWLINE, NULL_CHARACTER, READ_BUFFER_SIZE
-
Constructor Summary
Constructors
CSVParser(): Constructs CSVParser using default values for everything.
CSVParser(char separator, char quotechar, char escape, boolean strictQuotes, boolean ignoreLeadingWhiteSpace, boolean ignoreQuotations, CSVReaderNullFieldIndicator nullFieldIndicator, Locale errorLocale): Constructs CSVParser.
-
Method Summary
Methods
private boolean anyCharactersAreTheSame(char separator, char quotechar, char escape): Checks to see if any two of the three characters are the same.
private String convertEmptyToNullIfNeeded(String s, boolean fromQuotedField)
protected String convertToCsvValue(String value, boolean applyQuotestoAll): Used when reverse parsing an array of strings to a single string.
char getEscape()
private void handleEscapeCharacter(String nextLine, CSVParser.StringFragmentCopier sfc, boolean inQuotes)
private void handleQuoteCharButNotStrictQuotes
private boolean inQuotes(boolean inQuotes): Determines if we can process as if we were in quotes.
private boolean isCharacterEscapable(char c): Checks to see if the character passed in could be escapable.
private boolean isCharacterEscapeCharacter(char c): Checks to see if the character is the defined escape character.
private boolean isCharacterQuoteCharacter(char c): Checks to see if the passed in character is the defined quotation character.
private boolean isCharacterSeparator(char c): Checks to see if the character is the defined separator.
boolean isIgnoreLeadingWhiteSpace()
boolean isIgnoreQuotations()
protected boolean isNextCharacterEscapable(String nextLine, boolean inQuotes, int i): Checks to see if the character after the current index in a String is an escapable character.
private boolean isNextCharacterEscapedQuote(String nextLine, boolean inQuotes, int i): Checks to see if the character after the index is a quotation character.
private boolean isSameCharacter(char c1, char c2): Checks that the two characters are the same and are not the defined NULL_CHARACTER.
boolean isStrictQuotes()
protected String[] parseLine(String nextLine, boolean multi): Parses an incoming String and returns an array of elements.
void setErrorLocale(Locale errorLocale): Sets the locale for all error messages.
private boolean shouldConvertEmptyToNull(boolean fromQuotedField)
Methods inherited from class com.opencsv.AbstractCSVParser
convertToCsvValue, getPendingText, getQuotechar, getQuotecharAsString, getSeparator, getSeparatorAsString, isPending, isSurroundWithQuotes, nullFieldIndicator, parseLine, parseLineMulti, parseToLine, parseToLine
-
Field Details
-
BEGINNING_OF_LINE
private static final int BEGINNING_OF_LINE
-
escape
private final char escape
This is the character that the CSVParser will treat as the escape character.
-
escapeAsString
private final String escapeAsString
String of escape character - optimization for replaceAll
-
escapeDoubleAsString
private final String escapeDoubleAsString
String escapeAsString+escapeAsString - optimization for replaceAll
-
strictQuotes
private final boolean strictQuotes
Determines if the field is between quotes (true) or between separators (false).
-
ignoreLeadingWhiteSpace
private final boolean ignoreLeadingWhiteSpace
Ignore any leading white space at the start of the field.
-
ignoreQuotations
private final boolean ignoreQuotations
Skip over quotation characters when parsing.
-
tokensOnLastCompleteLine
private int tokensOnLastCompleteLine
-
inField
private boolean inField
-
errorLocale
private Locale errorLocale
Locale for all translations.
-
-
Constructor Details
-
CSVParser
public CSVParser()
Constructs CSVParser using default values for everything.
-
CSVParser
CSVParser(char separator, char quotechar, char escape, boolean strictQuotes, boolean ignoreLeadingWhiteSpace, boolean ignoreQuotations, CSVReaderNullFieldIndicator nullFieldIndicator, Locale errorLocale)
Constructs CSVParser. This constructor sets all necessary parameters for CSVParser, and intentionally has package access so only the builder can use it.
Parameters:
separator - The delimiter to use for separating entries
quotechar - The character to use for quoted elements
escape - The character to use for escaping a separator or quote
strictQuotes - If true, characters outside the quotes are ignored
ignoreLeadingWhiteSpace - If true, white space in front of a quote in a field is ignored
ignoreQuotations - If true, treat quotations like any other character.
nullFieldIndicator - Which field content will be returned as null: EMPTY_SEPARATORS, EMPTY_QUOTES, BOTH, NEITHER (default)
errorLocale - Locale for error messages.
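The four nullFieldIndicator options can be read as a small decision table. The sketch below shows one plausible reading and is not the library's code: the class `NullFieldSketch`, the enum `NullIndicator` (a stand-in for CSVReaderNullFieldIndicator), and the method `convertEmptyToNull` are all hypothetical names. It assumes EMPTY_SEPARATORS refers to an empty field between two separators and EMPTY_QUOTES to an empty pair of quotes.

```java
/** Illustrative only: one reading of the four documented null-field options. */
final class NullFieldSketch {
    // Hypothetical stand-in for CSVReaderNullFieldIndicator.
    enum NullIndicator { EMPTY_SEPARATORS, EMPTY_QUOTES, BOTH, NEITHER }

    /** Returns null for an empty field when the indicator asks for it. */
    static String convertEmptyToNull(String field, boolean fromQuotedField, NullIndicator indicator) {
        if (!field.isEmpty()) {
            return field; // Non-empty fields are never converted.
        }
        switch (indicator) {
            case BOTH:
                return null;
            case EMPTY_SEPARATORS:
                // Only an unquoted empty field (as in a,,b) becomes null.
                return fromQuotedField ? field : null;
            case EMPTY_QUOTES:
                // Only a quoted empty field (as in a,"",b) becomes null.
                return fromQuotedField ? null : field;
            default:
                return field; // NEITHER: empty fields stay empty strings.
        }
    }

    public static void main(String[] args) {
        System.out.println(convertEmptyToNull("", false, NullIndicator.EMPTY_SEPARATORS));
        System.out.println(convertEmptyToNull("", true, NullIndicator.EMPTY_SEPARATORS).isEmpty());
    }
}
```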
-
-
Method Details
-
getEscape
public char getEscape()
Returns: The escape character for this parser.
-
isStrictQuotes
public boolean isStrictQuotes()
Returns: The strictQuotes setting for this parser.
-
isIgnoreLeadingWhiteSpace
public boolean isIgnoreLeadingWhiteSpace()
Returns: The ignoreLeadingWhiteSpace setting for this parser.
-
isIgnoreQuotations
public boolean isIgnoreQuotations()
Returns: The ignoreQuotations setting for this parser.
-
anyCharactersAreTheSame
private boolean anyCharactersAreTheSame(char separator, char quotechar, char escape)
Checks to see if any two of the three characters are the same. This is because in opencsv the separator, quote, and escape characters must be different.
Parameters:
separator - The defined separator character
quotechar - The defined quotation character
escape - The defined escape character
Returns: True if any two of the three are the same.
-
isSameCharacter
private boolean isSameCharacter(char c1, char c2)
Checks that the two characters are the same and are not the defined NULL_CHARACTER.
Parameters:
c1 - First character
c2 - Second character
Returns: True if both characters are the same and are not the defined NULL_CHARACTER
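The two checks above compose naturally: the three-way comparison is three pairwise comparisons, each of which ignores the null character so that "unset" characters never collide. A minimal sketch follows, assuming NULL_CHARACTER is '\0'. The class name `CharacterCheckSketch` is hypothetical and the bodies are a guess at the logic described, not the library's source.

```java
/** Illustrative only: pairwise character checks as described above. */
final class CharacterCheckSketch {
    // Assumed value standing in for ICSVParser.NULL_CHARACTER.
    static final char NULL_CHARACTER = '\0';

    /** True if both characters are equal and neither is the null character. */
    static boolean isSameCharacter(char c1, char c2) {
        return c1 != NULL_CHARACTER && c1 == c2;
    }

    /** True if any two of the three configured characters collide. */
    static boolean anyCharactersAreTheSame(char separator, char quotechar, char escape) {
        return isSameCharacter(separator, quotechar)
                || isSameCharacter(separator, escape)
                || isSameCharacter(quotechar, escape);
    }

    public static void main(String[] args) {
        System.out.println(anyCharactersAreTheSame(',', '"', '\\'));
        System.out.println(anyCharactersAreTheSame(',', ',', '\\'));
    }
}
```

Under this reading, a configuration that disables both the quote and escape characters by setting them to the null character would not count as a collision.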
-
convertToCsvValue
protected String convertToCsvValue(String value, boolean applyQuotestoAll)
Description copied from class: AbstractCSVParser
Used when reverse parsing an array of strings to a single string. Applies quotes around the string and handles any quotes within the string.
Specified by: convertToCsvValue in class AbstractCSVParser
Parameters:
value - String to be converted
applyQuotestoAll - All values should be surrounded with quotes
Returns: String that will go into the CSV string
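As a rough illustration of "reverse parsing" a value, the sketch below surrounds a value with quotes when asked to, or when the value contains a character that would otherwise break the field. It is not the library's implementation: the class name `CsvValueSketch`, the fixed separator and quote characters, the rule for when quoting is needed, and the convention of doubling embedded quotes are all assumptions.

```java
/** Illustrative only: turning one value back into CSV text. */
final class CsvValueSketch {
    private static final char SEPARATOR = ',';
    private static final char QUOTE = '"';

    static String convertToCsvValue(String value, boolean applyQuotesToAll) {
        // Assumption: quoting is needed if the value holds a quote, separator, or newline.
        boolean needsQuotes = applyQuotesToAll
                || value.indexOf(QUOTE) >= 0
                || value.indexOf(SEPARATOR) >= 0
                || value.indexOf('\n') >= 0;
        if (!needsQuotes) {
            return value;
        }
        // Assumption: embedded quotes are written as two quotes.
        String doubled = value.replace("\"", "\"\"");
        return QUOTE + doubled + QUOTE;
    }

    public static void main(String[] args) {
        System.out.println(convertToCsvValue("plain", false));
        System.out.println(convertToCsvValue("a,b", false));
        System.out.println(convertToCsvValue("say \"hi\"", false));
    }
}
```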
-
parseLine
protected String[] parseLine(String nextLine, boolean multi) throws IOException
Description copied from class: AbstractCSVParser
Parses an incoming String and returns an array of elements.
Specified by: parseLine in class AbstractCSVParser
Parameters:
nextLine - The string to parse
multi - Whether it takes multiple lines to form a single record
Returns: The list of elements, or null if nextLine is null
Throws: IOException - If bad things happen during the read
-
handleQuoteCharButNotStrictQuotes
-
handleEscapeCharacter
private void handleEscapeCharacter(String nextLine, CSVParser.StringFragmentCopier sfc, boolean inQuotes) -
convertEmptyToNullIfNeeded
private String convertEmptyToNullIfNeeded(String s, boolean fromQuotedField)
-
shouldConvertEmptyToNull
private boolean shouldConvertEmptyToNull(boolean fromQuotedField) -
inQuotes
private boolean inQuotes(boolean inQuotes)
Determines if we can process as if we were in quotes.
Parameters:
inQuotes - Are we currently in quotes?
Returns: True if we should process as if we are inside quotes.
-
isNextCharacterEscapedQuote
private boolean isNextCharacterEscapedQuote(String nextLine, boolean inQuotes, int i)
Checks to see if the character after the index is a quotation character. Precondition: the current character is a quote or an escape.
Parameters:
nextLine - The current line
inQuotes - True if the current context is quoted
i - Current index in line
Returns: True if the following character is a quote
-
isCharacterQuoteCharacter
private boolean isCharacterQuoteCharacter(char c)
Checks to see if the passed in character is the defined quotation character.
Parameters:
c - Source character
Returns: True if c is the defined quotation character
-
isCharacterEscapeCharacter
private boolean isCharacterEscapeCharacter(char c)
Checks to see if the character is the defined escape character.
Parameters:
c - Source character
Returns: True if the character is the defined escape character
-
isCharacterSeparator
private boolean isCharacterSeparator(char c)
Checks to see if the character is the defined separator.
Parameters:
c - Source character
Returns: True if the character is the defined separator
-
isCharacterEscapable
private boolean isCharacterEscapable(char c)
Checks to see if the character passed in could be escapable. Escapable characters for opencsv are the quotation character, the escape character, and the separator.
Parameters:
c - Source character
Returns: True if the character could be escapable.
-
isNextCharacterEscapable
protected boolean isNextCharacterEscapable(String nextLine, boolean inQuotes, int i)
Checks to see if the character after the current index in a String is an escapable character. That is, the next character is a quotation character, the escape character, or the separator, and you are inside quotes.
"Inside quotes" in this context is interpreted liberally. For instance, if quotes are not expected but we are inside a field, that still counts for the purposes of this method as being "in quotes".
Precondition: the current character is an escape.
Parameters:
nextLine - The current line
inQuotes - True if the current context is quoted
i - Current index in line
Returns: True if the following character is escapable
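The escapability checks described above can be sketched as follows. This is a guess at the described logic, not the library's source: the class name `EscapableSketch` and the fixed separator, quote, and escape characters are assumptions, and the caller is expected to pass an `inQuotes` value that already reflects the liberal interpretation.

```java
/** Illustrative only: the escapability checks described above. */
final class EscapableSketch {
    // Assumed configuration, fixed here for the sake of the example.
    private final char separator = ',';
    private final char quotechar = '"';
    private final char escape = '\\';

    /** Escapable characters are the quote, the escape, and the separator. */
    boolean isCharacterEscapable(char c) {
        return c == quotechar || c == escape || c == separator;
    }

    /** Precondition: the character at index i is the escape character. */
    boolean isNextCharacterEscapable(String nextLine, boolean inQuotes, int i) {
        return inQuotes
                && nextLine.length() > i + 1 // there must be a next character
                && isCharacterEscapable(nextLine.charAt(i + 1));
    }

    public static void main(String[] args) {
        EscapableSketch sketch = new EscapableSketch();
        // Index 1 holds the escape character in both lines.
        System.out.println(sketch.isNextCharacterEscapable("a\\,b", true, 1));
        System.out.println(sketch.isNextCharacterEscapable("a\\nb", true, 1));
    }
}
```

The length check matters because an escape character at the very end of a line has nothing after it to escape.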
-
setErrorLocale
public void setErrorLocale(Locale errorLocale)
Description copied from interface: ICSVParser
Sets the locale for all error messages.
Parameters:
errorLocale - Locale for error messages. If null, the default locale is used.
-