Class CsvEscape


  • public final class CsvEscape
    extends java.lang.Object

    Utility class for performing CSV escape/unescape operations.

    Features

    Specific features of the CSV escape/unescape operations performed by means of this class:

    • Works according to the rules specified in RFC4180 (there is no CSV standard as such).
    • Encloses escaped values in double-quotes ("value") if they contain any non-alphanumeric characters.
    • Escapes double-quote characters (") by writing them twice: "".
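The rules above can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: the class `CsvEscapeRulesSketch` and its methods are hypothetical names, and it assumes "alphanumeric" means ASCII letters and digits only.

```java
final class CsvEscapeRulesSketch {

    // Assumption of this sketch: only ASCII letters and digits count as alphanumeric.
    static boolean isAsciiAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    static String escape(String text) {
        if (text == null) {
            return null;
        }
        // First pass: decide whether the value needs enclosing quotes at all.
        boolean needsQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            if (!isAsciiAlphanumeric(text.charAt(i))) {
                needsQuotes = true;
                break;
            }
        }
        if (!needsQuotes) {
            return text;
        }
        // Second pass: enclose in double-quotes and write each inner quote twice.
        StringBuilder sb = new StringBuilder(text.length() + 8);
        sb.append('"');
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                sb.append('"');
            }
            sb.append(c);
        }
        sb.append('"');
        return sb.toString();
    }
}
```

Under these assumptions, `abc123` is left as-is, `a,b` becomes `"a,b"`, and `say "hi"` becomes `"say ""hi"""`.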
    Input/Output

    There are four different input/output modes that can be used in escape/unescape operations:

    • String input, String output: Input is specified as a String object and output is returned as another. In order to improve memory performance, all escape and unescape operations will return the exact same input object as output if no escape/unescape modifications are required.
    • String input, java.io.Writer output: Input will be read from a String and output will be written into the specified java.io.Writer.
    • java.io.Reader input, java.io.Writer output: Input will be read from a Reader and output will be written into the specified java.io.Writer.
    • char[] input, java.io.Writer output: Input will be read from a char array (char[]) and output will be written into the specified java.io.Writer. Two int arguments called offset and len will be used for specifying the part of the char[] that should be escaped/unescaped. These methods should be called with offset = 0 and len = text.length in order to process the whole char[].
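The "same input object" behaviour described for the String-to-String mode can be achieved by allocating an output buffer only once a change is known to be needed. The following is a sketch of that technique under stated assumptions, not the library's code: `CsvSameInstanceSketch` is a hypothetical name, and alphanumeric is taken to mean ASCII letters and digits.

```java
final class CsvSameInstanceSketch {

    // Assumption of this sketch: only ASCII letters and digits count as alphanumeric.
    static boolean isAsciiAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    static String escape(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = null; // stays null until the first character that forces quoting
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (out == null && !isAsciiAlphanumeric(c)) {
                // Lazily allocate, open the quotes, and copy everything scanned so far.
                out = new StringBuilder(text.length() + 8);
                out.append('"').append(text, 0, i);
            }
            if (out != null) {
                if (c == '"') {
                    out.append('"'); // double-quotes are written twice
                }
                out.append(c);
            }
        }
        // No buffer means no modification was required: hand back the input instance.
        return out == null ? text : out.append('"').toString();
    }
}
```

A caller can then use a reference comparison (`escaped == text`) to tell whether any escaping took place.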
    Specific instructions for Microsoft Excel-compatible files

    In order for Microsoft Excel to correctly open a CSV file (including field values with line breaks), these rules should be followed:

    • Separate fields with a comma (,) in English-language setups, and a semicolon (;) in non-English-language setups (this depends on the language of the MS Excel installation you intend your files to be opened in).
    • Separate records with Windows-style line breaks (\r\n, U+000D + U+000A).
    • Enclose field values in double-quotes (") if they contain any non-alphanumeric characters.
    • Don't leave any whitespace between the field separator (;) and the enclosing quotes (").
    • Escape double-quote characters (") inside field values with two double-quotes ("").
    • Use \n (U+000A, unix-style line breaks) for line breaks inside field values, even if records are separated with Windows-style line breaks (\r\n) [ EXCEL 2003 compatibility ].
    • Open CSV files in Excel with File -> Open..., not with Data -> Import... The latter option will not correctly understand line breaks inside field values (up to Excel 2010).

    (Note that unbescape escapes field values only: it takes care of enclosing values in double-quotes, using unix-style line breaks inside values, etc. Separating fields (e.g. with ;), delimiting records (e.g. with \r\n) and using the correct character encoding when writing CSV files are the responsibility of the application calling unbescape.)

    The format described for Excel is also supported by OpenOffice.org Calc (File -> Open...) and Google Spreadsheets (File -> Import...).
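To make the division of responsibilities concrete, here is a sketch of how a calling application might assemble one Excel-compatible record. Everything in it is hypothetical: `ExcelCsvRecordSketch`, `escapeField` and `record` are illustrative names, and `escapeField` is a stand-in for the escaping step (it treats only ASCII letters and digits as alphanumeric, and converting \r\n to \n inside values is a choice of this sketch).

```java
import java.util.Arrays;
import java.util.List;

final class ExcelCsvRecordSketch {

    // Stand-in for the field-escaping step; not the library's implementation.
    static String escapeField(String value) {
        if (value == null) {
            return "";
        }
        // Sketch choice: use unix-style line breaks inside field values.
        String normalized = value.replace("\r\n", "\n");
        boolean needsQuotes = false;
        for (int i = 0; i < normalized.length(); i++) {
            char c = normalized.charAt(i);
            boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
            if (!alnum) {
                needsQuotes = true;
                break;
            }
        }
        if (!needsQuotes) {
            return normalized;
        }
        return '"' + normalized.replace("\"", "\"\"") + '"';
    }

    // The application's part: field separator, no padding whitespace, \r\n record delimiter.
    static String record(List<String> fields, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(escapeField(fields.get(i)));
        }
        return sb.append("\r\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(record(Arrays.asList("Smith", "New York", "He said \"hi\""), ';'));
    }
}
```

Choosing the character encoding remains a separate step, made when the resulting text is written to a file or stream.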

    Since:
    1.0.0
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      private CsvEscape()  
    • Method Summary

      Modifier and Type Method Description
      static void escapeCsv(char[] text, int offset, int len, java.io.Writer writer)
      Perform a CSV escape operation on a char[] input.
      static void escapeCsv(java.io.Reader reader, java.io.Writer writer)
      Perform a CSV escape operation on a Reader input, writing results to a Writer.
      static java.lang.String escapeCsv(java.lang.String text)
      Perform a CSV escape operation on a String input.
      static void escapeCsv(java.lang.String text, java.io.Writer writer)
      Perform a CSV escape operation on a String input, writing results to a Writer.
      static void unescapeCsv(char[] text, int offset, int len, java.io.Writer writer)
      Perform a CSV unescape operation on a char[] input.
      static void unescapeCsv(java.io.Reader reader, java.io.Writer writer)
      Perform a CSV unescape operation on a Reader input, writing results to a Writer.
      static java.lang.String unescapeCsv(java.lang.String text)
      Perform a CSV unescape operation on a String input.
      static void unescapeCsv(java.lang.String text, java.io.Writer writer)
      Perform a CSV unescape operation on a String input, writing results to a Writer.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • CsvEscape

        private CsvEscape()
    • Method Detail

      • escapeCsv

        public static java.lang.String escapeCsv(java.lang.String text)

        Perform a CSV escape operation on a String input.

        This method is thread-safe.

        Parameters:
        text - the String to be escaped.
        Returns:
        The escaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no escaping modifications were required (and no additional String objects will be created during processing). Will return null if input is null.
      • escapeCsv

        public static void escapeCsv(java.lang.String text,
                                     java.io.Writer writer)
                              throws java.io.IOException

        Perform a CSV escape operation on a String input, writing results to a Writer.

        This method is thread-safe.

        Parameters:
        text - the String to be escaped.
        writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if input is null.
        Throws:
        java.io.IOException - if an input/output exception occurs
        Since:
        1.1.2
      • escapeCsv

        public static void escapeCsv(java.io.Reader reader,
                                     java.io.Writer writer)
                              throws java.io.IOException

        Perform a CSV escape operation on a Reader input, writing results to a Writer.

        This method is thread-safe.

        Parameters:
        reader - the Reader reading the text to be escaped.
        writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if input is null.
        Throws:
        java.io.IOException - if an input/output exception occurs
        Since:
        1.1.2
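A Reader-to-Writer escape can be sketched as follows. Because the decision to emit the opening quote depends on characters that may appear later in the input, this sketch buffers the whole value before writing; whether the library buffers in the same way is not stated here, so treat that as an assumption. `CsvReaderWriterSketch` and `escapeToString` are hypothetical names, and alphanumeric is taken to mean ASCII letters and digits.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class CsvReaderWriterSketch {

    static boolean isAsciiAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    static void escape(Reader reader, Writer writer) throws IOException {
        if (reader == null) {
            return; // nothing is written for null input
        }
        StringBuilder value = new StringBuilder();
        boolean needsQuotes = false;
        char[] buffer = new char[1024];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buffer[i];
                if (!isAsciiAlphanumeric(c)) {
                    needsQuotes = true;
                }
                if (c == '"') {
                    value.append('"'); // double-quotes are written twice
                }
                value.append(c);
            }
        }
        if (needsQuotes) {
            writer.write('"');
        }
        writer.write(value.toString());
        if (needsQuotes) {
            writer.write('"');
        }
    }

    // Convenience wrapper for demonstration only.
    static String escapeToString(String input) {
        StringWriter out = new StringWriter();
        try {
            escape(input == null ? null : new StringReader(input), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```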
      • escapeCsv

        public static void escapeCsv(char[] text,
                                     int offset,
                                     int len,
                                     java.io.Writer writer)
                              throws java.io.IOException

        Perform a CSV escape operation on a char[] input.

        This method is thread-safe.

        Parameters:
        text - the char[] to be escaped.
        offset - the position in text at which the escape operation should start.
        len - the number of characters in text that should be escaped.
        writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if input is null.
        Throws:
        java.io.IOException - if an input/output exception occurs
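The role of offset and len is easiest to see when escaping one slice of a larger buffer. The sketch below mirrors the shape of this signature but is not the library's code: `CsvCharArraySketch` and `escapeToString` are hypothetical names, and alphanumeric is taken to mean ASCII letters and digits.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class CsvCharArraySketch {

    static boolean isAsciiAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    static void escape(char[] text, int offset, int len, Writer writer) throws IOException {
        if (text == null) {
            return; // nothing is written for null input
        }
        int end = offset + len; // exclusive end of the slice to process
        boolean needsQuotes = false;
        for (int i = offset; i < end; i++) {
            if (!isAsciiAlphanumeric(text[i])) {
                needsQuotes = true;
                break;
            }
        }
        if (!needsQuotes) {
            writer.write(text, offset, len);
            return;
        }
        writer.write('"');
        for (int i = offset; i < end; i++) {
            if (text[i] == '"') {
                writer.write('"'); // double-quotes are written twice
            }
            writer.write(text[i]);
        }
        writer.write('"');
    }

    // Convenience wrapper for demonstration only.
    static String escapeToString(char[] text, int offset, int len) {
        StringWriter out = new StringWriter();
        try {
            escape(text, offset, len, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

For a buffer holding `id,first name,age`, offset 3 with len 10 selects `first name`, while offset 0 with len equal to the array length processes the whole array as a single value.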
      • unescapeCsv

        public static java.lang.String unescapeCsv(java.lang.String text)

        Perform a CSV unescape operation on a String input.

        This method is thread-safe.

        Parameters:
        text - the String to be unescaped.
        Returns:
        The unescaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no unescaping modifications were required (and no additional String objects will be created during processing). Will return null if input is null.
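Unescaping reverses the two escape rules: remove the enclosing double-quotes and collapse each doubled quote into one. The sketch below assumes that a value not enclosed in double-quotes is returned untouched; `CsvUnescapeSketch` is a hypothetical name and this is not the library's implementation.

```java
final class CsvUnescapeSketch {

    static String unescape(String text) {
        if (text == null) {
            return null;
        }
        int len = text.length();
        // Assumption of this sketch: only values enclosed in double-quotes are modified.
        if (len < 2 || text.charAt(0) != '"' || text.charAt(len - 1) != '"') {
            return text; // same instance, nothing to unescape
        }
        StringBuilder sb = new StringBuilder(len - 2);
        for (int i = 1; i < len - 1; i++) {
            char c = text.charAt(i);
            sb.append(c);
            // A doubled quote inside the value stands for a single quote: skip the second one.
            if (c == '"' && i + 1 < len - 1 && text.charAt(i + 1) == '"') {
                i++;
            }
        }
        return sb.toString();
    }
}
```

Under these assumptions, `"a,b"` yields `a,b` and `"say ""hi"""` yields `say "hi"`.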
      • unescapeCsv

        public static void unescapeCsv(java.lang.String text,
                                       java.io.Writer writer)
                                throws java.io.IOException

        Perform a CSV unescape operation on a String input, writing results to a Writer.

        This method is thread-safe.

        Parameters:
        text - the String to be unescaped.
        writer - the java.io.Writer to which the unescaped result will be written. Nothing will be written at all to this writer if input is null.
        Throws:
        java.io.IOException - if an input/output exception occurs
        Since:
        1.1.2
      • unescapeCsv

        public static void unescapeCsv(java.io.Reader reader,
                                       java.io.Writer writer)
                                throws java.io.IOException

        Perform a CSV unescape operation on a Reader input, writing results to a Writer.

        This method is thread-safe.

        Parameters:
        reader - the Reader reading the text to be unescaped.
        writer - the java.io.Writer to which the unescaped result will be written. Nothing will be written at all to this writer if input is null.
        Throws:
        java.io.IOException - if an input/output exception occurs
        Since:
        1.1.2
      • unescapeCsv

        public static void unescapeCsv(char[] text,
                                       int offset,
                                       int len,
                                       java.io.Writer writer)
                                throws java.io.IOException

        Perform a CSV unescape operation on a char[] input.

        This method is thread-safe.

        Parameters:
        text - the char[] to be unescaped.
        offset - the position in text at which the unescape operation should start.
        len - the number of characters in text that should be unescaped.
        writer - the java.io.Writer to which the unescaped result will be written. Nothing will be written at all to this writer if input is null.
        Throws:
        java.io.IOException - if an input/output exception occurs