Class BomUtil

java.lang.Object
de.siegmar.fastcsv.reader.BomUtil

final class BomUtil extends Object
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    (package private) static final int
    POTENTIAL_BOM_SIZE
    The maximum number of bytes a BOM header can have.
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    private
    BomUtil()
  • Method Summary

    Modifier and Type
    Method
    Description
    (package private) static Optional<BomHeader>
    detectCharset(byte[] buf)
    Detects the character encoding of a byte array based on the presence of a Byte Order Mark (BOM) header.
    (package private) static Optional<BomHeader>
    detectCharset(Path file)
    Detects the character encoding of a file based on the presence of a Byte Order Mark (BOM) header.
    (package private) static Reader
    openReader(Path file, Charset defaultCharset)
    Opens a Reader for the given file, skipping a BOM header if present.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • POTENTIAL_BOM_SIZE

      static final int POTENTIAL_BOM_SIZE
      The maximum number of bytes a BOM header can have.
  • Constructor Details

    • BomUtil

      private BomUtil()
  • Method Details

    • detectCharset

      static Optional<BomHeader> detectCharset(byte[] buf)
      Detects the character encoding of a byte array based on the presence of a Byte Order Mark (BOM) header. The method supports the following BOM headers:
      • UTF-8: EF BB BF
      • UTF-16 BE: FE FF
      • UTF-16 LE: FF FE
      • UTF-32 BE: 00 00 FE FF
      • UTF-32 LE: FF FE 00 00

      See Byte order mark

      Parameters:
      buf - the byte array to detect the character encoding from
      Returns:
      an Optional containing the detected BomHeader if a BOM header is found, or an empty Optional if no BOM header is found
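
The byte patterns listed above can be matched with a simple prefix comparison. The following is a minimal sketch, not FastCSV's implementation: the class `BomSketch`, its nested `Bom` enum, and the helper `startsWith` are hypothetical names introduced for illustration, and `Bom` merely stands in for `BomHeader`, whose members are not documented on this page. The sketch assumes that the 4-byte UTF-32 LE pattern should be tested before the 2-byte UTF-16 LE pattern, because `FF FE` is a prefix of `FF FE 00 00`.

```java
import java.util.Optional;

// Hypothetical stand-in for illustration; not FastCSV's BomUtil or BomHeader.
final class BomSketch {

    /** Illustrative result type: charset name plus BOM length in bytes. */
    enum Bom {
        UTF_8("UTF-8", 3),
        UTF_16_BE("UTF-16BE", 2),
        UTF_16_LE("UTF-16LE", 2),
        UTF_32_BE("UTF-32BE", 4),
        UTF_32_LE("UTF-32LE", 4);

        final String charsetName;
        final int length;

        Bom(String charsetName, int length) {
            this.charsetName = charsetName;
            this.length = length;
        }
    }

    private BomSketch() {
    }

    /** Returns the BOM that the buffer starts with, or empty if there is none. */
    static Optional<Bom> detect(byte[] buf) {
        if (startsWith(buf, 0xEF, 0xBB, 0xBF)) {
            return Optional.of(Bom.UTF_8);
        }
        if (startsWith(buf, 0x00, 0x00, 0xFE, 0xFF)) {
            return Optional.of(Bom.UTF_32_BE);
        }
        // Assumption of this sketch: test UTF-32 LE before UTF-16 LE,
        // since FF FE is a prefix of FF FE 00 00.
        if (startsWith(buf, 0xFF, 0xFE, 0x00, 0x00)) {
            return Optional.of(Bom.UTF_32_LE);
        }
        if (startsWith(buf, 0xFE, 0xFF)) {
            return Optional.of(Bom.UTF_16_BE);
        }
        if (startsWith(buf, 0xFF, 0xFE)) {
            return Optional.of(Bom.UTF_16_LE);
        }
        return Optional.empty();
    }

    /** Compares the leading bytes of buf, as unsigned values, with prefix. */
    private static boolean startsWith(byte[] buf, int... prefix) {
        if (buf.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((buf[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Masking each byte with `0xFF` matters because Java bytes are signed: without it, `(byte) 0xEF` would compare as a negative number and never match. Note that a buffer beginning with `FF FE 00 00` is ambiguous (it could also be a UTF-16 LE BOM followed by a NUL character); this sketch resolves it in favor of UTF-32 LE.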
    • detectCharset

      static Optional<BomHeader> detectCharset(Path file) throws IOException
      Detects the character encoding of a file based on the presence of a Byte Order Mark (BOM) header.
      Parameters:
      file - the file to detect the character encoding from
      Returns:
      an Optional containing the detected BomHeader if a BOM header is found, or an empty Optional if no BOM header is found
      Throws:
      IOException - if an I/O error occurs reading the file
    • openReader

      static Reader openReader(Path file, Charset defaultCharset) throws IOException
      Opens a Reader for the given file, skipping a BOM header if present. If no BOM header is present, the defaultCharset is used.
      Parameters:
      file - the file to open a Reader for
      defaultCharset - the default charset to use if no BOM header is present
      Returns:
      a Reader for the given file
      Throws:
      IOException - if an I/O error occurs opening the file
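
The behavior described for openReader (skip the BOM if one is present, otherwise fall back to the default charset) can be approximated with standard JDK classes. This is a hedged sketch under stated assumptions, not the library's code: `BomReaderSketch`, `open`, and `startsWith` are hypothetical names, the sketch requires Java 11 or later for `InputStream.readNBytes`, and for brevity it recognizes only the UTF-8 and UTF-16 BOMs, so a UTF-32 LE file would be misread as UTF-16 LE.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not FastCSV's BomUtil.openReader.
final class BomReaderSketch {

    private BomReaderSketch() {
    }

    static Reader open(Path file, Charset defaultCharset) throws IOException {
        InputStream in = new BufferedInputStream(Files.newInputStream(file));
        try {
            // Peek at the first bytes without consuming them.
            in.mark(3);
            byte[] head = in.readNBytes(3);
            in.reset();

            Charset charset = defaultCharset;
            int bomLength = 0;
            if (startsWith(head, 0xEF, 0xBB, 0xBF)) {
                charset = StandardCharsets.UTF_8;
                bomLength = 3;
            } else if (startsWith(head, 0xFE, 0xFF)) {
                charset = StandardCharsets.UTF_16BE;
                bomLength = 2;
            } else if (startsWith(head, 0xFF, 0xFE)) {
                charset = StandardCharsets.UTF_16LE;
                bomLength = 2;
            }

            // Consume the BOM bytes so the Reader starts at the first character.
            in.readNBytes(bomLength);
            return new InputStreamReader(in, charset);
        } catch (IOException | RuntimeException e) {
            in.close(); // do not leak the stream if setup fails
            throw e;
        }
    }

    /** Compares the leading bytes of buf, as unsigned values, with prefix. */
    private static boolean startsWith(byte[] buf, int... prefix) {
        if (buf.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((buf[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The caller owns the returned Reader and should close it, for example with try-with-resources. Using `mark`/`reset` on a `BufferedInputStream` is one design choice among several; a `PushbackInputStream` would serve the same purpose.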