Class CMapAwareDocumentFont


  • public class CMapAwareDocumentFont
    extends DocumentFont
    Implementation of DocumentFont used while parsing PDF streams.
    Since:
    2.1.4
    • Field Detail

      • LOGGER

        private static final Logger LOGGER
      • spaceWidth

        private int spaceWidth
        The width of a space for this font, in normalized 1000 point units.
      • toUnicodeCmap

        private CMapToUnicode toUnicodeCmap
        The CMap constructed from the ToUnicode map in the font's dictionary, if present. This CMap maps CID values to their Unicode equivalents.
      • cidbyte2uni

        private char[] cidbyte2uni
        Mapping from a CID code (single byte only for now) to its Unicode equivalent, as derived from the font's encoding. Only needed if the ToUnicode CMap is not provided.
      • uni2cid

        private java.util.Map<java.lang.Integer,​java.lang.Integer> uni2cid
    • Constructor Detail

      • CMapAwareDocumentFont

        public CMapAwareDocumentFont​(PdfDictionary font)
      • CMapAwareDocumentFont

        public CMapAwareDocumentFont​(PRIndirectReference refFont)
        Creates an instance of CMapAwareDocumentFont based on an indirect reference to a font.
        Parameters:
        refFont - the indirect reference to a font
    • Method Detail

      • initFont

        private void initFont()
      • processToUnicode

        private void processToUnicode()
        Parses the ToUnicode entry, if present, and constructs a CMap for it.
        Since:
        2.1.7
      • processUni2Byte

        private void processUni2Byte()
                              throws java.io.IOException
        Inverts DocumentFont's uni2byte mapping to obtain a CID-to-Unicode mapping based on the font's encoding.
        Throws:
        java.io.IOException
        Since:
        2.1.7
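The inversion described for processUni2Byte can be pictured with a minimal, library-independent sketch. It assumes the Unicode-to-byte mapping is available as a plain java.util.Map and that only single-byte codes are of interest; the class and method names are hypothetical, and the real method may differ in its data structures and edge-case handling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: Uni2ByteInversionSketch and invert are hypothetical names,
// not part of the documented class.
final class Uni2ByteInversionSketch {

    /** Builds a 256-entry table, indexed by byte code, from a Unicode-to-byte map. */
    static char[] invert(Map<Integer, Integer> uni2byte) {
        char[] byte2uni = new char[256]; // unmapped codes stay '\0'
        for (Map.Entry<Integer, Integer> e : uni2byte.entrySet()) {
            int code = e.getValue();
            // Only single-byte codes fit the table. If two Unicode values map
            // to the same byte, the first one encountered is kept.
            if (code >= 0 && code < 256 && byte2uni[code] == 0) {
                byte2uni[code] = (char) e.getKey().intValue();
            }
        }
        return byte2uni;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> uni2byte = new LinkedHashMap<>();
        uni2byte.put((int) 'A', 65);
        uni2byte.put(0x20AC, 128); // euro sign assigned to byte 128 in this example
        char[] table = invert(uni2byte);
        System.out.println(Integer.toHexString(table[128]));
    }
}
```

A fixed 256-entry array matches the "single byte only" restriction noted for cidbyte2uni and makes the lookup a simple index operation.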
      • computeAverageWidth

        private int computeAverageWidth()
        Computes the average width, in normalized 1000 point units, of all glyphs that have a non-zero width. This provides a meaningful value in cases where an average font width is needed (for example, when the font does not specify the width of a space).
        Returns:
        the average width of all non-zero width glyphs in the font
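As a rough illustration of the computation described above, the following sketch averages only the non-zero entries of a width table. The class name, the method name, and the int[] representation of the widths are assumptions made for this example.

```java
// Illustrative only: AverageWidthSketch and averageNonZeroWidth are hypothetical names.
final class AverageWidthSketch {

    /** Returns the integer average of all non-zero widths, or 0 if there are none. */
    static int averageNonZeroWidth(int[] widths) {
        int total = 0;
        int count = 0;
        for (int w : widths) {
            if (w != 0) { // zero means "no width recorded", so it is excluded
                total += w;
                count++;
            }
        }
        return count != 0 ? total / count : 0;
    }

    public static void main(String[] args) {
        int[] widths = {0, 500, 250, 0, 750};
        System.out.println(averageNonZeroWidth(widths)); // prints 500
    }
}
```

Excluding zero entries keeps unused character slots from dragging the average toward zero.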
      • getWidth

        public int getWidth​(int char1)
        Description copied from class: DocumentFont
        Gets the width of a char in normalized 1000 units.
        Overrides:
        getWidth in class DocumentFont
        Parameters:
        char1 - the unicode char to get the width of
        Returns:
        the width in normalized 1000 units
        Since:
        2.1.5. Overridden to allow special handling for fonts that don't specify the width of the space character.
        See Also:
        DocumentFont.getWidth(int)
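One way to picture the special handling of the space character is the sketch below. It assumes the space width is resolved once, using the font's declared value when it is non-zero and a caller-supplied fallback (such as an average glyph width) otherwise. All names are hypothetical, and the width table is a stand-in for whatever the base class actually stores.

```java
// Illustrative only: SpaceWidthSketch and its members are hypothetical names.
final class SpaceWidthSketch {
    private final int[] widths;   // stand-in for the base font's widths, indexed by char code
    private final int spaceWidth; // resolved once in the constructor

    SpaceWidthSketch(int[] widths, int fallbackSpaceWidth) {
        this.widths = widths.clone();
        int declared = baseWidth(' ');
        // Use the declared width of a space if present, otherwise the fallback.
        this.spaceWidth = declared != 0 ? declared : fallbackSpaceWidth;
    }

    /** Plain table lookup; codes outside the table have width 0. */
    private int baseWidth(int ch) {
        return ch >= 0 && ch < widths.length ? widths[ch] : 0;
    }

    /** Returns the resolved space width for ' ' and the table width otherwise. */
    int getWidth(int ch) {
        return ch == ' ' ? spaceWidth : baseWidth(ch);
    }
}
```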
      • decodeSingleCID

        private java.lang.String decodeSingleCID​(byte[] bytes,
                                                 int offset,
                                                 int len)
        Decodes a single CID (represented by one or two bytes) to a unicode String.
        Parameters:
        bytes - the bytes making up the character code to convert
        offset - the index in bytes at which the character code starts
        len - the number of bytes in the character code (1 or 2)
        Returns:
        a String containing the decoded form of the input bytes using the font's encoding.
      • decode

        public java.lang.String decode​(byte[] cidbytes,
                                       int offset,
                                       int len)
        Decodes a string of bytes (encoded in the font's encoding) into a Unicode string. This uses the ToUnicode map of the font, if available; otherwise it uses the font's encoding.
        Parameters:
        cidbytes - the bytes that need to be decoded
        Returns:
        the unicode String that results from decoding
        Since:
        2.1.7
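The lookup order described for decode (ToUnicode map first, font encoding second) can be sketched without the library as follows. The sketch assumes the ToUnicode data is a plain Map keyed by code length and value, that the encoding table is a 256-entry char array in which '\0' means unmapped, and that a code is either one or two bytes long. Every name here is hypothetical, and the real method may treat unresolved bytes differently.

```java
import java.util.Map;

// Illustrative only: DecodeSketch and its members are hypothetical names.
final class DecodeSketch {
    private final Map<Integer, String> toUnicode; // stand-in for a ToUnicode CMap, may be null
    private final char[] byte2uni;                // stand-in for the encoding table, may be null

    DecodeSketch(Map<Integer, String> toUnicode, char[] byte2uni) {
        this.toUnicode = toUnicode;
        this.byte2uni = byte2uni;
    }

    /** Combines code length and value so that <41> and <0041> remain distinct keys. */
    static int key(int len, int code) {
        return (len << 16) | code;
    }

    /** Resolves one code of len bytes, or returns null if neither source maps it. */
    String decodeSingle(byte[] bytes, int offset, int len) {
        int code = 0;
        for (int i = 0; i < len; i++) {
            code = (code << 8) | (bytes[offset + i] & 0xff);
        }
        if (toUnicode != null) {
            String s = toUnicode.get(key(len, code));
            if (s != null) {
                return s;
            }
        }
        // The encoding table only covers single-byte codes.
        if (len == 1 && byte2uni != null && code < byte2uni.length && byte2uni[code] != 0) {
            return String.valueOf(byte2uni[code]);
        }
        return null;
    }

    /** Decodes len bytes starting at offset, trying one-byte codes before two-byte codes. */
    String decode(byte[] bytes, int offset, int len) {
        StringBuilder sb = new StringBuilder();
        int end = offset + len;
        for (int i = offset; i < end; i++) {
            String s = decodeSingle(bytes, i, 1);
            if (s == null && i + 1 < end) {
                s = decodeSingle(bytes, i, 2);
                i++; // the second byte is skipped whether or not the two-byte lookup succeeded
            }
            if (s != null) {
                sb.append(s);
            }
        }
        return sb.toString();
    }
}
```

Trying the one-byte interpretation first keeps simple fonts fast, at the cost of ambiguity when a byte is valid both on its own and as the start of a two-byte code.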
      • encode

        public java.lang.String encode​(byte[] bytes,
                                       int offset,
                                       int len)
        Deprecated.
        The method name is not indicative of what it does. Use decode instead.
        Encodes bytes to a String.
        Parameters:
        bytes - the bytes from a stream
        offset - the index of the first byte to process
        len - the number of bytes to process
        Returns:
        a String built from the bytes, taking into account whether or not they are in Unicode.