Class PDFObject


  • public class PDFObject
    extends java.lang.Object
    A class encapsulating every kind of content that an object in a PDF file can hold.

    A PDF object can be a simple type, like a Boolean, a Number, a String, or the Null value. It can also be a NAME, which looks like a string but is a distinct type in PDF files, written like "/Name".

    A PDF object can also be a complex type: Array; Dictionary; Stream, which is a Dictionary plus an array of bytes; or Indirect, which is a reference to some other PDF object. Indirect references are always dereferenced by the time any data is returned from one of the methods in this class.
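
The dereferencing behaviour described above can be pictured with a small self-contained model. This is only a sketch under stated assumptions, not the library's implementation: `ObjSketch`, its `Kind` enum, and the `Resolver` interface are hypothetical stand-ins for PDFObject, its int type constants, and the owning PDFFile.

```java
/** Hypothetical stand-in for PDFObject; not the library's code. */
final class ObjSketch {
    /** Hypothetical replacement for the int type constants. */
    enum Kind { NUMBER, STRING, INDIRECT }

    /** Hypothetical stand-in for the owning PDFFile's object lookup. */
    interface Resolver { ObjSketch resolve(int objNum); }

    private final Resolver owner;
    private final Kind kind;
    private final Object value; // Double, String, or an Integer object number for INDIRECT

    ObjSketch(Resolver owner, Kind kind, Object value) {
        this.owner = owner;
        this.kind = kind;
        this.value = value;
    }

    /** Follow indirect references until a direct object is reached. */
    ObjSketch dereference() {
        ObjSketch cur = this;
        while (cur.kind == Kind.INDIRECT) {
            cur = cur.owner.resolve((Integer) cur.value);
        }
        return cur;
    }

    /** Accessors dereference first, so callers never observe INDIRECT. */
    Kind getKind() { return dereference().kind; }

    /** Mirrors the documented "0 if not a NUMBER" convention. */
    double getDoubleValue() {
        ObjSketch d = dereference();
        return d.kind == Kind.NUMBER ? (Double) d.value : 0.0;
    }
}
```

In this model a caller holding a reference to object 7 reads the number stored in object 7 directly and never sees the reference itself.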

    • Field Summary

      Fields 
      Modifier and Type Field Description
      static int ARRAY
      an array of PDFObjects
      static int BOOLEAN
      a Boolean
      private java.lang.ref.SoftReference cache
      a cache of translated data.
      private java.lang.ref.SoftReference decodedStream
      a cached version of the decoded stream
      static int DICTIONARY
      a HashMap that maps String names to PDFObjects
      static int INDIRECT
      an indirect reference
      static int KEYWORD
      a special PDF bare word, like R, obj, true, false, etc
      static int NAME
      a special string, seen in PDF files as /Name
      static int NULL
      the NULL object (there is only one)
      static PDFObject nullObj
      the NULL PDFObject
      static int NUMBER
      a Number, represented as a double
      static int OBJ_NUM_EMBEDDED
      When used as the value of objNum or objGen, indicates that the object is not top-level and is embedded in another object
      static int OBJ_NUM_TRAILER
      When used as the value of objNum or objGen, indicates that the object is not top-level and is embedded directly in the trailer.
      private int objGen  
      private int objNum  
      private PDFFile owner
      the PDFFile from which this object came, used for dereferences
      private java.nio.ByteBuffer stream
      the encoded stream, if this is a STREAM object
      static int STREAM
      a Stream: a HashMap with a byte array
      static int STRING
      a String
      private int type
      the type of this object
      private java.lang.Object value
      the value of this object.
    • Constructor Summary

      Constructors 
      Constructor Description
      PDFObject​(PDFFile owner, int type, java.lang.Object value)
      create a new simple PDFObject with a type and a value
      PDFObject​(PDFFile owner, PDFXref xref)
      create a new PDFObject based on a PDFXref
      PDFObject​(java.lang.Object obj)
      create a new PDFObject that is the closest match to a given Java object.
    • Method Summary

      Modifier and Type Method Description
      private java.nio.ByteBuffer decodeStream()
      Get the decoded stream value
      PDFObject dereference()
      Make sure that this object is dereferenced.
      boolean equals​(java.lang.Object o)
      Test whether two PDFObjects are equal.
      PDFObject[] getArray()
      get the value as a PDFObject[].
      PDFObject getAt​(int idx)
      if this object is an ARRAY, get the PDFObject at some position in the array.
      boolean getBooleanValue()
      get the value as a boolean.
      java.lang.Object getCache()
      get the value in the cache.
      PDFDecrypter getDecrypter()  
      java.util.HashMap<java.lang.String,​PDFObject> getDictionary()
      get the dictionary as a HashMap.
      java.util.Iterator getDictKeys()
      get an Iterator over all the keys in the dictionary.
      PDFObject getDictRef​(java.lang.String key)
      get the value associated with a particular key in the dictionary.
      double getDoubleValue()
      get the value as a double.
      float getFloatValue()
      get the value as a float.
      int getIntValue()
      get the value as an int.
      int getObjGen()
      Get the object generation number of this object. A negative value indicates that the object is not numbered because it is not a top-level object; OBJ_NUM_EMBEDDED means it is embedded within another object.
      int getObjNum()
      Get the object number of this object. A negative value indicates that the object is not numbered because it is not a top-level object; OBJ_NUM_EMBEDDED means it is embedded within another object.
      byte[] getStream()
      get the stream from this object.
      java.nio.ByteBuffer getStreamBuffer()
      get the stream from this object as a byte buffer.
      java.lang.String getStringValue()
      get the value as a String.
      java.lang.String getTextStringValue()
      Get the value as a text string; i.e., a string encoded in UTF-16BE or PDFDocEncoding.
      int getType()
      get the type of this object.
      boolean isDictType​(java.lang.String match)
      returns true only if this object is a DICTIONARY or a STREAM, and the "Type" entry in the dictionary matches a given value.
      boolean isIndirect()
      Identify whether the object is currently an indirect/cross-reference
      void setCache​(java.lang.Object obj)
      set the cached value.
      void setObjectId​(int objNum, int objGen)
      Set the object identifiers
      void setStream​(java.nio.ByteBuffer data)
      set the stream of this object.
      java.lang.String toString()
      return a representation of this PDFObject as a String.
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • NUMBER

        public static final int NUMBER
        a Number, represented as a double
        See Also:
        Constant Field Values
      • NAME

        public static final int NAME
        a special string, seen in PDF files as /Name
        See Also:
        Constant Field Values
      • DICTIONARY

        public static final int DICTIONARY
        a HashMap that maps String names to PDFObjects
        See Also:
        Constant Field Values
      • STREAM

        public static final int STREAM
        a Stream: a HashMap with a byte array
        See Also:
        Constant Field Values
      • KEYWORD

        public static final int KEYWORD
        a special PDF bare word, like R, obj, true, false, etc
        See Also:
        Constant Field Values
      • OBJ_NUM_EMBEDDED

        public static final int OBJ_NUM_EMBEDDED
        When used as the value of objNum or objGen, indicates that the object is not top-level and is embedded in another object
        See Also:
        Constant Field Values
      • OBJ_NUM_TRAILER

        public static final int OBJ_NUM_TRAILER
        When used as the value of objNum or objGen, indicates that the object is not top-level and is embedded directly in the trailer.
        See Also:
        Constant Field Values
      • nullObj

        public static final PDFObject nullObj
        the NULL PDFObject
      • type

        private int type
        the type of this object
      • value

        private java.lang.Object value
        the value of this object. It can be a wide number of things, defined by type
      • stream

        private java.nio.ByteBuffer stream
        the encoded stream, if this is a STREAM object
      • decodedStream

        private java.lang.ref.SoftReference decodedStream
        a cached version of the decoded stream
      • owner

        private PDFFile owner
        the PDFFile from which this object came, used for dereferences
      • cache

        private java.lang.ref.SoftReference cache
        a cache of translated data. This data can be garbage collected at any time, after which it will have to be rebuilt.
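
A soft-referenced cache of this kind is usually paired with a "rebuild on miss" pattern. The following is a minimal sketch of that pattern, assuming a hypothetical holder class `SoftCacheSketch` and a caller-supplied builder; it is not PDFObject's own code, and `getOrRebuild` is an illustrative helper that the class does not document.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

/** Hypothetical holder illustrating a soft-referenced cache; not PDFObject itself. */
final class SoftCacheSketch<T> {
    private SoftReference<T> cache; // null until something has been cached

    /** Store a value; the collector may reclaim it once no strong reference remains. */
    void setCache(T obj) {
        cache = new SoftReference<>(obj);
    }

    /** Return the cached value, or null if nothing is cached or it was collected. */
    T getCache() {
        return cache == null ? null : cache.get();
    }

    /** Return the cached value, rebuilding and re-caching it when it is gone. */
    T getOrRebuild(Supplier<? extends T> builder) {
        T v = getCache();
        if (v == null) {
            v = builder.get();
            setCache(v);
        }
        return v;
    }
}
```

Holding the result of `getCache()` in a local variable before using it matters: the local is a strong reference, so the value cannot disappear between the null check and its use.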
    • Constructor Detail

      • PDFObject

        public PDFObject​(PDFFile owner,
                         int type,
                         java.lang.Object value)
        create a new simple PDFObject with a type and a value
        Parameters:
        owner - the PDFFile in which this object resides, used for dereferencing. This may be null.
        type - the type of object
        value - the value. For DICTIONARY, this is a HashMap. For ARRAY, it's an ArrayList. For NUMBER, it's a Double. For BOOLEAN, it's Boolean.TRUE or Boolean.FALSE. For everything else, it's a String.
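
The pairing of type and value described for this constructor can be written down as a small lookup. This is a sketch only: the `Kind` enum is a hypothetical replacement for the int type constants (whose numeric values are not reproduced here), it covers only some of the types, and `fits` is an illustrative helper, not a method of PDFObject.

```java
import java.util.ArrayList;
import java.util.HashMap;

/** Hypothetical helper restating the documented type/value pairing. */
final class ValueShapeSketch {
    /** Hypothetical subset of the type constants. */
    enum Kind { DICTIONARY, ARRAY, NUMBER, BOOLEAN, STRING, NAME, KEYWORD }

    /** The Java class the value is documented to have for each type. */
    static Class<?> expectedValueClass(Kind kind) {
        switch (kind) {
            case DICTIONARY: return HashMap.class;
            case ARRAY:      return ArrayList.class;
            case NUMBER:     return Double.class;
            case BOOLEAN:    return Boolean.class;
            default:         return String.class; // "everything else"
        }
    }

    /** True when the value has the class documented for the given type. */
    static boolean fits(Kind kind, Object value) {
        return expectedValueClass(kind).isInstance(value);
    }
}
```

A check like this is mainly useful in tests or assertions, since the constructor itself is not documented to validate its arguments.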
      • PDFObject

        public PDFObject​(java.lang.Object obj)
                  throws PDFParseException
        create a new PDFObject that is the closest match to a given Java object. Possibilities include Double, String, PDFObject[], HashMap, Boolean, or a PDFParser.Tok, which must be "true" or "false" and becomes a BOOLEAN.
        Parameters:
        obj - the sample Java object to convert to a PDFObject.
        Throws:
        PDFParseException - if the object isn't one of the above examples, and can't be turned into a PDFObject.
      • PDFObject

        public PDFObject​(PDFFile owner,
                         PDFXref xref)
        create a new PDFObject based on a PDFXref
        Parameters:
        owner - the PDFFile from which the PDFXref was drawn
        xref - the PDFXref to turn into a PDFObject
    • Method Detail

      • getType

        public int getType()
                    throws java.io.IOException
        get the type of this object. The object will be dereferenced, so INDIRECT will never be returned.
        Returns:
        the type of the object
        Throws:
        java.io.IOException
      • setStream

        public void setStream​(java.nio.ByteBuffer data)
        set the stream of this object. It should have been a DICTIONARY before the call.
        Parameters:
        data - the data, as a ByteBuffer.
      • getCache

        public java.lang.Object getCache()
                                  throws java.io.IOException
        get the value in the cache. May become null at any time.
        Returns:
        the cached value, or null if the value has been garbage collected.
        Throws:
        java.io.IOException
      • setCache

        public void setCache​(java.lang.Object obj)
                      throws java.io.IOException
        set the cached value. The object may be garbage collected if no other reference exists to it.
        Parameters:
        obj - the object to be cached
        Throws:
        java.io.IOException
      • getStream

        public byte[] getStream()
                         throws java.io.IOException
        get the stream from this object. Will return null if this object isn't a STREAM.
        Returns:
        the stream, or null, if this isn't a STREAM.
        Throws:
        java.io.IOException
      • getStreamBuffer

        public java.nio.ByteBuffer getStreamBuffer()
                                            throws java.io.IOException
        get the stream from this object as a byte buffer. Will return null if this object isn't a STREAM.
        Returns:
        the buffer, or null, if this isn't a STREAM.
        Throws:
        java.io.IOException
      • decodeStream

        private java.nio.ByteBuffer decodeStream()
                                          throws java.io.IOException
        Get the decoded stream value
        Throws:
        java.io.IOException
      • getIntValue

        public int getIntValue()
                        throws java.io.IOException
        get the value as an int. Will return 0 if this object isn't a NUMBER.
        Throws:
        java.io.IOException
      • getFloatValue

        public float getFloatValue()
                            throws java.io.IOException
        get the value as a float. Will return 0 if this object isn't a NUMBER.
        Throws:
        java.io.IOException
      • getDoubleValue

        public double getDoubleValue()
                              throws java.io.IOException
        get the value as a double. Will return 0 if this object isn't a NUMBER.
        Throws:
        java.io.IOException
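
Because the numeric getters return 0 for anything that is not a NUMBER, a returned 0 is ambiguous on its own. The sketch below contrasts the documented convention with a stricter variant that reports absence explicitly. Both methods are hypothetical helpers operating on a raw value, not members of PDFObject; with the real class, the equivalent check would be to compare getType() against NUMBER before trusting a 0.

```java
import java.util.OptionalDouble;

/** Hypothetical helpers contrasting "0 on mismatch" with an explicit result. */
final class NumberDefaultSketch {
    /** Mirrors the documented convention: non-numbers read as 0. */
    static double doubleOrZero(Object value) {
        return value instanceof Double ? (Double) value : 0.0;
    }

    /** Stricter variant: empty when the value is not a number. */
    static OptionalDouble strict(Object value) {
        return value instanceof Double
                ? OptionalDouble.of((Double) value)
                : OptionalDouble.empty();
    }
}
```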
      • getStringValue

        public java.lang.String getStringValue()
                                        throws java.io.IOException
        get the value as a String. Returns null if the object isn't a STRING, NAME, or KEYWORD. This method does NOT convert a NUMBER to a String. If the string is actually a text string (i.e., it may be encoded in UTF-16BE or PDFDocEncoding), use getTextStringValue(), or apply one of the PDFStringUtil methods to the result of this method. The string value represents exactly the sequence of 8-bit characters present in the file, decrypted and decoded as appropriate, as a string containing only 8-bit character values; that is, each char will be between 0 and 255.
        Throws:
        java.io.IOException
      • getTextStringValue

        public java.lang.String getTextStringValue()
                                            throws java.io.IOException
        Get the value as a text string; i.e., a string encoded in UTF-16BE or PDFDocEncoding. Simple Latin alphanumeric characters are preserved in both of these encodings.
        Returns:
        the text string value
        Throws:
        java.io.IOException
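
To make the difference between the raw string value and the text string value concrete, here is a minimal sketch of the decoding step. It assumes the raw string holds one 8-bit value per char, as documented for getStringValue(), and that a UTF-16BE text string starts with the byte order mark 0xFE 0xFF, as the PDF specification requires. `TextStringSketch` is a hypothetical class; it does not implement the PDFDocEncoding table, so only characters shared with ASCII are handled on that branch.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical decoder for PDF text strings; a sketch, not the library's code. */
final class TextStringSketch {
    /** raw holds one 8-bit value per char (each char between 0 and 255). */
    static String decode(String raw) {
        if (raw.length() >= 2 && raw.charAt(0) == 0xFE && raw.charAt(1) == 0xFF) {
            // Byte order mark present: the remaining chars are UTF-16BE bytes.
            byte[] bytes = new byte[raw.length() - 2];
            for (int i = 0; i < bytes.length; i++) {
                bytes[i] = (byte) raw.charAt(i + 2);
            }
            return new String(bytes, StandardCharsets.UTF_16BE);
        }
        // No byte order mark: assumed to be PDFDocEncoding. This sketch passes
        // the string through unchanged, which is only correct for the ASCII range.
        return raw;
    }
}
```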
      • getArray

        public PDFObject[] getArray()
                             throws java.io.IOException
        get the value as a PDFObject[]. If this object is an ARRAY, returns the array; otherwise returns a one-element array containing this object.
        Throws:
        java.io.IOException
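
The "wrap a single value in a one-element array" rule is convenient for PDF entries that may hold either one item or an array of items, such as a stream's Filter entry. A minimal sketch of the rule, using plain Object values and a hypothetical helper class rather than PDFObject:

```java
/** Hypothetical helper showing the one-or-many normalisation. */
final class ArrayViewSketch {
    /** Return the array itself, or a one-element array wrapping a single value. */
    static Object[] asArray(Object value) {
        if (value instanceof Object[]) {
            return (Object[]) value;
        }
        return new Object[] { value };
    }
}
```

With this shape the caller can always loop over the result and needs no separate branch for the single-value case.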
      • getBooleanValue

        public boolean getBooleanValue()
                                throws java.io.IOException
        get the value as a boolean. Will return false if this object is not a BOOLEAN.
        Throws:
        java.io.IOException
      • getAt

        public PDFObject getAt​(int idx)
                        throws java.io.IOException
        if this object is an ARRAY, get the PDFObject at some position in the array. If this is not an ARRAY, returns null.
        Throws:
        java.io.IOException
      • getDictKeys

        public java.util.Iterator getDictKeys()
                                       throws java.io.IOException
        get an Iterator over all the keys in the dictionary. If this object is not a DICTIONARY or a STREAM, returns an Iterator over the empty list.
        Throws:
        java.io.IOException
      • getDictionary

        public java.util.HashMap<java.lang.String,​PDFObject> getDictionary()
                                                                          throws java.io.IOException
        get the dictionary as a HashMap. If this isn't a DICTIONARY or a STREAM, returns null.
        Throws:
        java.io.IOException
      • getDictRef

        public PDFObject getDictRef​(java.lang.String key)
                             throws java.io.IOException
        get the value associated with a particular key in the dictionary. If this isn't a DICTIONARY or a STREAM, or there is no such key, returns null.
        Throws:
        java.io.IOException
      • isDictType

        public boolean isDictType​(java.lang.String match)
                           throws java.io.IOException
        returns true only if this object is a DICTIONARY or a STREAM, and the "Type" entry in the dictionary matches a given value.
        Parameters:
        match - the expected value for the "Type" key in the dictionary
        Returns:
        whether the dictionary is of the expected type
        Throws:
        java.io.IOException
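
The null and empty-iterator conventions of getDictKeys, getDictRef, and isDictType fit together as in the sketch below. It is a simplified model, not the library's code: `DictSketch` is hypothetical, dictionary values are plain Strings instead of PDFObjects, and names are assumed to be stored without the leading slash.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical model of the dictionary accessors; values are plain Strings here. */
final class DictSketch {
    private final Map<String, String> dict; // null when this object is not a dictionary

    DictSketch(Map<String, String> dict) {
        this.dict = dict;
    }

    /** Null when this is not a dictionary or the key is absent. */
    String getDictRef(String key) {
        return dict == null ? null : dict.get(key);
    }

    /** An empty iterator, never null, when this is not a dictionary. */
    Iterator<String> getDictKeys() {
        return dict == null
                ? Collections.<String>emptyIterator()
                : dict.keySet().iterator();
    }

    /** True only for a dictionary whose "Type" entry equals the expected value. */
    boolean isDictType(String match) {
        String type = getDictRef("Type");
        return type != null && type.equals(match);
    }
}
```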
      • setObjectId

        public void setObjectId​(int objNum,
                                int objGen)
        Set the object identifiers
        Parameters:
        objNum - the object number
        objGen - the object generation number
      • getObjNum

        public int getObjNum()
        Get the object number of this object. A negative value indicates that the object is not numbered because it is not a top-level object: OBJ_NUM_EMBEDDED means it is embedded within another object, and OBJ_NUM_TRAILER means it comes from the trailer.
        Returns:
        the object number, if positive
      • getObjGen

        public int getObjGen()
        Get the object generation number of this object. A negative value indicates that the object is not numbered because it is not a top-level object: OBJ_NUM_EMBEDDED means it is embedded within another object, and OBJ_NUM_TRAILER means it comes from the trailer.
        Returns:
        the object generation number, if positive
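
Callers that print or log object identifiers need to treat the two sentinels separately from real numbers. The sketch below shows one way to do that. The sentinel values used here are hypothetical placeholders chosen only to be negative; the real values of OBJ_NUM_EMBEDDED and OBJ_NUM_TRAILER are listed under Constant Field Values and are not reproduced in this page.

```java
/** Hypothetical formatter for object identifiers; sentinel values are placeholders. */
final class ObjIdSketch {
    static final int EMBEDDED = -2; // placeholder, not the library's actual value
    static final int TRAILER = -1;  // placeholder, not the library's actual value

    /** Describe an identifier, using PDF's "num gen R" notation for numbered objects. */
    static String describe(int objNum, int objGen) {
        if (objNum == EMBEDDED) {
            return "embedded in another object";
        }
        if (objNum == TRAILER) {
            return "embedded directly in the trailer";
        }
        return objNum + " " + objGen + " R";
    }
}
```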
      • toString

        public java.lang.String toString()
        return a representation of this PDFObject as a String. Does NOT dereference anything: apart from isIndirect(), this is the only method that allows you to distinguish an INDIRECT PDFObject.
        Overrides:
        toString in class java.lang.Object
      • dereference

        public PDFObject dereference()
                              throws java.io.IOException
        Make sure that this object is dereferenced. Use the cache of an indirect object to cache the dereferenced value, if possible.
        Throws:
        java.io.IOException
      • isIndirect

        public boolean isIndirect()
        Identify whether the object is currently an indirect/cross-reference
        Returns:
        whether currently indirect
      • equals

        public boolean equals​(java.lang.Object o)
        Test whether two PDFObjects are equal. Objects are equal if and only if they are the same reference, or they are both indirect objects with the same id and generation number in their xref.
        Overrides:
        equals in class java.lang.Object
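
The equality rule above can be sketched as follows. `IdentitySketch` is a hypothetical class that carries only the fields the rule mentions; it is not PDFObject, and its hashCode is an addition made so that the sketch honours the general equals/hashCode contract.

```java
/** Hypothetical class modelling the documented equality rule. */
final class IdentitySketch {
    private final boolean indirect;
    private final int id;
    private final int generation;

    IdentitySketch(boolean indirect, int id, int generation) {
        this.indirect = indirect;
        this.id = id;
        this.generation = generation;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true; // same reference
        }
        if (!(o instanceof IdentitySketch)) {
            return false;
        }
        IdentitySketch other = (IdentitySketch) o;
        // Otherwise equal only when both are indirect with matching id and generation.
        return indirect && other.indirect
                && id == other.id && generation == other.generation;
    }

    @Override
    public int hashCode() {
        // Equal indirect objects share id and generation; direct objects use identity.
        return indirect ? 31 * id + generation : System.identityHashCode(this);
    }
}
```

The method summary lists hashCode among the methods inherited from java.lang.Object, which suggests that PDFObject does not override it. If so, two equal indirect references may hash differently, and it may be safer not to rely on PDFObject as a key in hash-based collections.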