Class CompareTool


  • public class CompareTool
    extends java.lang.Object
    Compares two PDF files both by content and visually, and reports the differences between them.

    Visual comparison relies on two external tools, Ghostscript and ImageMagick, which must be installed on your machine. To let CompareTool use them, set either Java system properties or environment variables named "ITEXT_GS_EXEC" and "ITEXT_MAGICK_COMPARE_EXEC" to the commands that execute Ghostscript and ImageMagick, respectively.
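
    The lookup described above can be sketched in plain Java. This is only an illustration, not the library's own code: the class and method names are hypothetical, the order in which properties and environment variables are consulted is an assumption, and the command values "gs" and "compare" are examples that depend on the local installation.

```java
final class ExternalToolConfig {
    /**
     * Looks up a command by name, checking Java system properties first and
     * environment variables second. The precedence is an assumption of this sketch.
     */
    public static String resolveCommand(String name) {
        String value = System.getProperty(name);
        if (value == null || value.isEmpty()) {
            value = System.getenv(name);
        }
        return value;
    }

    public static void main(String[] args) {
        // Example values only; the real commands depend on how the tools are installed.
        System.setProperty("ITEXT_GS_EXEC", "gs");
        System.setProperty("ITEXT_MAGICK_COMPARE_EXEC", "compare");
        System.out.println(resolveCommand("ITEXT_GS_EXEC"));
        System.out.println(resolveCommand("ITEXT_MAGICK_COMPARE_EXEC"));
    }
}
```

    The same values can also be supplied on the command line with -DITEXT_GS_EXEC=... or exported as environment variables before the JVM starts.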

    CompareTool was designed mainly for testing iText itself, to ensure that the same code produces the same PDF document. For this reason you will often encounter parameter names such as "outDoc" and "cmpDoc", which stand for the output document and the document for comparison. The first is the current result, and the second is referred to as the normal or ideal result. OutDoc is compared to the ideal cmpDoc, so all comparison reports take the form "Expected ..., but was ...": the "expected" part is the content of the cmpDoc, and the "but was" part is the content of the outDoc.
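
    To make the direction of the report concrete, here is a minimal sketch of the "Expected ..., but was ..." convention. The class and method names are hypothetical and the exact wording of real reports may differ; the point is only which document feeds which half of the message.

```java
final class MismatchMessage {
    /** Builds a message in the "Expected ..., but was ..." form. */
    public static String describe(String cmpValue, String outValue) {
        // "Expected" carries the cmpDoc (reference) value,
        // "but was" carries the outDoc (current result) value.
        return "Expected " + cmpValue + ", but was " + outValue;
    }

    public static void main(String[] args) {
        // The reference document has 3 pages, the freshly produced one has 2.
        System.out.println(describe("3 pages", "2 pages"));
    }
}
```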

    • Field Detail

      • UNEXPECTED_NUMBER_OF_PAGES

        private static final java.lang.String UNEXPECTED_NUMBER_OF_PAGES
        See Also:
        Constant Field Values
      • IGNORED_AREAS_PREFIX

        private static final java.lang.String IGNORED_AREAS_PREFIX
        See Also:
        Constant Field Values
      • VERSION_REPLACEMENT

        private static final java.lang.String VERSION_REPLACEMENT
        See Also:
        Constant Field Values
      • COPYRIGHT_REGEXP

        private static final java.lang.String COPYRIGHT_REGEXP
        See Also:
        Constant Field Values
      • COPYRIGHT_REPLACEMENT

        private static final java.lang.String COPYRIGHT_REPLACEMENT
        See Also:
        Constant Field Values
      • cmpPdfName

        private java.lang.String cmpPdfName
      • outPdfName

        private java.lang.String outPdfName
      • cmpPdf

        private java.lang.String cmpPdf
      • cmpImage

        private java.lang.String cmpImage
      • outPdf

        private java.lang.String outPdf
      • outImage

        private java.lang.String outImage
      • compareByContentErrorsLimit

        private int compareByContentErrorsLimit
      • generateCompareByContentXmlReport

        private boolean generateCompareByContentXmlReport
      • encryptionCompareEnabled

        private boolean encryptionCompareEnabled
      • kdfSaltCompareEnabled

        private boolean kdfSaltCompareEnabled
      • useCachedPagesForComparison

        private boolean useCachedPagesForComparison
      • gsExec

        private java.lang.String gsExec
      • compareExec

        private java.lang.String compareExec
    • Constructor Detail

      • CompareTool

        public CompareTool()
        Creates a new CompareTool instance.
      • CompareTool

        CompareTool​(java.lang.String gsExec,
                    java.lang.String compareExec)
    • Method Detail

      • createTestPdfWriter

        public static PdfWriter createTestPdfWriter​(java.lang.String filename)
                                             throws java.io.IOException
        Creates a PdfWriter optimized for tests.
        Parameters:
        filename - File to write to when necessary.
        Returns:
        PdfWriter to be used in tests.
        Throws:
        java.io.FileNotFoundException - if the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason.
        java.io.IOException
      • createTestPdfWriter

        public static PdfWriter createTestPdfWriter​(java.lang.String filename,
                                                    WriterProperties properties)
                                             throws java.io.IOException
        Creates a PdfWriter optimized for tests.
        Parameters:
        filename - File to write to when necessary.
        properties - WriterProperties to use.
        Returns:
        PdfWriter to be used in tests.
        Throws:
        java.io.FileNotFoundException - if the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason.
        java.io.IOException
      • createOutputReader

        public static PdfReader createOutputReader​(java.lang.String filename)
                                            throws java.io.IOException
        Creates a PdfReader from recently created data or from data read from disk.
        Parameters:
        filename - File to read the data from when necessary.
        Returns:
        PdfReader to be used in tests.
        Throws:
        java.io.IOException - on error
      • createOutputReader

        public static PdfReader createOutputReader​(java.lang.String filename,
                                                   ReaderProperties properties)
                                            throws java.io.IOException
        Creates a PdfReader from recently created data or from data read from disk.
        Parameters:
        filename - File to read the data from when necessary.
        properties - ReaderProperties to use.
        Returns:
        PdfReader to be used in tests.
        Throws:
        java.io.IOException - on error
      • cleanup

        public static void cleanup​(java.lang.String path)
        Cleans up the memory occupied by the tests.
        Parameters:
        path - Path to clean up memory for.
      • compareByCatalog

        public CompareTool.CompareResult compareByCatalog​(PdfDocument outDocument,
                                                          PdfDocument cmpDocument)
        Compares two PDF documents by content, starting from the Catalog dictionary and then recursively comparing the corresponding objects referenced from it. You can roughly picture it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.

        The main difference between this method and the compareByContent(String, String, String, String) methods is the return value. This method returns a CompareTool.CompareResult instance, which can be inspected in code, whereas the compareByContent methods return only a String report when differences are found, which can only be printed. Also keep in mind that this method does not perform a visual comparison of the documents.

        For an explanation of outDoc and cmpDoc, see the last paragraph of the CompareTool class description.

        Parameters:
        outDocument - a PdfDocument corresponding to the output file, which is to be compared with cmp-file.
        cmpDocument - a PdfDocument corresponding to the cmp-file, which is to be compared with output file.
        Returns:
        the report on comparison of two files in the form of the custom class CompareTool.CompareResult instance.
        See Also:
        CompareTool.CompareResult
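
        The depth-first traversal of two trees mentioned in the compareByCatalog description can be pictured with a small, library-independent sketch. Nested maps stand in for PDF dictionaries here; the class and method names are hypothetical, and unlike real PDF object graphs the sketch assumes the structures contain no cycles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

final class TreeCompareSketch {
    /** Collects the paths at which the out tree differs from the cmp tree. */
    public static List<String> diff(Object out, Object cmp) {
        List<String> differences = new ArrayList<>();
        walk("", out, cmp, differences);
        return differences;
    }

    private static void walk(String path, Object out, Object cmp, List<String> differences) {
        if (out instanceof Map && cmp instanceof Map) {
            Map<?, ?> outMap = (Map<?, ?>) out;
            Map<?, ?> cmpMap = (Map<?, ?>) cmp;
            // Visit the union of keys in sorted order so the report is deterministic.
            TreeSet<String> keys = new TreeSet<>();
            for (Object key : outMap.keySet()) {
                keys.add(String.valueOf(key));
            }
            for (Object key : cmpMap.keySet()) {
                keys.add(String.valueOf(key));
            }
            for (String key : keys) {
                // Descend into the child before moving to the next sibling (depth-first).
                walk(path + "/" + key, outMap.get(key), cmpMap.get(key), differences);
            }
        } else if (!Objects.equals(out, cmp)) {
            differences.add(path + ": expected " + cmp + ", but was " + out);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> cmp = Map.of("Type", "Catalog", "Pages", Map.of("Count", 2));
        Map<String, Object> out = Map.of("Type", "Catalog", "Pages", Map.of("Count", 3));
        System.out.println(diff(out, cmp)); // prints [/Pages/Count: expected 2, but was 3]
    }
}
```

        Returning a list of differing paths, rather than a single printed string, mirrors the idea that a structured result can be inspected in code.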
      • disableCachedPagesComparison

        public CompareTool disableCachedPagesComparison()
        Disables the default page comparison logic. This option makes sense only for the compareByCatalog(PdfDocument, PdfDocument) method.

        By default, pages are treated as special objects: when a page is encountered during comparison, it is not checked as an object; instead, only its page number is checked to be the same in both documents. This behaviour is intended for the compareByContent(java.lang.String, java.lang.String, java.lang.String) set of methods, which compare documents page by page. There is no need to compare page contents again when a page is encountered during comparison, because those contents either have already been compared or will be.

        However, if you use compareByCatalog(com.itextpdf.kernel.pdf.PdfDocument, com.itextpdf.kernel.pdf.PdfDocument) with the default page comparison behaviour, page contents are not checked at all: every time a reference to a page dictionary is encountered, only the page numbers are compared. In effect, the comparison covers all of the document's catalog entries except /Pages (strictly speaking, the page tree structures are compared, but the pages themselves are not).

        Returns:
        this CompareTool instance.
      • setCompareByContentErrorsLimit

        public CompareTool setCompareByContentErrorsLimit​(int compareByContentMaxErrorCount)
        Sets the maximum number of errors that will be returned as the result of the comparison.
        Parameters:
        compareByContentMaxErrorCount - the maximum number of errors.
        Returns:
        this CompareTool instance.
      • setGenerateCompareByContentXmlReport

        public CompareTool setGenerateCompareByContentXmlReport​(boolean generateCompareByContentXmlReport)
        Enables or disables generation of the comparison report as an XML document.

        IMPORTANT NOTE: this flag affects only the comparison performed by compareByContent methods!

        Parameters:
        generateCompareByContentXmlReport - true to enable XML report generation, false to disable it.
        Returns:
        this CompareTool instance.
      • setEventCountingMetaInfo

        public void setEventCountingMetaInfo​(IMetaInfo metaInfo)
        Sets the IMetaInfo that will be used when creating both the read and the written documents.
        Parameters:
        metaInfo - meta info to set
      • enableEncryptionCompare

        public CompareTool enableEncryptionCompare()
        Enables the comparison of the encryption properties of the documents. Encryption properties comparison results are returned along with all other comparison results.

        IMPORTANT NOTE: this flag affects only the comparison performed by compareByContent methods! compareByCatalog(PdfDocument, PdfDocument) doesn't compare encryption properties because encryption properties aren't part of the document's Catalog.

        Returns:
        this CompareTool instance.
      • enableEncryptionCompare

        public CompareTool enableEncryptionCompare​(boolean kdfSaltCompareEnabled)
        Enables the comparison of the encryption properties of the documents. Encryption properties comparison results are returned along with all other comparison results.

        IMPORTANT NOTE: this flag affects only the comparison performed by compareByContent methods! compareByCatalog(PdfDocument, PdfDocument) doesn't compare encryption properties because encryption properties aren't part of the document's Catalog.

        Parameters:
        kdfSaltCompareEnabled - true if the PdfName.KDFSalt entry must be compared, false otherwise
        Returns:
        this CompareTool instance.
      • getOutReaderProperties

        public ReaderProperties getOutReaderProperties()
        Gets ReaderProperties to be passed later to the PdfReader of the output document.

        Documents for comparison are opened in reader mode. This method is intended to alter ReaderProperties which are used to open the output document. This is particularly useful for comparison of encrypted documents.

        For an explanation of outDoc and cmpDoc, see the last paragraph of the CompareTool class description.

        Returns:
        ReaderProperties instance to be passed later to the PdfReader of the output document.
      • getCmpReaderProperties

        public ReaderProperties getCmpReaderProperties()
        Gets ReaderProperties to be passed later to the PdfReader of the cmp document.

        Documents for comparison are opened in reader mode. This method is intended to alter ReaderProperties which are used to open the cmp document. This is particularly useful for comparison of encrypted documents.

        For an explanation of outDoc and cmpDoc, see the last paragraph of the CompareTool class description.

        Returns:
        ReaderProperties instance to be passed later to the PdfReader of the cmp document.
      • compareVisually

        public java.lang.String compareVisually​(java.lang.String outPdf,
                                                java.lang.String cmpPdf,
                                                java.lang.String outPath,
                                                java.lang.String differenceImagePrefix)
                                         throws java.lang.InterruptedException,
                                                java.io.IOException
        Compares two documents visually, using two external tools: Ghostscript and ImageMagick. For the configuration required for visual comparison, see the CompareTool class description.

        Note that this method uses the ImageMagickHelper and GhostscriptHelper classes and therefore may create temporary files and directories.

        During comparison, an image file is created for every page of the two documents in the folder specified by the outPath parameter. The page images are then compared, and for each page that differs, another image file is created with the differences marked on it.

        Parameters:
        outPdf - the absolute path to the output file, which is to be compared to cmp-file.
        cmpPdf - the absolute path to the cmp-file, which is to be compared to output file.
        outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
        differenceImagePrefix - file name prefix for the image files with marked differences, if there are any.
        Returns:
        string containing list of the pages that are visually different, or null if there are no visual differences.
        Throws:
        java.lang.InterruptedException - if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
        java.io.IOException - if any of the input files are missing or any of the auxiliary files created during the comparison process could not be created.
      • compareVisually

        public java.lang.String compareVisually​(java.lang.String outPdf,
                                                java.lang.String cmpPdf,
                                                java.lang.String outPath,
                                                java.lang.String differenceImagePrefix,
                                                java.util.Map<java.lang.Integer,​java.util.List<Rectangle>> ignoredAreas)
                                         throws java.lang.InterruptedException,
                                                java.io.IOException
        Compares two documents visually, using two external tools: Ghostscript and ImageMagick. For the configuration required for visual comparison, see the CompareTool class description.

        Note that this method uses the ImageMagickHelper and GhostscriptHelper classes and therefore may create temporary files and directories.

        During comparison, an image file is created for every page of the two documents in the folder specified by the outPath parameter. The page images are then compared, and for each page that differs, another image file is created with the differences marked on it.

        Certain areas of the document pages can be ignored during visual comparison. This is useful, for example, when the documents should be identical except for a page area that contains a date. In this case, new PDF documents are created in the folder specified by outPath, with black rectangles drawn over the specified ignored areas, and visual comparison is performed on these new documents.

        Parameters:
        outPdf - the absolute path to the output file, which is to be compared to cmp-file.
        cmpPdf - the absolute path to the cmp-file, which is to be compared to output file.
        outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
        differenceImagePrefix - file name prefix for the image files with marked differences, if there are any.
        ignoredAreas - a map with one-based page numbers as keys and lists of ignored rectangles as values.
        Returns:
        string containing list of the pages that are visually different, or null if there are no visual differences.
        Throws:
        java.lang.InterruptedException - if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
        java.io.IOException - if any of the input files are missing or any of the auxiliary files created during the comparison process could not be created.
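
        The shape of the ignoredAreas argument, a map from one-based page numbers to lists of rectangles, can be sketched as follows. To keep the example self-contained, a hypothetical Area class stands in for the Rectangle type named in the signature; the coordinates and the helper name are illustrative assumptions only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class IgnoredAreasSketch {
    /** Hypothetical stand-in for the Rectangle type, used only to keep this sketch self-contained. */
    static final class Area {
        final float x;
        final float y;
        final float width;
        final float height;

        Area(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Builds a map keyed by one-based page number, each value listing the areas to ignore on that page. */
    public static Map<Integer, List<Area>> buildIgnoredAreas() {
        Map<Integer, List<Area>> ignoredAreas = new TreeMap<>();
        // Page numbers are one-based: key 1 is the first page.
        ignoredAreas.computeIfAbsent(1, page -> new ArrayList<>()).add(new Area(400, 780, 150, 20));
        // A page may list several ignored areas.
        ignoredAreas.computeIfAbsent(3, page -> new ArrayList<>()).add(new Area(50, 50, 200, 30));
        ignoredAreas.computeIfAbsent(3, page -> new ArrayList<>()).add(new Area(400, 780, 150, 20));
        return ignoredAreas;
    }

    public static void main(String[] args) {
        buildIgnoredAreas().forEach((page, areas) ->
                System.out.println("page " + page + ": " + areas.size() + " ignored area(s)"));
    }
}
```

        Pages that are absent from the map simply have no ignored areas in this sketch.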
      • compareByContent

        public java.lang.String compareByContent​(java.lang.String outPdf,
                                                 java.lang.String cmpPdf,
                                                 java.lang.String outPath)
                                          throws java.lang.InterruptedException,
                                                 java.io.IOException
        Compares two PDF documents by content, starting from the page dictionaries and then recursively comparing the corresponding objects referenced from them. You can roughly picture it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.

        When the comparison by content is finished, visual comparison starts automatically if any differences were found. For this overload, the differenceImagePrefix value is generated using the diff_%outPdfFileName%_ format.

        For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.

        Parameters:
        outPdf - the absolute path to the output file, which is to be compared to cmp-file.
        cmpPdf - the absolute path to the cmp-file, which is to be compared to output file.
        outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
        Returns:
        string containing text report on the encountered content differences and also list of the pages that are visually different, or null if there are no content and therefore no visual differences.
        Throws:
        java.lang.InterruptedException - if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
        java.io.IOException - if any of the input files are missing or any of the auxiliary files created during the comparison process could not be created.
        See Also:
        compareVisually(String, String, String, String)
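
        As an illustration of the diff_%outPdfFileName%_ format, the default prefix could be derived from the output path roughly like this. The helper name is hypothetical, and whether the file extension is kept in %outPdfFileName% is an assumption of this sketch.

```java
import java.nio.file.Paths;

final class DiffPrefixSketch {
    /** Derives a prefix in the diff_%outPdfFileName%_ form from the path of the output PDF. */
    public static String defaultDifferenceImagePrefix(String outPdf) {
        // Assumption: %outPdfFileName% is the last path element, extension included.
        String outPdfFileName = Paths.get(outPdf).getFileName().toString();
        return "diff_" + outPdfFileName + "_";
    }

    public static void main(String[] args) {
        System.out.println(defaultDifferenceImagePrefix("out/report.pdf"));
    }
}
```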
      • compareByContent

        public java.lang.String compareByContent​(java.lang.String outPdf,
                                                 java.lang.String cmpPdf,
                                                 java.lang.String outPath,
                                                 java.lang.String differenceImagePrefix)
                                          throws java.lang.InterruptedException,
                                                 java.io.IOException
        Compares two PDF documents by content, starting from the page dictionaries and then recursively comparing the corresponding objects referenced from them. You can roughly picture it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.

        When the comparison by content is finished, visual comparison starts automatically if any differences were found.

        For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.

        Parameters:
        outPdf - the absolute path to the output file, which is to be compared to cmp-file.
        cmpPdf - the absolute path to the cmp-file, which is to be compared to output file.
        outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
        differenceImagePrefix - file name prefix for image files with marked visual differences if there are any; if it's set to null the prefix defaults to diff_%outPdfFileName%_ format.
        Returns:
        string containing text report on the encountered content differences and also list of the pages that are visually different, or null if there are no content and therefore no visual differences.
        Throws:
        java.lang.InterruptedException - if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
        java.io.IOException - if any of the input files are missing or any of the auxiliary files created during the comparison process could not be created.
        See Also:
        compareVisually(String, String, String, String)
      • compareByContent

        public java.lang.String compareByContent​(java.lang.String outPdf,
                                                 java.lang.String cmpPdf,
                                                 java.lang.String outPath,
                                                 java.lang.String differenceImagePrefix,
                                                 byte[] outPass,
                                                 byte[] cmpPass)
                                          throws java.lang.InterruptedException,
                                                 java.io.IOException
        This overload compares two encrypted PDF documents. The document passwords are passed in the outPass and cmpPass parameters.

        Compares two PDF documents by content, starting from the page dictionaries and then recursively comparing the corresponding objects referenced from them. You can roughly picture it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.

        When the comparison by content is finished, visual comparison starts automatically if any differences were found. For more information, see compareVisually(String, String, String, String).

        For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.

        Parameters:
        outPdf - the absolute path to the output file, which is to be compared to cmp-file.
        cmpPdf - the absolute path to the cmp-file, which is to be compared to output file.
        outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
        differenceImagePrefix - file name prefix for image files with marked visual differences if there are any; if it's set to null the prefix defaults to diff_%outPdfFileName%_ format.
        outPass - password for the encrypted document specified by the outPdf absolute path.
        cmpPass - password for the encrypted document specified by the cmpPdf absolute path.
        Returns:
        string containing text report on the encountered content differences and also list of the pages that are visually different, or null if there are no content and therefore no visual differences.
        Throws:
        java.lang.InterruptedException - if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
        java.io.IOException - if any of the input files are missing or any of the auxiliary files created during the comparison process could not be created.
        See Also:
        compareVisually(String, String, String, String)
      • compareByContent

        public java.lang.String compareByContent​(java.lang.String outPdf,
                                                 java.lang.String cmpPdf,
                                                 java.lang.String outPath,
                                                 java.lang.String differenceImagePrefix,
                                                 java.util.Map<java.lang.Integer,​java.util.List<Rectangle>> ignoredAreas)
                                          throws java.lang.InterruptedException,
                                                 java.io.IOException
        Compares two PDF documents by content, starting from the page dictionaries and then recursively comparing the corresponding objects referenced from them. You can roughly picture it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.

        When the comparison by content is finished, visual comparison starts automatically if any differences were found.

        For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.

        Parameters:
        outPdf - the absolute path to the output file, which is to be compared to cmp-file.
        cmpPdf - the absolute path to the cmp-file, which is to be compared to output file.
        outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
        differenceImagePrefix - file name prefix for image files with marked visual differences if there are any; if it's set to null the prefix defaults to diff_%outPdfFileName%_ format.
        ignoredAreas - a map with one-based page numbers as keys and lists of ignored rectangles as values.
        Returns:
        string containing text report on the encountered content differences and also list of the pages that are visually different, or null if there are no content and therefore no visual differences.
        Throws:
        java.lang.InterruptedException - if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
        java.io.IOException - if any of the input files are missing or any of the auxiliary files created during the comparison process could not be created.
        See Also:
        compareVisually(String, String, String, String)
      • compareByContent

        public java.lang.String compareByContent​(java.lang.String outPdf,
                                                 java.lang.String cmpPdf,
                                                 java.lang.String outPath,
                                                 java.lang.String differenceImagePrefix,
                                                 java.util.Map<java.lang.Integer,​java.util.List<Rectangle>> ignoredAreas,
                                                 byte[] outPass,
                                                 byte[] cmpPass)
                                          throws java.lang.InterruptedException,
                                                 java.io.IOException
        This overload compares two encrypted PDF documents. The document passwords are passed in the outPass and cmpPass parameters.

        Compares two PDF documents by content, starting from the page dictionaries and then recursively comparing the corresponding objects referenced from them. You can roughly picture it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.

        When the comparison by content is finished, visual comparison starts automatically if any differences were found.

        For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.

        Parameters:
        outPdf - the absolute path to the output file, which is to be compared to cmp-file.
        cmpPdf - the absolute path to the cmp-file, which is to be compared to output file.
        outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
        differenceImagePrefix - file name prefix for image files with marked visual differences if there are any; if it's set to null the prefix defaults to diff_%outPdfFileName%_ format.
        ignoredAreas - a map with one-based page numbers as keys and lists of ignored rectangles as values.
        outPass - password for the encrypted document specified by the outPdf absolute path.
        cmpPass - password for the encrypted document specified by the cmpPdf absolute path.
        Returns:
        string containing text report on the encountered content differences and also list of the pages that are visually different, or null if there are no content and therefore no visual differences.
        Throws:
        java.lang.InterruptedException - if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
        java.io.IOException - if any of the input files are missing or any of the auxiliary files created during the comparison process could not be created.
        See Also:
        compareVisually(String, String, String, String)
      • compareDictionaries

        public boolean compareDictionaries​(PdfDictionary outDict,
                                           PdfDictionary cmpDict)
        Compares two given PdfDictionaries by content. This is a "deep" comparison, which means that all nested objects are also compared by content.
        Parameters:
        outDict - dictionary to compare.
        cmpDict - dictionary to compare.
        Returns:
        true if dictionaries are equal by content, otherwise false.
      • compareDictionariesStructure

        public CompareTool.CompareResult compareDictionariesStructure​(PdfDictionary outDict,
                                                                      PdfDictionary cmpDict)
        Recursively compares the structures of two corresponding dictionaries from the out and cmp PDF documents. You can roughly picture it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.

        Both the out and the cmp PdfDictionary shall have indirect references.

        By default, page dictionaries encountered during the comparison are excluded from it and are instead compared in a special manner, by simply comparing their page numbers. This behavior can be disabled by calling disableCachedPagesComparison().

        For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.

        Parameters:
        outDict - an indirect PdfDictionary from the output file, which is to be compared to cmp-file dictionary.
        cmpDict - an indirect PdfDictionary from the cmp-file, which is to be compared to output file dictionary.
        Returns:
        CompareTool.CompareResult instance containing differences between the two dictionaries, or null if dictionaries are equal.
      • compareDictionariesStructure

        public CompareTool.CompareResult compareDictionariesStructure​(PdfDictionary outDict,
                                                                      PdfDictionary cmpDict,
                                                                      java.util.Set<PdfName> excludedKeys)
        Recursively compares structures of two corresponding dictionaries from out and cmp PDF documents. You can roughly imagine it as depth-first traversal of the two trees that represent pdf objects structure of the documents.

        Both out and cmp PdfDictionary shall have indirect references.

        By default, page dictionaries encountered during the traversal are not compared entry by entry; they are compared only by page number. This behavior can be disabled by calling disableCachedPagesComparison().

        For more details on what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.

        Parameters:
        outDict - an indirect PdfDictionary from the output file, which is to be compared to the cmp-file dictionary.
        cmpDict - an indirect PdfDictionary from the cmp-file, which is to be compared to the output file dictionary.
        excludedKeys - a Set of names that designate entries of the outDict and cmpDict dictionaries which are to be skipped during the comparison.
        Returns:
        CompareTool.CompareResult instance containing differences between the two dictionaries, or null if dictionaries are equal.
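
        The traversal described above can be pictured with a small, library-free sketch. It is not iText's implementation: nested java.util.Map values stand in for PdfDictionary objects, String keys stand in for PdfName, and the names StructureCompareSketch, compare and walk are hypothetical. The sketch walks both trees depth-first, skips the excluded keys, records the path of every differing entry, and mirrors the documented contract by returning null when nothing differs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: nested Maps stand in for PdfDictionary, String keys for PdfName.
final class StructureCompareSketch {

    // Returns the paths of all differing entries, or null if the two trees are equal.
    static List<String> compare(Map<String, Object> out, Map<String, Object> cmp,
                                Set<String> excludedKeys) {
        List<String> differences = new ArrayList<>();
        walk(out, cmp, excludedKeys, "", differences);
        return differences.isEmpty() ? null : differences;
    }

    @SuppressWarnings("unchecked")
    private static void walk(Map<String, Object> out, Map<String, Object> cmp,
                             Set<String> excludedKeys, String path,
                             List<String> differences) {
        // Visit the union of keys so entries missing on either side are reported.
        Set<String> keys = new TreeSet<>(out.keySet());
        keys.addAll(cmp.keySet());
        for (String key : keys) {
            if (excludedKeys.contains(key)) {
                continue; // plays the role of the excludedKeys parameter
            }
            String childPath = path + "/" + key;
            Object outValue = out.get(key);
            Object cmpValue = cmp.get(key);
            if (outValue instanceof Map && cmpValue instanceof Map) {
                // Depth-first: descend into the nested dictionaries before moving on.
                walk((Map<String, Object>) outValue, (Map<String, Object>) cmpValue,
                        excludedKeys, childPath, differences);
            } else if (!Objects.equals(outValue, cmpValue)) {
                // Same "Expected ..., but was ..." convention as the class description.
                differences.add(childPath + ": expected " + cmpValue + ", but was " + outValue);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> cmp = Map.of("Type", "Page", "Count", 2);
        Map<String, Object> out = Map.of("Type", "Page", "Count", 3);
        System.out.println(compare(out, cmp, Set.of()));
    }
}
```

        In this sketch "expected" refers to the cmp side and "but was" to the out side, following the reporting convention given in the class description.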
      • compareStreamsStructure

        public CompareTool.CompareResult compareStreamsStructure​(PdfStream outStream,
                                                                 PdfStream cmpStream)
        Compares the structures of two corresponding streams from the out and cmp PDF documents. Roughly, this is a depth-first traversal of the two trees that represent the PDF object structure of the documents.

        For more details on what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.

        Parameters:
        outStream - a PdfStream from the output file, which is to be compared to the cmp-file stream.
        cmpStream - a PdfStream from the cmp-file, which is to be compared to the output file stream.
        Returns:
        CompareTool.CompareResult instance containing differences between the two streams, or null if streams are equal.
      • compareStreams

        public boolean compareStreams​(PdfStream outStream,
                                      PdfStream cmpStream)
        Simple method that compares two given PdfStreams by content. This is a "deep" comparison, which means that all nested objects are also compared by content.
        Parameters:
        outStream - stream to compare.
        cmpStream - stream to compare.
        Returns:
        true if streams are equal by content, otherwise false.
      • compareArrays

        public boolean compareArrays​(PdfArray outArray,
                                     PdfArray cmpArray)
        Simple method that compares two given PdfArrays by content. This is a "deep" comparison, which means that all nested objects are also compared by content.
        Parameters:
        outArray - array to compare.
        cmpArray - array to compare.
        Returns:
        true if arrays are equal by content, otherwise false.
      • compareNames

        public boolean compareNames​(PdfName outName,
                                    PdfName cmpName)
        Simple method that compares two given PdfNames.
        Parameters:
        outName - name to compare.
        cmpName - name to compare.
        Returns:
        true if names are equal, otherwise false.
      • compareNumbers

        public boolean compareNumbers​(PdfNumber outNumber,
                                      PdfNumber cmpNumber)
        Simple method that compares two given PdfNumbers.
        Parameters:
        outNumber - number to compare.
        cmpNumber - number to compare.
        Returns:
        true if numbers are equal, otherwise false.
      • compareStrings

        public boolean compareStrings​(PdfString outString,
                                      PdfString cmpString)
        Simple method that compares two given PdfStrings.
        Parameters:
        outString - string to compare.
        cmpString - string to compare.
        Returns:
        true if strings are equal, otherwise false.
      • compareBooleans

        public boolean compareBooleans​(PdfBoolean outBoolean,
                                       PdfBoolean cmpBoolean)
        Simple method that compares two given PdfBooleans.
        Parameters:
        outBoolean - boolean to compare.
        cmpBoolean - boolean to compare.
        Returns:
        true if booleans are equal, otherwise false.
      • compareXmp

        public java.lang.String compareXmp​(java.lang.String outPdf,
                                           java.lang.String cmpPdf)
        Compares the XMP metadata of the two given PDF documents.
        Parameters:
        outPdf - the absolute path to the output file, whose XMP metadata is to be compared to that of the cmp-file.
        cmpPdf - the absolute path to the cmp-file, whose XMP metadata is to be compared to that of the output file.
        Returns:
        text report on the XMP differences, or null if there are no differences.
      • compareXmp

        public java.lang.String compareXmp​(java.lang.String outPdf,
                                           java.lang.String cmpPdf,
                                           boolean ignoreDateAndProducerProperties)
        Compares the XMP metadata of the two given PDF documents.
        Parameters:
        outPdf - the absolute path to the output file, whose XMP metadata is to be compared to that of the cmp-file.
        cmpPdf - the absolute path to the cmp-file, whose XMP metadata is to be compared to that of the output file.
        ignoreDateAndProducerProperties - true to ignore differences in the date or producer XMP metadata properties.
        Returns:
        text report on the XMP differences, or null if there are no differences.
      • compareXmls

        public boolean compareXmls​(byte[] xml1,
                                   byte[] xml2)
                            throws javax.xml.parsers.ParserConfigurationException,
                                   org.xml.sax.SAXException,
                                   java.io.IOException
        Utility method that provides a simple comparison of two XML files stored in byte arrays.
        Parameters:
        xml1 - first XML file data to compare.
        xml2 - second XML file data to compare.
        Returns:
        true if the XML structures are identical, false otherwise.
        Throws:
        javax.xml.parsers.ParserConfigurationException - if an XML DocumentBuilder that satisfies the requested configuration cannot be created.
        org.xml.sax.SAXException - if any XML parse errors occur.
        java.io.IOException - if any I/O errors occur while reading the XML files.
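
        A structural XML comparison of this kind can be sketched with the JDK's own DOM API. The block below is an illustration under stated assumptions, not the code behind compareXmls: the class name XmlCompareSketch, the method name sameXml and the chosen parser settings are assumptions of this sketch. It parses both byte arrays, normalizes the documents and compares them with Node.isEqualNode, so attribute order does not matter while element names, attribute values and text do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical sketch of a structural XML comparison using only the JDK.
final class XmlCompareSketch {

    static boolean sameXml(byte[] xml1, byte[] xml2)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setCoalescing(true);       // merge CDATA sections into adjacent text
        factory.setIgnoringComments(true); // leave comments out of the DOM
        DocumentBuilder builder = factory.newDocumentBuilder();

        Document first = builder.parse(new ByteArrayInputStream(xml1));
        first.normalizeDocument();
        Document second = builder.parse(new ByteArrayInputStream(xml2));
        second.normalizeDocument();

        // Deep comparison of the two DOM trees.
        return first.isEqualNode(second);
    }

    public static void main(String[] args) throws Exception {
        byte[] a = "<a x=\"1\" y=\"2\"><b>t</b></a>".getBytes(StandardCharsets.UTF_8);
        byte[] b = "<a y=\"2\" x=\"1\"><b>t</b></a>".getBytes(StandardCharsets.UTF_8);
        System.out.println(sameXml(a, b));
    }
}
```

        In this sketch, whitespace between elements is kept as text nodes, so two files that differ only in indentation would be reported as different.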
      • compareXmls

        public boolean compareXmls​(java.lang.String outXmlFile,
                                   java.lang.String cmpXmlFile)
                            throws javax.xml.parsers.ParserConfigurationException,
                                   org.xml.sax.SAXException,
                                   java.io.IOException
        Utility method that provides a simple comparison of two XML files.
        Parameters:
        outXmlFile - absolute path to the out XML file to compare.
        cmpXmlFile - absolute path to the cmp XML file to compare.
        Returns:
        true if the XML structures are identical, false otherwise.
        Throws:
        javax.xml.parsers.ParserConfigurationException - if an XML DocumentBuilder that satisfies the requested configuration cannot be created.
        org.xml.sax.SAXException - if any XML parse errors occur.
        java.io.IOException - if any I/O errors occur while reading the XML files.
      • compareDocumentInfo

        public java.lang.String compareDocumentInfo​(java.lang.String outPdf,
                                                    java.lang.String cmpPdf,
                                                    byte[] outPass,
                                                    byte[] cmpPass)
                                             throws java.io.IOException
        Compares the document info dictionaries of two PDF documents.

        This method overload is used to compare two encrypted PDF documents. The document passwords are passed in the outPass and cmpPass parameters.

        Parameters:
        outPdf - the absolute path to the output file, whose info is to be compared to the cmp-file info.
        cmpPdf - the absolute path to the cmp-file, whose info is to be compared to the output file info.
        outPass - password for the encrypted document specified by the outPdf absolute path.
        cmpPass - password for the encrypted document specified by the cmpPdf absolute path.
        Returns:
        text report on the differences in the document info.
        Throws:
        java.io.IOException - if a PDF reader cannot be created because of I/O errors.
      • compareDocumentInfo

        public java.lang.String compareDocumentInfo​(java.lang.String outPdf,
                                                    java.lang.String cmpPdf)
                                             throws java.io.IOException
        Compares the document info dictionaries of two PDF documents.
        Parameters:
        outPdf - the absolute path to the output file, whose info is to be compared to the cmp-file info.
        cmpPdf - the absolute path to the cmp-file, whose info is to be compared to the output file info.
        Returns:
        text report on the differences in the document info.
        Throws:
        java.io.IOException - if a PDF reader cannot be created because of I/O errors.
      • compareLinkAnnotations

        public java.lang.String compareLinkAnnotations​(java.lang.String outPdf,
                                                       java.lang.String cmpPdf)
                                                throws java.io.IOException
        Checks whether two documents have identical link annotations on corresponding pages.
        Parameters:
        outPdf - the absolute path to the output file, whose links are to be compared to the cmp-file links.
        cmpPdf - the absolute path to the cmp-file, whose links are to be compared to the output file links.
        Returns:
        text report on the differences in the document links.
        Throws:
        java.io.IOException - if a PDF reader cannot be created because of I/O errors.
      • compareTagStructures

        public java.lang.String compareTagStructures​(java.lang.String outPdf,
                                                     java.lang.String cmpPdf)
                                              throws java.io.IOException,
                                                     javax.xml.parsers.ParserConfigurationException,
                                                     org.xml.sax.SAXException
        Compares the tag structures of the two PDF documents.

        This method creates XML files in the same folder as the outPdf file. These files contain the documents' tag structures converted to XML, and they are then compared for equality.

        Parameters:
        outPdf - the absolute path to the output file, whose tags are to be compared to the cmp-file tags.
        cmpPdf - the absolute path to the cmp-file, whose tags are to be compared to the output file tags.
        Returns:
        text report on the differences in the document tags.
        Throws:
        java.io.IOException - if any of the input files is missing, or if any of the auxiliary files used during the comparison cannot be created.
        javax.xml.parsers.ParserConfigurationException - if an XML DocumentBuilder that satisfies the requested configuration cannot be created.
        org.xml.sax.SAXException - if any XML parse errors occur.
      • convertDocInfoToStrings

        protected java.lang.String[] convertDocInfoToStrings​(PdfDocumentInfo info)
        Converts document info into a string array, which can later be used to compare PdfDocumentInfo instances. The default implementation retrieves the title, author, subject, keywords and producer.

        Parameters:
        info - an instance of PdfDocumentInfo to be converted.
        Returns:
        String array with all the document info the tester is interested in.
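
        The idea of flattening document info into a fixed-order array and comparing it element by element can be sketched without iText. Everything in the block below is hypothetical: a Map<String, String> stands in for PdfDocumentInfo, the names DocInfoSketch, toStrings and firstDifference are inventions of this sketch, and both the empty-string default for missing fields and the null return for "no difference" are choices of the sketch rather than documented behavior.

```java
import java.util.Map;

// Hypothetical sketch: a Map stands in for PdfDocumentInfo.
final class DocInfoSketch {

    // Fixed field order, so that index i always refers to the same property.
    private static final String[] FIELDS =
            {"Title", "Author", "Subject", "Keywords", "Producer"};

    // Flattens the info into an array; missing fields become empty strings.
    static String[] toStrings(Map<String, String> info) {
        String[] result = new String[FIELDS.length];
        for (int i = 0; i < FIELDS.length; i++) {
            String value = info.get(FIELDS[i]);
            result[i] = value == null ? "" : value.trim();
        }
        return result;
    }

    // Returns a short report for the first differing field, or null if all fields match.
    static String firstDifference(String[] out, String[] cmp) {
        for (int i = 0; i < FIELDS.length; i++) {
            if (!out[i].equals(cmp[i])) {
                return FIELDS[i] + ": expected \"" + cmp[i] + "\", but was \"" + out[i] + "\"";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] cmp = toStrings(Map.of("Title", "Report", "Author", "Ann"));
        String[] out = toStrings(Map.of("Title", "Report", "Author", "Bob"));
        System.out.println(firstDifference(out, cmp));
    }
}
```

        A subclass that cares about additional properties would, by analogy, return a longer array from its own conversion method.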
      • convertProducerLine

        java.lang.String convertProducerLine​(java.lang.String producer)
      • init

        private void init​(java.lang.String outPdf,
                          java.lang.String cmpPdf)
      • setPassword

        private void setPassword​(byte[] outPass,
                                 byte[] cmpPass)
      • compareVisually

        private java.lang.String compareVisually​(java.lang.String outPath,
                                                 java.lang.String differenceImagePrefix,
                                                 java.util.Map<java.lang.Integer,​java.util.List<Rectangle>> ignoredAreas)
                                          throws java.lang.InterruptedException,
                                                 java.io.IOException
        Throws:
        java.lang.InterruptedException
        java.io.IOException
      • compareVisually

        private java.lang.String compareVisually​(java.lang.String outPath,
                                                 java.lang.String differenceImagePrefix,
                                                 java.util.Map<java.lang.Integer,​java.util.List<Rectangle>> ignoredAreas,
                                                 java.util.List<java.lang.Integer> equalPages)
                                          throws java.io.IOException,
                                                 java.lang.InterruptedException
        Throws:
        java.io.IOException
        java.lang.InterruptedException
      • compareImagesOfPdfs

        private java.lang.String compareImagesOfPdfs​(java.lang.String outPath,
                                                     java.lang.String differenceImagePrefix,
                                                     java.util.List<java.lang.Integer> equalPages)
                                              throws java.io.IOException,
                                                     java.lang.InterruptedException
        Throws:
        java.io.IOException
        java.lang.InterruptedException
      • listDiffPagesAsString

        private java.lang.String listDiffPagesAsString​(java.util.List<java.lang.Integer> diffPages)
      • createIgnoredAreasPdfs

        private void createIgnoredAreasPdfs​(java.lang.String outPath,
                                            java.util.Map<java.lang.Integer,​java.util.List<Rectangle>> ignoredAreas)
                                     throws java.io.IOException
        Throws:
        java.io.IOException
      • prepareOutputDirs

        private void prepareOutputDirs​(java.lang.String outPath,
                                       java.lang.String differenceImagePrefix)
      • printOutCmpDirectories

        private void printOutCmpDirectories()
      • compareByContent

        private java.lang.String compareByContent​(java.lang.String outPath,
                                                  java.lang.String differenceImagePrefix,
                                                  java.util.Map<java.lang.Integer,​java.util.List<Rectangle>> ignoredAreas)
                                           throws java.lang.InterruptedException,
                                                  java.io.IOException
        Throws:
        java.lang.InterruptedException
        java.io.IOException
      • writeOnDisk

        private static void writeOnDisk​(java.lang.String filename)
                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • writeOnDiskIfNotExists

        private static void writeOnDiskIfNotExists​(java.lang.String filename)
                                            throws java.io.IOException
        Throws:
        java.io.IOException
      • compareVisuallyAndCombineReports

        private java.lang.String compareVisuallyAndCombineReports​(java.lang.String compareByFailContentReason,
                                                                  java.lang.String outPath,
                                                                  java.lang.String differenceImagePrefix,
                                                                  java.util.Map<java.lang.Integer,​java.util.List<Rectangle>> ignoredAreas,
                                                                  java.util.List<java.lang.Integer> equalPages)
                                                           throws java.io.IOException,
                                                                  java.lang.InterruptedException
        Throws:
        java.io.IOException
        java.lang.InterruptedException
      • compareStreams

        private boolean compareStreams​(java.io.InputStream is1,
                                       java.io.InputStream is2)
                                throws java.io.IOException
        Throws:
        java.io.IOException
      • compareObjects

        protected boolean compareObjects​(PdfObject outObj,
                                         PdfObject cmpObj,
                                         ObjectPath currentPath,
                                         CompareTool.CompareResult compareResult)
        Compares two PDF objects.
        Parameters:
        outObj - the object from the output file, which is to be compared with the cmp object.
        cmpObj - the object from the cmp-file, which is to be compared with the out object.
        currentPath - the ObjectPath of the current objects.
        compareResult - the CompareTool.CompareResult that collects the results of the comparison of the two documents.
        Returns:
        true if objects are equal, false otherwise.
      • findBytesDifference

        private int findBytesDifference​(byte[] outStreamBytes,
                                        byte[] cmpStreamBytes,
                                        java.lang.StringBuilder errorMessage)
        Returns:
        first difference offset
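
        Locating a first difference offset in two byte arrays can be sketched with java.util.Arrays.mismatch (available since Java 9). This is an illustration only: the class ByteDiffSketch and its methods are hypothetical, the wording of the message is an assumption, and returning -1 for identical arrays is the convention of Arrays.mismatch, not necessarily that of findBytesDifference.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of locating the first differing byte in two arrays.
final class ByteDiffSketch {

    // Returns the offset of the first differing byte, or -1 if the arrays are identical.
    // If one array is a proper prefix of the other, the offset is the shorter length.
    static int firstDifferenceOffset(byte[] outBytes, byte[] cmpBytes) {
        return Arrays.mismatch(outBytes, cmpBytes);
    }

    // Builds a short message for the first difference, or returns null if there is none.
    static String describe(byte[] outBytes, byte[] cmpBytes) {
        int offset = firstDifferenceOffset(outBytes, cmpBytes);
        if (offset < 0) {
            return null;
        }
        String expected = offset < cmpBytes.length
                ? String.format("0x%02X", cmpBytes[offset]) : "end of data";
        String actual = offset < outBytes.length
                ? String.format("0x%02X", outBytes[offset]) : "end of data";
        return "First difference at offset " + offset
                + ": expected " + expected + ", but was " + actual;
    }

    public static void main(String[] args) {
        byte[] cmp = "BT /F1 12 Tf ET".getBytes(StandardCharsets.US_ASCII);
        byte[] out = "BT /F2 12 Tf ET".getBytes(StandardCharsets.US_ASCII);
        System.out.println(describe(out, cmp));
    }
}
```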
      • findStringDifference

        private int findStringDifference​(java.lang.String outString,
                                         java.lang.String cmpString,
                                         java.lang.StringBuilder errorMessage)
      • convertPdfStringToBytes

        private byte[] convertPdfStringToBytes​(PdfString pdfString)
      • getExplicitDestinationPageNum

        private int getExplicitDestinationPageNum​(PdfArray explicitDest)