Package com.itextpdf.kernel.utils
Class TaggedPdfReaderTool
java.lang.Object
com.itextpdf.kernel.utils.TaggedPdfReaderTool
Converts a tagged PDF document into an XML file.
-
Nested Class Summary
Nested Classes
-
Field Summary
Fields
Modifier and Type                                     Field         Description
protected PdfDocument                                 document
protected OutputStreamWriter                          out
protected Map<PdfDictionary, Map<Integer, String>>    parsedTags
protected String                                      rootTag
-
Constructor Summary
Constructors
Constructor                                  Description
TaggedPdfReaderTool(PdfDocument document)    Constructs a TaggedPdfReaderTool via a given PdfDocument.
-
Method Summary
Modifier and Type          Method                                           Description
void                       convertToXml(OutputStream os)                    Converts the current tag structure into an XML file with default encoding (UTF-8).
void                       convertToXml(OutputStream os, String charset)    Converts the current tag structure into an XML file with the provided encoding.
protected static String    escapeXML(String s, boolean onlyASCII)           Escapes a string with the appropriate XML codes. NOTE: copied from the iText 5 XMLUtils class.
protected static String    fixTagName(String tag)
protected void             inspectAttributes
protected void             inspectKid(IStructureNode kid)
protected void             inspectKids(List<IStructureNode> kids)
static boolean             isValidCharacterValue(int c)                     Checks if a character value should be escaped/unescaped.
protected void             parseTag
TaggedPdfReaderTool        setRootTag(String rootTagName)                   Sets the name of the root tag of the resultant XML file.
-
Field Details
-
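document
protected PdfDocument document
-
out
protected OutputStreamWriter out
-
rootTag
protected String rootTag
-
parsedTags
protected Map<PdfDictionary, Map<Integer, String>> parsedTags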
-
-
Constructor Details
-
TaggedPdfReaderTool
public TaggedPdfReaderTool(PdfDocument document)
Constructs a TaggedPdfReaderTool via a given PdfDocument.
Parameters:
document - the document to read tag structure from
-
-
Method Details
-
isValidCharacterValue
public static boolean isValidCharacterValue(int c)
Checks if a character value should be escaped/unescaped.
Parameters:
c - a character value
Returns:
true if it's OK to escape or unescape this value
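The description above does not say which values pass the check. As a hedged illustration only, the sketch below assumes the check follows the XML 1.0 Char production (tab, line feed, carriage return, and the ranges U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF). The class name XmlCharCheckSketch is hypothetical, and this is not necessarily the library's code.

```java
/** Hypothetical stand-alone sketch; not the iText implementation. */
class XmlCharCheckSketch {

    /** Returns true if the code point is allowed by the XML 1.0 Char production. */
    static boolean isValidCharacterValue(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    public static void main(String[] args) {
        // A printable ASCII letter, the NUL control character, and a
        // supplementary-plane code point are checked in turn.
        System.out.println(isValidCharacterValue('A'));
        System.out.println(isValidCharacterValue(0x0));
        System.out.println(isValidCharacterValue(0x1F600));
    }
}
```

Under this assumption, control characters other than tab, line feed, and carriage return are rejected, as are surrogate values and the non-characters U+FFFE and U+FFFF.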
-
convertToXml
public void convertToXml(OutputStream os) throws IOException
Converts the current tag structure into an XML file with default encoding (UTF-8).
Parameters:
os - the output stream to save the XML file to
Throws:
IOException - in case of any I/O error
-
convertToXml
public void convertToXml(OutputStream os, String charset) throws IOException
Converts the current tag structure into an XML file with the provided encoding.
Parameters:
os - the output stream to save the XML file to
charset - the charset of the resultant XML file
Throws:
IOException - in case of any I/O error
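The charset argument is a charset name passed as a String. The sketch below uses only the JDK to show how such a name changes the bytes that an OutputStreamWriter (the type of the out field) produces. That the tool wraps os in exactly this way is an assumption; CharsetWriterSketch and encode are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

/** Hypothetical JDK-only sketch; it does not call the iText API. */
class CharsetWriterSketch {

    /** Writes text through an OutputStreamWriter using the named charset and returns the bytes. */
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        // Charset.forName throws UnsupportedCharsetException for an unknown name.
        try (OutputStreamWriter out = new OutputStreamWriter(os, Charset.forName(charsetName))) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return os.toByteArray();
    }

    public static void main(String[] args) {
        String text = "<P>caf\u00e9</P>";
        // The same 11 characters need a different number of bytes per charset.
        System.out.println("UTF-8 bytes:      " + encode(text, "UTF-8").length);
        System.out.println("ISO-8859-1 bytes: " + encode(text, "ISO-8859-1").length);
    }
}
```

The practical consequence is that whoever reads the resulting XML must decode it with the same charset that was passed here.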
-
setRootTag
public TaggedPdfReaderTool setRootTag(String rootTagName)
Sets the name of the root tag of the resultant XML file.
Parameters:
rootTagName - the name of the root tag
Returns:
this object
-
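inspectKids
protected void inspectKids(List<IStructureNode> kids)
-
inspectKid
protected void inspectKid(IStructureNode kid)
-
inspectAttributes
-
parseTag
-
fixTagName
protected static String fixTagName(String tag)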
-
escapeXML
protected static String escapeXML(String s, boolean onlyASCII)
Escapes a string with the appropriate XML codes. NOTE: copied from the iText 5 XMLUtils class.
Parameters:
s - the string to be escaped
onlyASCII - if true, codes above 127 will always be escaped with &#nn;
Returns:
the escaped string
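To make the effect of onlyASCII concrete, here is a minimal sketch of this kind of escaping. It is an illustration under stated assumptions, not the library's code: it assumes the five predefined XML entities are used, that values above 127 become decimal references of the form &#nn; when onlyASCII is true, and that characters XML 1.0 cannot represent are dropped. The class name XmlEscapeSketch is hypothetical.

```java
/** Hypothetical stand-alone sketch; not the iText implementation. */
class XmlEscapeSketch {

    /** Assumption: validity follows the XML 1.0 Char production. */
    static boolean isValidCharacterValue(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    /** Escapes markup characters and, if onlyASCII is true, every code point above 127. */
    static String escapeXML(String s, boolean onlyASCII) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int c = s.codePointAt(i);          // read whole code points, not UTF-16 halves
            i += Character.charCount(c);
            switch (c) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:
                    if (!isValidCharacterValue(c)) {
                        break;                  // assumption: drop characters XML cannot carry
                    }
                    if (onlyASCII && c > 127) {
                        sb.append("&#").append(c).append(';'); // decimal character reference
                    } else {
                        sb.appendCodePoint(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeXML("a<b & \"c\"", false));
        System.out.println(escapeXML("caf\u00e9", true));
    }
}
```

With onlyASCII set to true the output is pure ASCII and survives any charset; with false, non-ASCII characters are written as-is and rely on the output encoding.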
-