Package com.lowagie.text.pdf.parser
Class Word
java.lang.Object
com.lowagie.text.pdf.parser.ParsedTextImpl
com.lowagie.text.pdf.parser.Word
All Implemented Interfaces:
TextAssemblyBuffer
Field Summary
Fields (Modifier and Type, Field, Description):
private final boolean breakBefore
- If this word or fragment was preceded by a space or a line break, it should never be merged into a preceding word.
private final boolean shouldNotSplit
- Whether this is an indivisible fragment, because it contained a space or was split from a space-containing string.
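Constructor Summary
Constructors:
Word(String text, float ascent, float descent, Vector startPoint, Vector endPoint, Vector baseline, float spaceWidth, boolean isCompleteWord, boolean breakBefore)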
Method Summary
Methods (Modifier and Type, Method, Description):
void accumulate(TextAssembler p, String contextName)
- Accept a visitor that is assembling text.
void assemble(TextAssembler p)
- Accept a visitor that is assembling text.
boolean breakBefore()
private static String escapeHTML(String s)
private static String formatPercent(float f)
FinalText getFinalText(PdfReader reader, int page, TextAssembler assembler, boolean useMarkup)
boolean shouldNotSplit()
String toString()
private String wordMarkup(String text, PdfReader reader, int page, TextAssembler assembler)
- Generate markup for this word.
Methods inherited from class com.lowagie.text.pdf.parser.ParsedTextImpl
getAscent, getBaseline, getDescent, getEndPoint, getSingleSpaceWidth, getStartPoint, getText, getWidth
Field Details
shouldNotSplit
private final boolean shouldNotSplit
Whether this is an indivisible fragment, because it contained a space or was split from a space-containing string. Non-splittable words can still be merged (into new non-splittable words).
breakBefore
private final boolean breakBefore
If this word or fragment was preceded by a space or a line break, it should never be merged into a preceding word.
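The effect of breakBefore on merging can be illustrated with a minimal sketch. FragmentSketch and mergeWords are hypothetical names invented for this example and are not part of the library; the sketch only assumes what the field description states, namely that a fragment flagged breakBefore is never merged into the preceding word.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a parsed fragment; NOT the library's Word class. */
final class FragmentSketch {
    final String text;
    final boolean breakBefore; // true: must not be merged into the preceding word

    FragmentSketch(String text, boolean breakBefore) {
        this.text = text;
        this.breakBefore = breakBefore;
    }

    /** Concatenates fragments, starting a new word wherever breakBefore is set. */
    static List<String> mergeWords(List<FragmentSketch> fragments) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (FragmentSketch f : fragments) {
            if (f.breakBefore && current.length() > 0) {
                // Close the word collected so far before starting a new one.
                words.add(current.toString());
                current.setLength(0);
            }
            current.append(f.text);
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }
}
```

With the fragments "Hel" (breakBefore), "lo" (no break) and "world" (breakBefore), this sketch yields the two words "Hello" and "world".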
Constructor Details
Word
Word(String text, float ascent, float descent, Vector startPoint, Vector endPoint, Vector baseline, float spaceWidth, boolean isCompleteWord, boolean breakBefore)
Parameters:
text - text content
ascent - font ascent (e.g. height)
descent - how far below the baseline letters go
startPoint - first point of the text
endPoint - ending offset of text
baseline - line along which text is set
spaceWidth - how much space a space is supposed to take
isCompleteWord - word should never be split
breakBefore - word starts here, should never combine to the left
Method Details
formatPercent
private static String formatPercent(float f)
escapeHTML
private static String escapeHTML(String s)
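The page gives no description for escapeHTML, and the method is private, so its actual behavior is not documented here. As a rough illustration of what an HTML-escaping helper of this shape typically does, the following sketch replaces the four characters that are significant in HTML text and attribute values. The class and method names are hypothetical, and the set of escaped characters is an assumption, not the library's implementation.

```java
/** Hypothetical helper; a sketch of conventional HTML escaping, not the library's code. */
final class HtmlEscapeSketch {

    /** Replaces &, <, > and " with their HTML character references. */
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```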
accumulate
Accept a visitor that is assembling text.
Parameters:
p - the assembler that is visiting us
contextName - the name of the wrapping markup element, if any
assemble
Accept a visitor that is assembling text.
Parameters:
p - the assembler that is visiting us
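Both accumulate and assemble are described as accepting a visitor. The sketch below shows the general shape of that double dispatch with hypothetical types: AssemblerSketch stands in for the assembler, WordSketch for the word, and the callback names process and renderText are assumptions chosen for illustration rather than confirmed signatures of TextAssembler.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical visitor interface; the callback names are assumptions. */
interface AssemblerSketch {
    void process(WordSketch word, String contextName);

    void renderText(WordSketch word);
}

/** Hypothetical visited element: it hands itself to the visitor. */
final class WordSketch {
    final String text;

    WordSketch(String text) {
        this.text = text;
    }

    /** Accepts the visitor together with the name of the wrapping markup element. */
    void accumulate(AssemblerSketch p, String contextName) {
        p.process(this, contextName);
    }

    /** Accepts the visitor without any wrapping context. */
    void assemble(AssemblerSketch p) {
        p.renderText(this);
    }
}

/** A trivial visitor that records which callbacks were invoked. */
final class CollectingAssembler implements AssemblerSketch {
    final List<String> events = new ArrayList<>();

    @Override
    public void process(WordSketch word, String contextName) {
        events.add("process:" + word.text + ":" + contextName);
    }

    @Override
    public void renderText(WordSketch word) {
        events.add("render:" + word.text);
    }
}
```

The point of the pattern is that the word decides which assembler callback fits its own type, so the assembler needs no type checks on the elements it visits.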
wordMarkup
Generate markup for this word. Sends the assembler a string representing a CSS style that will format us nicely.
Parameters:
text - passed in because we may have wanted to alter it, e.g. by trimming white space or filtering characters
reader - the file reader from which we are extracting
page - number of the page we are reading text from
assembler - object to assemble text from fragments and larger strings on a page
Returns:
markup to represent this one word
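What the generated markup looks like is not specified on this page. Purely as an illustration of "markup plus a CSS style", the sketch below wraps already-escaped text in a span whose style positions it by percentages. The element name, the CSS properties, and the percentage inputs are all assumptions made for this example.

```java
import java.util.Locale;

/** Hypothetical sketch of per-word markup; not the library's output format. */
final class WordMarkupSketch {

    /**
     * Wraps text in a span with an inline CSS style.
     * The text is assumed to be HTML-escaped by the caller.
     */
    static String markup(String escapedText, float leftPercent, float widthPercent) {
        // Locale.ROOT keeps the decimal separator a '.' regardless of the default locale.
        String style = String.format(Locale.ROOT,
                "left: %.1f%%; width: %.1f%%;", leftPercent, widthPercent);
        return "<span style=\"" + style + "\">" + escapedText + "</span>";
    }
}
```

Fixing the locale matters here because a comma as decimal separator would produce invalid CSS.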
getFinalText
public FinalText getFinalText(PdfReader reader, int page, TextAssembler assembler, boolean useMarkup)
Parameters:
reader - PdfReader that knows about our document (size, etc. available here)
page - which page we are extracting text from
assembler - builds the result by accepting content from text components of various sorts
useMarkup - whether to generate tagged text or just plain text
Returns:
the final text, ready to concatenate into the result string
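The useMarkup flag selects between tagged and plain output. A minimal sketch of that branch follows; FinalTextSketch is a hypothetical name, the method returns a String rather than the library's FinalText, and the span wrapper is an assumption.

```java
/** Hypothetical sketch of the plain-versus-tagged choice; not the library's code. */
final class FinalTextSketch {

    /** Returns tagged text when useMarkup is true, otherwise the text unchanged. */
    static String finalText(String text, boolean useMarkup) {
        if (useMarkup) {
            // Assumes the text has already been HTML-escaped.
            return "<span>" + text + "</span>";
        }
        return text;
    }
}
```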
toString
public String toString()
shouldNotSplit
public boolean shouldNotSplit()
Specified by:
shouldNotSplit in class ParsedTextImpl
Returns:
true if this was extracted from a string containing spaces, in which case we assume further splitting is not needed
breakBefore
public boolean breakBefore()
Specified by:
breakBefore in class ParsedTextImpl
Returns:
true if this was a space or other item that should force a space before it