Interface TextAssembler

All Known Implementing Classes:
MarkedUpTextAssembler

public interface TextAssembler
Processes a series of objects and text fragments, assembling them into one final text object that represents the whole content.
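The assembly flow described here (feed chunks in, then close the context to obtain the final text) can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `SimpleAssembler` is a hypothetical class, text chunks are modelled as plain strings rather than FinalText, Word, or ParsedText objects, and the angle-bracket markup is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class AssemblySketch {

    // Hypothetical stand-in for an assembler: text chunks are plain strings here.
    static class SimpleAssembler {
        private final List<String> parts = new ArrayList<>();

        // Collect one chunk. contextName may be null (documented as the Artifact case);
        // this sketch records the chunk either way.
        void process(String chunk, String contextName) {
            parts.add(chunk);
        }

        // Join everything collected so far, wrap it in the element name, and start fresh.
        // The markup format is an assumption of this sketch.
        String endParsingContext(String containingElementName) {
            String body = String.join(" ", parts);
            parts.clear();
            return "<" + containingElementName + ">" + body
                    + "</" + containingElementName + ">";
        }
    }

    static String demo() {
        SimpleAssembler assembler = new SimpleAssembler();
        assembler.process("Hello", "p");
        assembler.process("world", "p");
        return assembler.endParsingContext("p");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints <p>Hello world</p>
    }
}
```

Clearing the collected parts inside `endParsingContext` is a design choice of the sketch, so that one assembler instance can serve several consecutive contexts.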
  • Method Details

    • process

      void process(FinalText completed, String contextName)
      Parameters:
      completed - a complete chunk; it is simply added as a subsection in the proper place.
      contextName - name of the element context we are in; null if it is an Artifact.
    • process

      void process(Word completed, String contextName)
      Parameters:
      completed - a complete chunk; it is simply added as a subsection in the proper place.
      contextName - name of the element context we are in; null if it is an Artifact.
    • process

      void process(ParsedText parsed, String contextName)
      Parameters:
      parsed - one of a number of raw PDF text chunks, with placement, font, etc.
      contextName - name of the element context we are in; null if it is an Artifact.
    • renderText

      void renderText(FinalText completed)
      Parameters:
      completed - a complete chunk; it is simply added as a subsection in the proper place.
    • renderText

      void renderText(ParsedTextImpl parsed)
      Parameters:
      parsed - one of a number of raw PDF text chunks, with placement, font, etc.
    • endParsingContext

      FinalText endParsingContext(String containingElementName)
      Parameters:
      containingElementName - the element name used to surround the extracted text
      Returns:
      the final text for the set of fragments and fully parsed items we were passed during processing.
    • getWordId

      String getWordId()
      The assembler can calculate an identifier for each word on a page, for use in markup.
      Returns:
      the new unique id.
    • setPage

      void setPage(int page)
      Parameters:
      page - number of the page we are assembling
    • reset

      void reset()
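
One way the page and word-identifier methods above might fit together is sketched below. Everything in it is an assumption made for illustration: `WordIds` is a hypothetical class, the `word<page>_<n>` id format is invented, and the behaviour of `reset` (clearing the page and the counter) is a guess, since the documentation above does not describe it.

```java
class WordIdSketch {

    // Hypothetical id generator; the "word<page>_<n>" format is an assumption of this sketch.
    static class WordIds {
        private int page = 0;
        private int next = 0;

        // Record the page being assembled.
        void setPage(int page) {
            this.page = page;
            this.next = 0; // assumed: numbering restarts on each page
        }

        // Return a new id that is unique within the current page.
        String getWordId() {
            return "word" + page + "_" + next++;
        }

        // Assumed behaviour: return to the initial state.
        void reset() {
            page = 0;
            next = 0;
        }
    }

    static String demo() {
        WordIds ids = new WordIds();
        ids.setPage(3);
        String first = ids.getWordId();
        String second = ids.getWordId();
        return first + "," + second;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints word3_0,word3_1
    }
}
```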