Class HTMLNamedEntitiesParser.State

java.lang.Object
org.htmlunit.cyberneko.HTMLNamedEntitiesParser.State
Direct Known Subclasses:
HTMLNamedEntitiesParser.RootState
Enclosing class:
HTMLNamedEntitiesParser

public static class HTMLNamedEntitiesParser.State extends Object
One level in the tree-like structure. Each level keeps its own static state and references to the next level underneath it.
  • Field Details

    • depth_

      private final int depth_
    • characters_

      int[] characters_
    • nextState_

    • entityOrFragment_

      public final String entityOrFragment_
    • resolvedValue_

      public String resolvedValue_
    • length_

      public final int length_
    • endsWithSemicolon_

      public final boolean endsWithSemicolon_
    • isMatch_

      public boolean isMatch_
    • endNode_

      public boolean endNode_
  • Constructor Details

    • State

      protected State()
      Creates the empty state.
    • State

      protected State(int depth, String entityFragment, String resolvedValue)
      Creates a new state that describes itself through its depth, its entity or entity fragment, and the value it resolves to.
  • Method Details

    • updateNonSemicolonEntity

      protected void updateNonSemicolonEntity(String entity, String resolvedValue)
      Handles a special in-between state: some entities exist both in the correct form with a trailing semicolon and in a legacy form without one. Both forms must be found correctly, so when the data set is built, an existing entry has to be unmarked as the final one and one more entry inserted.
      Parameters:
      entity - the entity to look up
      resolvedValue - the value it will resolve to
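
The following is a minimal sketch of the situation described above, not the library's implementation. It assumes a simplified node type backed by a map; `LegacyEntitySketch`, `Node`, `add`, and `walk` are hypothetical names chosen for illustration. It shows how the node for the legacy form `amp` can remain a match while no longer being a final node once `amp;` is added beneath it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a trie level; not the HtmlUnit source.
class LegacyEntitySketch {
    static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        String resolvedValue; // null when no entity ends at this node
        boolean isMatch;      // an entity ends at this node
        boolean endNode;      // no longer entity continues below this node
    }

    // Inserts an entity; any node that gains a child stops being a final node.
    static void add(Node root, String entity, String resolvedValue) {
        Node node = root;
        for (int i = 0; i < entity.length(); i++) {
            node.endNode = false;
            node = node.next.computeIfAbsent(entity.charAt(i), c -> new Node());
        }
        node.isMatch = true;
        node.endNode = node.next.isEmpty();
        node.resolvedValue = resolvedValue;
    }

    // Follows the characters of text and returns the node reached, or null.
    static Node walk(Node root, String text) {
        Node node = root;
        for (int i = 0; i < text.length(); i++) {
            node = node.next.get(text.charAt(i));
            if (node == null) {
                return null;
            }
        }
        return node;
    }

    public static void main(String[] args) {
        Node root = new Node();
        add(root, "amp", "&");  // legacy form without semicolon
        add(root, "amp;", "&"); // correct form with semicolon

        Node legacy = walk(root, "amp");
        Node full = walk(root, "amp;");
        System.out.println(legacy.isMatch + " " + legacy.endNode); // true false
        System.out.println(full.isMatch + " " + full.endNode);     // true true
    }
}
```

In this sketch a parser that reaches the `amp` node can record a match and still keep reading, in case a semicolon follows.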
    • add

      protected void add(String entity, String resolvedValue)
      Adds a new entity to the pseudo-tree.
      Parameters:
      entity - the entity to look for later
      resolvedValue - the value it resolves to
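
As an illustration of how an entity might be added to such a pseudo-tree, here is a minimal sketch under stated assumptions. It models each level with two parallel arrays, loosely inspired by the `characters_` and `nextState_` fields listed above; the class `AddSketch`, the type `Level`, and the growth strategy via `Arrays.copyOf` are assumptions for illustration and may differ from what the library does.

```java
import java.util.Arrays;

// Hypothetical, simplified model of one trie level; not the HtmlUnit source.
class AddSketch {
    static final class Level {
        final int depth;
        int[] characters = new int[0];     // characters that lead to a next level
        Level[] nextLevels = new Level[0];  // parallel to characters
        String resolvedValue;               // set only where an entity ends

        Level(int depth) {
            this.depth = depth;
        }

        // Inserts the entity by descending one character per level.
        void add(String entity, String resolvedValue) {
            if (depth == entity.length()) {
                this.resolvedValue = resolvedValue;
                return;
            }
            int c = entity.charAt(depth);
            int index = indexOf(c);
            if (index < 0) {
                // Unknown character at this level: grow both arrays by one slot.
                index = characters.length;
                characters = Arrays.copyOf(characters, index + 1);
                nextLevels = Arrays.copyOf(nextLevels, index + 1);
                characters[index] = c;
                nextLevels[index] = new Level(depth + 1);
            }
            nextLevels[index].add(entity, resolvedValue);
        }

        int indexOf(int c) {
            for (int i = 0; i < characters.length; i++) {
                if (characters[i] == c) {
                    return i;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        Level root = new Level(0);
        root.add("lt;", "<");
        root.add("le;", "\u2264");
        // Both entities share the 'l' branch and split at the second character.
        System.out.println(root.characters.length);               // 1
        System.out.println(root.nextLevels[0].characters.length); // 2
    }
}
```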
    • lookup

      protected HTMLNamedEntitiesParser.State lookup(int character)
      Looks up the next state by iterating over the characters stored at this state. There should not be many of them, and because the array is small, the scan should be served entirely from the CPU cache.
      Parameters:
      character - the char to look up
      Returns:
      the next state, or this same state if the character was not found
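
The lookup contract (return the next state, or the same state on a miss) can be sketched as follows. This is an illustrative model only: `LookupSketch`, `Level`, and `resolve` are hypothetical names, and the way a caller reacts to a miss is an assumption rather than documented behavior.

```java
// Hypothetical, simplified model of the lookup contract; not the HtmlUnit source.
class LookupSketch {
    static final class Level {
        final int[] characters;    // characters known at this level
        final Level[] nextLevels;  // parallel to characters
        final String resolvedValue; // null when no entity ends here

        Level(int[] characters, Level[] nextLevels, String resolvedValue) {
            this.characters = characters;
            this.nextLevels = nextLevels;
            this.resolvedValue = resolvedValue;
        }

        // Linear scan over a small array; returns this level when nothing matches.
        Level lookup(int character) {
            for (int i = 0; i < characters.length; i++) {
                if (characters[i] == character) {
                    return nextLevels[i];
                }
            }
            return this;
        }
    }

    // Walks the text and returns the value of the longest entity seen, or null.
    static String resolve(Level root, String text) {
        Level current = root;
        String best = null;
        for (int i = 0; i < text.length(); i++) {
            Level next = current.lookup(text.charAt(i));
            if (next == current) {
                break; // same state returned: the character was not found
            }
            current = next;
            if (current.resolvedValue != null) {
                best = current.resolvedValue;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hand-built levels for "gt" (legacy) and "gt;" (with semicolon).
        Level semicolon = new Level(new int[0], new Level[0], ">");
        Level t = new Level(new int[] {';'}, new Level[] {semicolon}, ">");
        Level g = new Level(new int[] {'t'}, new Level[] {t}, null);
        Level root = new Level(new int[] {'g'}, new Level[] {g}, null);

        System.out.println(resolve(root, "gt;x")); // >
        System.out.println(resolve(root, "gx"));   // null
    }
}
```

Returning the same state on a miss lets the caller detect the end of a match with an identity comparison, without a null check.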