Class XPather

java.lang.Object
org.htmlcleaner.XPather

public class XPather extends Object

Utility for searching a cleaned document tree with XPath expressions.

Examples of supported expressions:
  • //div//a
  • //div//a[@id][@class]
  • /body/*[1]/@type
  • //div[3]//a[@id][@href='r/n4']
  • (//div[last() >= 4]//./div[position() = last()])[position() > 22]//li[2]//a
  • //div[2]/@*[2]
  • data(//div//a[@id][@class])
  • //p/last()
  • //body//div[3][@class]//span[12.2<position()]/@id
  • data(//a['v' < @id])
  • Field Details

  • Constructor Details

    • XPather

      public XPather(String expression)
      Creates an XPather instance for the specified XPath expression.
      Parameters:
      expression - the XPath expression to evaluate
  • Method Details

    • evaluateAgainstNode

      public Object[] evaluateAgainstNode(TagNode node) throws XPatherException
      Main public method of this class: evaluates the XPath expression against the specified TagNode instance.
      Parameters:
      node - the TagNode to evaluate the expression against
      Throws:
      XPatherException
    • throwStandardException

      private void throwStandardException() throws XPatherException
      Throws:
      XPatherException
    • evaluateAgainst

      protected Collection evaluateAgainst(Collection object, int from, int to, boolean isRecursive, int position, int last, boolean isFilterContext, Collection filterSource) throws XPatherException
      Throws:
      XPatherException
    • flatten

      private String flatten(int from, int to)
    • isValidInteger

      private static boolean isValidInteger(String value)
    • isValidDouble

      private boolean isValidDouble(String value)
    • isIdentifier

      private boolean isIdentifier(String s)
      Checks whether the given string is a valid identifier.
      Parameters:
      s - the string to check
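The page does not say which strings count as a valid identifier. The following is only a minimal sketch under an assumed rule (a letter or underscore first, then letters, digits, underscores, or hyphens); the class name `IdentifierSketch` is hypothetical and this is not the library's code.

```java
final class IdentifierSketch {

    // Assumed rule, not taken from the library: first character is a letter
    // or '_', the rest are letters, digits, '_' or '-'.
    static boolean isIdentifier(String s) {
        if (s == null) {
            return false;
        }
        String t = s.trim();
        if (t.isEmpty()) {
            return false;
        }
        char first = t.charAt(0);
        if (!Character.isLetter(first) && first != '_') {
            return false;
        }
        for (int i = 1; i < t.length(); i++) {
            char c = t.charAt(i);
            if (!Character.isLetterOrDigit(c) && c != '_' && c != '-') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("div"));      // true
        System.out.println(isIdentifier("12.2"));     // false
        System.out.println(isIdentifier("position")); // true
    }
}
```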
    • isFunctionCall

      private boolean isFunctionCall(int from, int to)
      Checks whether the tokens in the specified range represent a valid function call.
      Parameters:
      from - start index of the token range
      to - end index of the token range
      Returns:
      True if the range is a valid function call, false otherwise.
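To illustrate what such a check might involve, here is a hedged sketch. It assumes the expression has already been split into a `String[]` of tokens in which `(` and `)` are separate tokens, and it treats a range as a function call when it starts with a name, is followed by `(`, and that parenthesis closes exactly at `to`. The class name and the explicit `tokens` parameter are hypothetical; the real method takes only `from` and `to`.

```java
final class FunctionCallSketch {

    // Assumption: tokens[from..to] is inclusive on both ends.
    static boolean isFunctionCall(String[] tokens, int from, int to) {
        if (from < 0 || to >= tokens.length || to - from < 2) {
            return false;
        }
        // The first token must look like a function name (assumed pattern).
        if (!tokens[from].trim().matches("[A-Za-z_][A-Za-z0-9_-]*")) {
            return false;
        }
        if (!"(".equals(tokens[from + 1].trim())) {
            return false;
        }
        // The parenthesis opened at from + 1 must be closed exactly at 'to'.
        int depth = 0;
        for (int i = from + 1; i <= to; i++) {
            String t = tokens[i].trim();
            if ("(".equals(t)) {
                depth++;
            } else if (")".equals(t)) {
                depth--;
                if (depth == 0) {
                    return i == to;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] call = {"data", "(", "//", "div", ")"};
        String[] step = {"div", "[", "1", "]"};
        System.out.println(isFunctionCall(call, 0, 4)); // true
        System.out.println(isFunctionCall(step, 0, 3)); // false
    }
}
```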
    • evaluateFunction

      protected Collection evaluateFunction(Collection source, int from, int to, int position, int last, boolean isFilterContext) throws XPatherException
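      Evaluates the specified function. The following XPath functions are currently supported: last, position, text, count, data.
      Parameters:
      source - collection the function is evaluated against
      from - start index of the token range
      to - end index of the token range
      position - position of the current item within its context
      last - size of the current context
      Returns:
      Collection as the result of evaluation.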
      Throws:
      XPatherException
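As a reminder of what `position()` and `last()` mean in a predicate such as `[position() = last()]`, the sketch below filters a list using the standard XPath convention: positions are 1-based and `last` is the size of the context. It is an illustration only; `PositionFilterSketch` and `PositionTest` are hypothetical names and do not reflect how XPather is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PositionFilterSketch {

    // Hypothetical predicate type: receives the 1-based position and the context size.
    interface PositionTest {
        boolean test(int position, int last);
    }

    static <T> List<T> filter(List<T> source, PositionTest test) {
        List<T> result = new ArrayList<>();
        int last = source.size();
        for (int i = 0; i < source.size(); i++) {
            int position = i + 1; // XPath positions start at 1
            if (test.test(position, last)) {
                result.add(source.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d");
        // Analogue of [position() = last()]
        System.out.println(filter(items, (position, last) -> position == last)); // [d]
        // Analogue of [position() > 2]
        System.out.println(filter(items, (position, last) -> position > 2));     // [c, d]
    }
}
```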
    • filterByCondition

      protected Collection filterByCondition(Collection source, int from, int to) throws XPatherException
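      Filters the source collection, keeping the nodes that satisfy the condition.
      Parameters:
      source - collection of nodes to filter
      from - start index of the token range
      to - end index of the token range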
      Throws:
      XPatherException
    • isToken

      private boolean isToken(String token, int index)
    • findClosingIndex

      private int findClosingIndex(int from, int to)
      Parameters:
      from - index of the current (opening) token
      to - upper bound of the search
      Returns:
      The matching closing index in the token array for the current token, or -1 if there is no closing token within the expected bounds.
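One plausible way to find a matching closing token is a depth counter over the token array, sketched below. The sketch assumes the opening token at `from` is `(`, `[`, or a quote character, that `to` is an inclusive bound, and that tokens are passed in explicitly; these are assumptions for illustration, and the class name is hypothetical. For simplicity it does not skip brackets that appear inside quoted strings.

```java
final class ClosingIndexSketch {

    static int findClosingIndex(String[] tokens, int from, int to) {
        if (from < 0 || from >= tokens.length) {
            return -1;
        }
        String open = tokens[from];
        String close;
        if ("(".equals(open)) {
            close = ")";
        } else if ("[".equals(open)) {
            close = "]";
        } else if ("\"".equals(open) || "'".equals(open)) {
            close = open; // a quote is closed by the same character
        } else {
            return -1;
        }
        boolean isQuote = close.equals(open);
        int depth = 0;
        int limit = Math.min(to, tokens.length - 1);
        for (int i = from + 1; i <= limit; i++) {
            String t = tokens[i];
            if (isQuote) {
                if (t.equals(close)) {
                    return i;
                }
            } else if (t.equals(open)) {
                depth++; // nested opener of the same kind
            } else if (t.equals(close)) {
                if (depth == 0) {
                    return i;
                }
                depth--;
            }
        }
        return -1; // no closing token within bounds
    }

    public static void main(String[] args) {
        String[] tokens = {"[", "position", "(", ")", "=", "last", "(", ")", "]", "/", "a"};
        System.out.println(findClosingIndex(tokens, 0, 10)); // 8
        System.out.println(findClosingIndex(tokens, 2, 10)); // 3
        System.out.println(findClosingIndex(tokens, 0, 5));  // -1
    }
}
```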
    • isAtt

      private boolean isAtt(String token)
      Checks whether the token is an attribute (starts with @).
      Parameters:
      token - the token to check
    • singleton

      private Collection singleton(Object element)
      Creates a one-element collection for the specified object.
      Parameters:
      element - the object to wrap
    • getElementsByName

      private Collection getElementsByName(Collection source, int from, int to, boolean isRecursive, boolean isFilterContext) throws XPatherException
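      For the given source collection and the specified name, returns a collection of subnodes or attribute values.
      Parameters:
      source - collection to search
      from - start index of the token range
      to - end index of the token range
      isRecursive - whether to search recursively
      Returns:
      Collection of TagNode instances or collection of String instances.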
      Throws:
      XPatherException
    • evaluateLogic

      protected boolean evaluateLogic(Collection first, Collection second, String logicOperator)
      Evaluates a logic operation on two collections.
      Parameters:
      first - first operand
      second - second operand
      logicOperator - the operator to apply
      Returns:
      Result of the logic operation.
    • toText

      private String toText(Object o)