Class Splitter

java.lang.Object
org.simpleframework.xml.stream.Splitter
Direct Known Subclasses:
CamelCaseBuilder.Attribute, HyphenBuilder.Parser

abstract class Splitter extends Object
The Splitter object is used to split a string into tokens that can be used to create a camel case or hyphenated text representation of the string. It preserves acronyms and numbers, and splits tokens by case and character type. Examples of how a string would be split are as follows.
 
    CamelCaseString = "Camel" "Case" "String"
    hyphenated-text = "hyphenated" "text"
    URLAcronym      = "URL" "acronym"
    RFC2616.txt     = "RFC" "2616" "txt"
 
 
Splitting strings into individual words allows the splitter to assemble the words in a way that adheres to a specific style. Each style can then be applied to an XML document to give it a consistent format.
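
The assembly step described above can be illustrated with a minimal sketch. It is not the library's implementation: the class StyleAssemblySketch and its two methods are hypothetical names, and the conventions used (lower case words joined by hyphens, and a capitalised first letter per word for camel case) are assumptions made for illustration.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the library: assembles already-split
// words into two assumed styles.
final class StyleAssemblySketch {

    // Joins the words as lower case text separated by hyphens.
    static String hyphenate(List<String> words) {
        StringBuilder builder = new StringBuilder();
        for (String word : words) {
            if (builder.length() > 0) {
                builder.append('-');
            }
            builder.append(word.toLowerCase(Locale.ROOT));
        }
        return builder.toString();
    }

    // Joins the words with the first character of each word in upper case.
    // The rest of each word is left untouched, so acronyms keep their case.
    static String camelCase(List<String> words) {
        StringBuilder builder = new StringBuilder();
        for (String word : words) {
            if (!word.isEmpty()) {
                builder.append(Character.toUpperCase(word.charAt(0)));
                builder.append(word, 1, word.length());
            }
        }
        return builder.toString();
    }
}
```

With the words "Camel", "Case" and "String" the hyphenate method of this sketch yields camel-case-string, and with "hyphenated" and "text" the camelCase method yields HyphenatedText.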
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected StringBuilder
    builder
    This is the string builder used to build the processed text.
    protected int
    count
    This is the number of characters to be considered for use.
    protected int
    off
    This is the current read offset of the text string.
    protected char[]
    text
    This is the original text that is to be split into words.
  • Constructor Summary

    Constructors
    Constructor
    Description
    Splitter(String source)
    Constructor of the Splitter object.
  • Method Summary

    Modifier and Type
    Method
    Description
    private boolean
    acronym()
    This is used to extract an acronym from the source string.
    protected abstract void
    commit(char[] text, int off, int len)
    This is used to commit the provided text into the style that is required.
    private boolean
    isDigit(char ch)
    This is used to determine whether the provided character is a digit.
    private boolean
    isLetter(char ch)
    This is used to determine whether the provided character is a letter.
    private boolean
    isSpecial(char ch)
    This is used to determine whether the provided character is a symbol.
    private boolean
    isUpper(char ch)
    This is used to determine whether the provided character is an upper case letter.
    private boolean
    number()
    This is used to extract a number from the source string.
    protected abstract void
    parse(char[] text, int off, int len)
    This is used to parse the provided text into the style that is required.
    String
    process()
    This is used to process the internal string and convert it into a styled string.
    private void
    token()
    This is used to extract a token from the source string.
    protected char
    toLower(char ch)
    This is used to convert the provided character to a lower case character.
    protected char
    toUpper(char ch)
    This is used to convert the provided character to an upper case character.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • builder

      protected StringBuilder builder
      This is the string builder used to build the processed text.
    • text

      protected char[] text
      This is the original text that is to be split into words.
    • count

      protected int count
      This is the number of characters to be considered for use.
    • off

      protected int off
      This is the current read offset of the text string.
  • Constructor Details

    • Splitter

      public Splitter(String source)
      Constructor of the Splitter object. This is used to split the provided string into individual words so that they can be assembled as a styled token, which can represent an XML attribute or element.
      Parameters:
      source - this is the source that is to be split
  • Method Details

    • process

      public String process()
      This is used to process the internal string and convert it into a styled string. The styled string can then be used as an XML attribute or element, providing a consistent format to the document that is being generated.
      Returns:
      the string that has been converted to a styled string
    • token

      private void token()
      This is used to extract a token from the source string. Once a token has been extracted it is parsed for styling, and the commit method is then called to add it to the string being built. Each time this is called a token, if extracted, will be committed to the string.
    • acronym

      private boolean acronym()
      This is used to extract an acronym from the source string. Once an acronym has been extracted the commit method is called to add it to the string being built. Each time this is called an acronym, if extracted, will be committed to the string.
      Returns:
      true if an acronym was extracted from the source
    • number

      private boolean number()
      This is used to extract a number from the source string. Once a number has been extracted the commit method is called to add it to the string being built. Each time this is called a number, if extracted, will be committed to the string.
      Returns:
      true if a number was extracted from the source
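
The three extraction steps above can be sketched as a small cursor-based scanner. This is an illustration under stated assumptions, not the library's code: the class ScannerSketch, its scan method and its private commit helper are hypothetical, tokens are collected in a list rather than styled, each token keeps its original letter case, and an acronym is assumed to be a run of two or more upper case letters whose last letter is given to the next word when a lower case letter follows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner, not part of the library.
final class ScannerSketch {

    private final List<String> tokens = new ArrayList<>();
    private final char[] text;
    private final int count;
    private int off;

    ScannerSketch(String source) {
        this.text = source.toCharArray();
        this.count = text.length;
    }

    // Scans the whole source and returns the tokens in order.
    List<String> scan() {
        while (off < count) {
            if (!Character.isLetterOrDigit(text[off])) {
                off++; // separators such as '-' or '.' are skipped
            } else if (!acronym() && !number()) {
                token(); // the character is a letter, so a word is extracted
            }
        }
        return tokens;
    }

    // Extracts a run of two or more upper case letters, if present.
    private boolean acronym() {
        int mark = off;
        while (mark < count && Character.isUpperCase(text[mark])) {
            mark++;
        }
        if (mark - off < 2) {
            return false;
        }
        if (mark < count && Character.isLowerCase(text[mark])) {
            mark--; // the last capital starts the next word
        }
        commit(mark);
        return true;
    }

    // Extracts a run of digits, if present.
    private boolean number() {
        int mark = off;
        while (mark < count && Character.isDigit(text[mark])) {
            mark++;
        }
        if (mark == off) {
            return false;
        }
        commit(mark);
        return true;
    }

    // Extracts one word: a letter followed by letters that are not upper case.
    private void token() {
        int mark = off + 1;
        while (mark < count && Character.isLetter(text[mark])
                && !Character.isUpperCase(text[mark])) {
            mark++;
        }
        commit(mark);
    }

    // Records the characters between the read offset and the mark.
    private void commit(int mark) {
        tokens.add(new String(text, off, mark - off));
        off = mark;
    }
}
```

For example, new ScannerSketch("RFC2616.txt").scan() returns the tokens "RFC", "2616" and "txt" in this sketch. Any change of letter case is left to a later styling step.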
    • isLetter

      private boolean isLetter(char ch)
      This is used to determine whether the provided character is a letter. This delegates to Character so that the full range of Unicode characters is considered.
      Parameters:
      ch - this is the character that is to be evaluated
      Returns:
      this returns true if the character is a letter
    • isSpecial

      private boolean isSpecial(char ch)
      This is used to determine whether the provided character is a symbol. This delegates to Character so that the full range of Unicode characters is considered.
      Parameters:
      ch - this is the character that is to be evaluated
      Returns:
      this returns true if the character is a symbol
    • isDigit

      private boolean isDigit(char ch)
      This is used to determine whether the provided character is a digit. This delegates to Character so that the full range of Unicode characters is considered.
      Parameters:
      ch - this is the character that is to be evaluated
      Returns:
      this returns true if the character is a digit
    • isUpper

      private boolean isUpper(char ch)
      This is used to determine whether the provided character is an upper case letter. This delegates to Character so that the full range of Unicode characters is considered.
      Parameters:
      ch - this is the character that is to be evaluated
      Returns:
      this returns true if the character is upper case
    • toUpper

      protected char toUpper(char ch)
      This is used to convert the provided character to an upper case character. This delegates to Character to perform the conversion so that Unicode characters are considered.
      Parameters:
      ch - this is the character that is to be converted
      Returns:
      the character converted to upper case
    • toLower

      protected char toLower(char ch)
      This is used to convert the provided character to a lower case character. This delegates to Character to perform the conversion so that Unicode characters are considered.
      Parameters:
      ch - this is the character that is to be converted
      Returns:
      the character converted to lower case
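
The six character helpers above all delegate to java.lang.Character. A minimal sketch of such delegation follows. The class CharacterHelperSketch is hypothetical, and the choice of Character method for each helper is an assumption; in particular, a "symbol" is read here as any character that is neither a letter nor a digit.

```java
// Hypothetical helper class, not part of the library.
final class CharacterHelperSketch {

    static boolean isLetter(char ch) {
        return Character.isLetter(ch);
    }

    static boolean isDigit(char ch) {
        return Character.isDigit(ch);
    }

    static boolean isUpper(char ch) {
        return Character.isUpperCase(ch);
    }

    // Assumption: a symbol is anything that is not a letter or a digit.
    static boolean isSpecial(char ch) {
        return !Character.isLetterOrDigit(ch);
    }

    static char toUpper(char ch) {
        return Character.toUpperCase(ch);
    }

    static char toLower(char ch) {
        return Character.toLowerCase(ch);
    }
}
```

Because the checks go through Character, they are not limited to ASCII. In this sketch an accented letter counts as a letter, an Arabic-Indic digit counts as a digit, and the Greek capital omega converts to its lower case form.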
    • parse

      protected abstract void parse(char[] text, int off, int len)
      This is used to parse the provided text into the style that is required. Manipulation of the text before committing it ensures that the text adheres to the required style.
      Parameters:
      text - this is the text buffer to acquire the token from
      off - this is the offset in the buffer at which the token starts
      len - this is the length of the token to be parsed
    • commit

      protected abstract void commit(char[] text, int off, int len)
      This is used to commit the provided text into the style that is required. Committing the text to the buffer assembles the individual tokens into a single complete token.
      Parameters:
      text - this is the text buffer to acquire the token from
      off - this is the offset in the buffer at which the token starts
      len - this is the length of the token to be committed
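
The division of work between parse and commit can be sketched with a small stand-alone class pair. This is only an illustration: HookSketch, its emit and result methods, and the HyphenSketch subclass are hypothetical names, and the hyphenated style shown (lower case tokens joined by '-') is an assumed convention rather than a statement of what the library's subclasses do.

```java
// Hypothetical base class, not part of the library: holds the buffer and
// the builder, and leaves styling to the two hooks.
abstract class HookSketch {

    protected final StringBuilder builder = new StringBuilder();
    protected final char[] text;

    HookSketch(String source) {
        this.text = source.toCharArray();
    }

    // Restyles one token in place, then appends it to the builder.
    final void emit(int off, int len) {
        parse(text, off, len);
        commit(text, off, len);
    }

    final String result() {
        return builder.toString();
    }

    protected abstract void parse(char[] text, int off, int len);

    protected abstract void commit(char[] text, int off, int len);
}

// Hypothetical subclass producing an assumed hyphenated style.
final class HyphenSketch extends HookSketch {

    HyphenSketch(String source) {
        super(source);
    }

    // Converts the token to lower case within the buffer.
    @Override
    protected void parse(char[] text, int off, int len) {
        for (int i = off; i < off + len; i++) {
            text[i] = Character.toLowerCase(text[i]);
        }
    }

    // Appends the token, separated from any earlier token by a hyphen.
    @Override
    protected void commit(char[] text, int off, int len) {
        if (builder.length() > 0) {
            builder.append('-');
        }
        builder.append(text, off, len);
    }
}
```

Given the source "CamelCase", calling emit(0, 5) and then emit(5, 4) on a HyphenSketch leaves camel-case in the builder. Keeping parse separate from commit lets a token be restyled in the buffer without touching what has already been assembled.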