Class BidiOrder

java.lang.Object
com.aowagie.text.pdf.BidiOrder

final class BidiOrder extends Object
Reference implementation of the Unicode 3.0 Bidi algorithm.

This implementation is not optimized for performance. It is intended as a reference implementation that closely follows the specification of the Bidirectional Algorithm in The Unicode Standard version 3.0.

Input:
There are two levels of input to the algorithm, since clients may prefer to supply some information from out-of-band sources rather than relying on the default behavior.

  1. Unicode type array
  2. Unicode type array, with externally supplied base line direction

Output:
Output is separated into several stages as well, to better enable clients to evaluate various aspects of implementation conformance.

  1. levels array over entire paragraph
  2. reordering array over entire paragraph
  3. levels array over line
  4. reordering array over line
Note that for conformance, algorithms are only required to generate correct reordering and character directionality (odd or even levels) over a line. Generating identical level arrays over a line is not required. Bidi explicit format codes (LRE, RLE, LRO, RLO, PDF) and BN can be assigned arbitrary levels and positions as long as the other text matches.
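
The directionality half of this conformance note can be sketched as a small comparison helper. This is only an illustration, not part of BidiOrder: the class name `LevelParitySketch` and the method `sameDirectionality` are hypothetical, and the sketch assumes that positions holding explicit format codes or BN have already been dropped from both arrays. It does not check the reordering itself.

```java
final class LevelParitySketch {
    /**
     * Returns true when both level arrays give every position the same
     * direction (even level = left-to-right, odd level = right-to-left).
     * Hypothetical helper: the exact level values are allowed to differ.
     */
    static boolean sameDirectionality(byte[] expected, byte[] actual) {
        if (expected.length != actual.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            // Only the lowest bit (the parity) carries the direction.
            if ((expected[i] & 1) != (actual[i] & 1)) {
                return false;
            }
        }
        return true;
    }
}
```

Under this check, the level arrays {0, 1, 2} and {2, 3, 0} count as equivalent, while {0, 1} and {1, 1} do not.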

As the algorithm is defined to operate on a single paragraph at a time, this implementation is written to handle single paragraphs. Thus rule P1 is presumed by this implementation: the data provided is assumed to be a single paragraph that either contains no 'B' codes or has a single 'B' code at the end of the input. 'B' is allowed as input to illustrate how the algorithm assigns it a level.

Also note that rules L3 and L4 depend on the rendering engine that uses the result of the bidi algorithm. This implementation assumes that the rendering engine expects combining marks in visual order (e.g. to the left of their base character in RTL runs) and that it adjusts the glyphs used to render mirrored characters in RTL runs so that they render appropriately.

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final byte
    Right-to-Left Arabic
    private static final byte
    Arabic Number
    private static final byte
    Paragraph Separator
    private static char[]
     
    private static final byte
    Boundary Neutral
    private static final byte
    Common Number Separator
    private byte[]
     
    static final byte
    European Number
    private static final byte
    European Number Separator
    private static final byte
    European Number Terminator
    private final byte[]
     
    static final byte
    Left-to-right
    private static final byte
    Left-to-Right Embedding
    private static final byte
    Left-to-Right Override
    private static final byte
    Non-Spacing Mark
    private static final byte
    Other Neutrals
    private byte
     
    private static final byte
    Pop Directional Format
    static final byte
    Right-to-Left
    private byte[]
     
    private byte[]
     
    private static final byte
    Right-to-Left Embedding
    private static final byte
    Right-to-Left Override
    private static final byte[]
     
    private static final byte
    Segment Separator
    private int
     
    private static final byte
    Maximum bidi type value.
    private static final byte
    Minimum bidi type value.
    private static final byte
    Whitespace
  • Constructor Summary

    Constructors
    Constructor
    Description
    BidiOrder(char[] text, int offset, int length, byte paragraphEmbeddingLevel)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    private static int[]
    computeMultilineReordering(byte[] levels, int[] linebreaks)
    Return multiline reordering array for a given level array.
    private static int[]
    computeReordering(byte[] levels)
    Return reordering array for a given level array.
    private void
    determineExplicitEmbeddingLevels()
    Process embedding format codes.
    private void
    determineParagraphEmbeddingLevel()
    1) determining the paragraph level.
    private int
    findRunLimit(int index, int limit, byte[] validSet)
    Return the limit of the run starting at index that includes only resultTypes in validSet.
    private int
    findRunStart(int index, byte[] validSet)
    Return the start of the run including index that includes only resultTypes in validSet.
    byte
    getBaseLevel()
    Return the base level of the paragraph.
    (package private) static final byte
    getDirection(char c)
     
    byte[]
    getLevels()
     
    private byte[]
    getLevels(int[] linebreaks)
    Return levels array breaking lines at offsets in linebreaks.
    private static boolean
    isWhitespace(byte biditype)
    Return true if the type is considered a whitespace type for the line break rules.
    private static byte[]
    processEmbeddings(byte[] resultTypes, byte paragraphEmbeddingLevel)
    2) determining explicit levels, Rules X1-X8. The interaction of these rules makes handling them a bit complex.
    private int
    reinsertExplicitCodes(int textLength)
    Reinsert levels information for explicit codes.
    private int
    removeExplicitCodes()
    Rule X9.
    private void
    resolveImplicitLevels(int start, int limit, byte level, byte sor, byte eor)
    7) resolving implicit embedding levels, Rules I1, I2.
    private void
    resolveNeutralTypes(int start, int limit, byte level, byte sor, byte eor)
    6) resolving neutral types, Rules N1-N2.
    private void
    resolveWeakTypes(int start, int limit, byte level, byte sor, byte eor)
    3) resolving weak types, Rules W1-W7.
    private void
    runAlgorithm()
    The algorithm.
    private void
    setLevels(int start, int limit, byte newLevel)
    Set resultLevels from start up to (but not including) limit to newLevel.
    private void
    setTypes(int start, int limit, byte newType)
    Set resultTypes from start up to (but not including) limit to newType.
    private static byte
    typeForLevel(int level)
    Return the strong type (L or R) corresponding to the level.
    private static void
    validateLineBreaks(int[] linebreaks, int textLength)
    Throw exception if line breaks array is invalid.
    private static void
    validateParagraphEmbeddingLevel(byte paragraphEmbeddingLevel)
    Throw exception if paragraph embedding level is invalid.
    private static void
    validateTypes(byte[] types)
    Throw exception if type array is invalid.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

    • BidiOrder

      BidiOrder(char[] text, int offset, int length, byte paragraphEmbeddingLevel)
  • Method Details

    • getDirection

      static final byte getDirection(char c)
    • runAlgorithm

      private void runAlgorithm()
      The algorithm. Does not include line-based processing (Rules L1, L2). These are applied later in the line-based phase of the algorithm.
    • determineParagraphEmbeddingLevel

      private void determineParagraphEmbeddingLevel()
      1) determining the paragraph level.

      Rules P2, P3.

      At the end of this function, the member variable paragraphEmbeddingLevel is set to either 0 or 1.
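
Rules P2 and P3 amount to "the first strong character decides". The following is a minimal sketch of that idea, assuming the JDK's `Character.getDirectionality` as the source of character types; BidiOrder itself uses its own type data, and the class name `ParagraphLevelSketch` is hypothetical.

```java
final class ParagraphLevelSketch {
    /**
     * Returns 0 or 1 following rules P2 and P3: scan for the first strong
     * character (L, R or AL) and derive the paragraph level from it.
     */
    static byte paragraphLevel(char[] text) {
        for (char c : text) {
            byte d = Character.getDirectionality(c);
            if (d == Character.DIRECTIONALITY_LEFT_TO_RIGHT) {
                return 0; // first strong character is L
            }
            if (d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return 1; // first strong character is R or AL
            }
        }
        return 0; // no strong character found: default to left-to-right
    }
}
```

Digits and spaces are not strong types, so a paragraph such as "123 " followed by a Hebrew letter still resolves to level 1.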

    • determineExplicitEmbeddingLevels

      private void determineExplicitEmbeddingLevels()
      Process embedding format codes.

      Calls processEmbeddings to generate an embedding array from the explicit format codes. The embedding overrides in the array are then applied to the result types, and the result levels are initialized.

    • removeExplicitCodes

      private int removeExplicitCodes()
      Rule X9. Remove explicit codes so that they may be ignored during the remainder of the main portion of the algorithm. The length of the resulting text is returned.
      Returns:
      the length of the data excluding explicit codes and BN.
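
The compaction step can be sketched as an in-place filter that returns the new length. This is an illustration under assumptions: the enum `BidiType` and the class `RemoveExplicitCodesSketch` are hypothetical stand-ins for the byte constants and arrays BidiOrder uses internally, and only a single array is compacted here.

```java
import java.util.EnumSet;
import java.util.Set;

final class RemoveExplicitCodesSketch {
    /** Hypothetical subset of bidi types, enough to show the filtering. */
    enum BidiType { L, R, EN, WS, LRE, RLE, LRO, RLO, PDF, BN }

    /** Types dropped by rule X9: the explicit format codes and BN. */
    private static final Set<BidiType> REMOVED = EnumSet.of(
            BidiType.LRE, BidiType.RLE, BidiType.LRO,
            BidiType.RLO, BidiType.PDF, BidiType.BN);

    /**
     * Moves every retained type to the front of the array, preserving
     * order, and returns how many entries were retained.
     */
    static int compact(BidiType[] types) {
        int write = 0;
        for (int read = 0; read < types.length; read++) {
            if (!REMOVED.contains(types[read])) {
                types[write++] = types[read];
            }
        }
        return write;
    }
}
```

Entries beyond the returned length are stale and should be ignored until the codes are reinserted.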
    • reinsertExplicitCodes

      private int reinsertExplicitCodes(int textLength)
      Reinsert levels information for explicit codes. This is for ease of relating the level information to the original input data. Note that the levels assigned to these codes are arbitrary; they are chosen so as to avoid breaking level runs.
      Parameters:
      textLength - the length of the data after compression
      Returns:
      the length of the data (original length of types array supplied to constructor)
    • processEmbeddings

      private static byte[] processEmbeddings(byte[] resultTypes, byte paragraphEmbeddingLevel)
      2) determining explicit levels, Rules X1-X8. The interaction of these rules makes handling them a bit complex. This method examines resultTypes but does not modify it. It returns embedding and override information in the result array: in each entry, the low 7 bits hold the level, and the high bit is set if the level is an override and clear if it is an embedding.
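
The packing of level and override flag into one byte can be sketched as follows. The helper names (`encode`, `level`, `isOverride`) and the class `EmbeddingByteSketch` are hypothetical; only the bit layout comes from the description above.

```java
final class EmbeddingByteSketch {
    /** Packs a level (0..127) and an override flag into one byte. */
    static byte encode(int level, boolean override) {
        return (byte) (override ? (level | 0x80) : level);
    }

    /** Extracts the level from the low 7 bits. */
    static int level(byte embedding) {
        return embedding & 0x7F;
    }

    /** Reports whether the high bit (the override flag) is set. */
    static boolean isOverride(byte embedding) {
        return (embedding & 0x80) != 0;
    }
}
```

Because Java bytes are signed, an entry with the override bit set is negative when read as a plain byte, so masking as shown is needed before comparing levels.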
    • resolveWeakTypes

      private void resolveWeakTypes(int start, int limit, byte level, byte sor, byte eor)
      3) resolving weak types, Rules W1-W7. Note that some weak types (EN, AN) remain after this processing is complete.
    • resolveNeutralTypes

      private void resolveNeutralTypes(int start, int limit, byte level, byte sor, byte eor)
      6) resolving neutral types, Rules N1-N2.
    • resolveImplicitLevels

      private void resolveImplicitLevels(int start, int limit, byte level, byte sor, byte eor)
      7) resolving implicit embedding levels, Rules I1, I2.
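
As a rough illustration of rules I1 and I2: at an even level, R goes up one level and EN or AN go up two; at an odd level, L, EN and AN go up one. The sketch below assumes that only the types L, R, EN and AN remain at this stage; the enum `Type` and the class `ImplicitLevelsSketch` are hypothetical and do not mirror BidiOrder's fields.

```java
final class ImplicitLevelsSketch {
    /** Hypothetical stand-in for the types left after neutral resolution. */
    enum Type { L, R, EN, AN }

    /** Returns the resolved level of each character in a run at the given level. */
    static int[] resolve(Type[] types, int level) {
        int[] levels = new int[types.length];
        boolean evenLevel = (level & 1) == 0;
        for (int i = 0; i < types.length; i++) {
            int increment;
            if (evenLevel) {
                // Rule I1: L stays, R goes up one, EN and AN go up two.
                if (types[i] == Type.L) {
                    increment = 0;
                } else if (types[i] == Type.R) {
                    increment = 1;
                } else {
                    increment = 2;
                }
            } else {
                // Rule I2: R stays, L, EN and AN go up one.
                increment = (types[i] == Type.R) ? 0 : 1;
            }
            levels[i] = level + increment;
        }
        return levels;
    }
}
```

For the types L, R, EN, AN this gives levels 0, 1, 2, 2 in a level-0 run and 2, 1, 2, 2 in a level-1 run.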
    • getLevels

      public byte[] getLevels()
    • getLevels

      private byte[] getLevels(int[] linebreaks)
      Return levels array breaking lines at offsets in linebreaks.
      Rule L1.

      The returned levels array contains the resolved level for each bidi code passed to the constructor.

      The linebreaks array must include at least one value. The values must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text.

      Parameters:
      linebreaks - the offsets at which to break the paragraph
      Returns:
      the resolved levels of the text
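
The constraints on linebreaks stated above can be checked with a few lines of code. This is a sketch of such a check, not the class's own validateLineBreaks: the class name `LineBreaksSketch` is hypothetical, and the choice of `IllegalArgumentException` is an assumption, since the documentation only says that an exception is thrown.

```java
final class LineBreaksSketch {
    /**
     * Checks that linebreaks is non-empty, strictly increasing, starts at 1
     * or later, and ends exactly at textLength.
     */
    static void validate(int[] linebreaks, int textLength) {
        if (linebreaks.length == 0) {
            throw new IllegalArgumentException("linebreaks must contain at least one offset");
        }
        int previous = 0;
        for (int i = 0; i < linebreaks.length; i++) {
            int next = linebreaks[i];
            if (next <= previous) {
                // Catches values below 1, duplicates and decreasing offsets.
                throw new IllegalArgumentException("bad linebreak " + next + " at index " + i);
            }
            previous = next;
        }
        if (previous != textLength) {
            throw new IllegalArgumentException(
                    "last linebreak must be " + textLength + " but was " + previous);
        }
    }
}
```

For a text of length 10, the array {3, 7, 10} passes, while {3, 3, 10} (duplicate) and {3, 7} (does not end at the text length) are rejected.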
    • computeMultilineReordering

      private static int[] computeMultilineReordering(byte[] levels, int[] linebreaks)
      Return multiline reordering array for a given level array. Reordering does not occur across a line break.
    • computeReordering

      private static int[] computeReordering(byte[] levels)
      Return reordering array for a given level array. This reorders a single line. The reordering is a visual to logical map. For example, the leftmost char is string.charAt(order[0]). Rule L2.
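
Rule L2 reverses, from the highest level down to the lowest odd level, every contiguous run of characters at that level or higher. The following is a minimal sketch of how a visual-to-logical map can be built that way; it is written from the rule as stated in the Unicode Standard, not copied from this class, and the name `ReorderingSketch` is hypothetical.

```java
final class ReorderingSketch {
    /** Returns order such that order[visualIndex] == logicalIndex for one line. */
    static int[] computeReordering(byte[] levels) {
        int n = levels.length;
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i; // start from the identity (logical) order
        }

        int highest = 0;
        int lowestOdd = Byte.MAX_VALUE; // sentinel: no odd level seen yet
        for (byte level : levels) {
            if (level > highest) {
                highest = level;
            }
            if ((level & 1) != 0 && level < lowestOdd) {
                lowestOdd = level;
            }
        }

        // With no odd level the loop body never runs and the order stays as is.
        for (int level = highest; level >= lowestOdd; level--) {
            int i = 0;
            while (i < n) {
                if (levels[i] >= level) {
                    int start = i;
                    int limit = i + 1;
                    while (limit < n && levels[limit] >= level) {
                        limit++;
                    }
                    // Reverse the run [start, limit) in place.
                    for (int lo = start, hi = limit - 1; lo < hi; lo++, hi--) {
                        int tmp = order[lo];
                        order[lo] = order[hi];
                        order[hi] = tmp;
                    }
                    i = limit;
                } else {
                    i++;
                }
            }
        }
        return order;
    }
}
```

For the levels {0, 0, 1, 1, 1, 0} the result is {0, 1, 4, 3, 2, 5}: the three level-1 characters appear in reverse, and everything else keeps its position.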
    • getBaseLevel

      public byte getBaseLevel()
      Return the base level of the paragraph.
    • isWhitespace

      private static boolean isWhitespace(byte biditype)
      Return true if the type is considered a whitespace type for the line break rules.
    • typeForLevel

      private static byte typeForLevel(int level)
      Return the strong type (L or R) corresponding to the level.
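
The mapping from level to strong type depends only on the parity of the level: even levels are left-to-right, odd levels are right-to-left. A minimal sketch, in which the class name `TypeForLevelSketch` and the numeric values of `L` and `R` are placeholders rather than the constants defined by this class:

```java
final class TypeForLevelSketch {
    // Placeholder values, chosen only for this illustration.
    static final byte L = 0;
    static final byte R = 1;

    /** Even levels map to L, odd levels map to R. */
    static byte typeForLevel(int level) {
        return (level & 1) == 0 ? L : R;
    }
}
```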
    • findRunLimit

      private int findRunLimit(int index, int limit, byte[] validSet)
      Return the limit of the run starting at index that includes only resultTypes in validSet. This checks the value at index, and will return index if that value is not in validSet.
    • findRunStart

      private int findRunStart(int index, byte[] validSet)
      Return the start of the run including index that includes only resultTypes in validSet. This assumes the value at index is valid, and does not check it.
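
The difference between the two run searches (findRunLimit checks the value at index, findRunStart does not) can be seen in a small sketch. Here the types array is passed explicitly instead of being read from the resultTypes field, and the class name `RunSearchSketch` and the helper `contains` are hypothetical.

```java
final class RunSearchSketch {
    private static boolean contains(byte[] validSet, byte type) {
        for (byte valid : validSet) {
            if (valid == type) {
                return true;
            }
        }
        return false;
    }

    /** Advances from index while types stay in validSet; may return index itself. */
    static int findRunLimit(byte[] types, int index, int limit, byte[] validSet) {
        while (index < limit && contains(validSet, types[index])) {
            index++;
        }
        return index;
    }

    /** Walks backwards from index - 1; the value at index is assumed valid and is not checked. */
    static int findRunStart(byte[] types, int index, byte[] validSet) {
        while (index > 0 && contains(validSet, types[index - 1])) {
            index--;
        }
        return index;
    }
}
```

With types {1, 2, 2, 3, 2} and a valid set of {2}, findRunLimit from index 1 returns 3, findRunLimit from index 0 returns 0, and findRunStart from index 2 returns 1.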
    • setTypes

      private void setTypes(int start, int limit, byte newType)
      Set resultTypes from start up to (but not including) limit to newType.
    • setLevels

      private void setLevels(int start, int limit, byte newLevel)
      Set resultLevels from start up to (but not including) limit to newLevel.
    • validateTypes

      private static void validateTypes(byte[] types)
      Throw exception if type array is invalid.
    • validateParagraphEmbeddingLevel

      private static void validateParagraphEmbeddingLevel(byte paragraphEmbeddingLevel)
      Throw exception if paragraph embedding level is invalid. Special allowance for -1 so that default processing can still be performed when using this API.
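
Taken together with the statement that the paragraph level ends up as 0 or 1, this suggests three accepted inputs: -1 (use the default), 0 and 1. A sketch of such a check, assuming `IllegalArgumentException` as the exception type, which the documentation does not specify; the class name `ParagraphLevelCheckSketch` is hypothetical.

```java
final class ParagraphLevelCheckSketch {
    /** Accepts -1 (determine the level from the text), 0 (LTR) and 1 (RTL). */
    static void validate(byte paragraphEmbeddingLevel) {
        if (paragraphEmbeddingLevel != -1
                && paragraphEmbeddingLevel != 0
                && paragraphEmbeddingLevel != 1) {
            throw new IllegalArgumentException(
                    "illegal paragraph embedding level: " + paragraphEmbeddingLevel);
        }
    }
}
```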
    • validateLineBreaks

      private static void validateLineBreaks(int[] linebreaks, int textLength)
      Throw exception if line breaks array is invalid.