Package com.itextpdf.text.pdf.languages
Class ArabicLigaturizer
- java.lang.Object
-
- com.itextpdf.text.pdf.languages.ArabicLigaturizer
-
- All Implemented Interfaces:
LanguageProcessor
public class ArabicLigaturizer extends java.lang.Object implements LanguageProcessor
Shapes Arabic characters. This code was inspired by Pango, an LGPL-licensed C library (see http://www.pango.com/). Note that the code of this class is the original work of Paulo Soares.
-
-
Nested Class Summary
Nested Classes
Modifier and Type Class Description
(package private) static class
ArabicLigaturizer.charstruct
-
Field Summary
Fields
Modifier and Type Field Description
private static char
ALEF
private static char
ALEFHAMZA
private static char
ALEFHAMZABELOW
private static char
ALEFMADDA
private static char
ALEFMAKSURA
static int
ar_composedtashkeel
static int
ar_lig
static int
ar_nothing
static int
ar_novowel
private static char[][]
chartable
private static char
DAMMA
static int
DIGIT_TYPE_AN
Digit type option: Use Arabic-Indic digits (U+0660...U+0669).
static int
DIGIT_TYPE_AN_EXTENDED
Digit type option: Use Eastern (Extended) Arabic-Indic digits (U+06F0...U+06F9).
static int
DIGIT_TYPE_MASK
Bit mask for digit type options.
static int
DIGITS_AN2EN
Digit shaping option: Replace Arabic-Indic digits by European digits (U+0030...U+0039).
static int
DIGITS_EN2AN
Digit shaping option: Replace European digits (U+0030...U+0039) by Arabic-Indic digits.
static int
DIGITS_EN2AN_INIT_AL
Digit shaping option: Replace European digits (U+0030...U+0039) by Arabic-Indic digits if the most recent strongly directional character is an Arabic letter (its Bidi direction value is RIGHT_TO_LEFT_ARABIC).
static int
DIGITS_EN2AN_INIT_LR
Digit shaping option: Replace European digits (U+0030...U+0039) by Arabic-Indic digits if the most recent strongly directional character is an Arabic letter (its Bidi direction value is RIGHT_TO_LEFT_ARABIC).
static int
DIGITS_MASK
Bit mask for digit shaping options.
private static int
DIGITS_RESERVED
Not a valid option value.
private static char
FARSIYEH
private static char
FATHA
private static char
HAMZA
private static char
HAMZAABOVE
private static char
HAMZABELOW
private static char
KASRA
private static char
LAM
private static char
LAM_ALEF
private static char
LAM_ALEFHAMZA
private static char
LAM_ALEFHAMZABELOW
private static char
LAM_ALEFMADDA
private static char
MADDA
private static java.util.HashMap<java.lang.Character,char[]>
maptable
protected int
options
private static java.util.HashMap<java.lang.Character,java.lang.Character>
reverseLigatureMapTable
Some fonts do not implement ligaturized variations of Arabic characters.
protected int
runDirection
private static char
SHADDA
private static char
TATWEEL
private static char
WAW
private static char
WAWHAMZA
private static char
YEH
private static char
YEHHAMZA
private static char
ZWJ
-
Constructor Summary
Constructors
Constructor Description
ArabicLigaturizer()
ArabicLigaturizer(int runDirection, int options)
-
Method Summary
Methods
Modifier and Type Method Description
static int
arabic_shape(char[] src, int srcoffset, int srclength, char[] dest, int destoffset, int destlength, int level)
(package private) static char
charshape(char s, int which)
(package private) static boolean
connects_to_left(ArabicLigaturizer.charstruct a)
(package private) static void
copycstostring(java.lang.StringBuffer string, ArabicLigaturizer.charstruct s, int level)
(package private) static void
doublelig(java.lang.StringBuffer string, int level)
static java.lang.Character
getReverseMapping(char c)
boolean
isRTL()
Arabic is written from right to left.
(package private) static boolean
isVowel(char s)
(package private) static int
ligature(char newchar, ArabicLigaturizer.charstruct oldchar)
java.lang.String
process(java.lang.String s)
Processes a String.
static void
processNumbers(char[] text, int offset, int length, int options)
(package private) static void
shape(char[] text, java.lang.StringBuffer string, int level)
(package private) static int
shapecount(char s)
(package private) static void
shapeToArabicDigitsWithContext(char[] dest, int start, int length, char digitBase, boolean lastStrongWasAL)
-
-
-
Field Detail
-
maptable
private static final java.util.HashMap<java.lang.Character,char[]> maptable
-
reverseLigatureMapTable
private static final java.util.HashMap<java.lang.Character,java.lang.Character> reverseLigatureMapTable
Some fonts do not implement ligaturized variations of Arabic characters; for example, Simplified Arabic has code point 0xFEED but not 0xFEEE.
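The fallback idea described for reverseLigatureMapTable can be sketched as follows. This is a minimal illustration, not the library's code: the class ReverseLigatureSketch, the method pickGlyph, and the font-coverage predicate are hypothetical, and only the 0xFEEE to 0xFEED pair is taken from the description above. The real table's contents are not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

final class ReverseLigatureSketch {
    // Hypothetical table: maps a presentation form that a font may lack to a
    // sibling form it is more likely to contain. Only this one pair is taken
    // from the field description; everything else is an assumption.
    private static final Map<Character, Character> REVERSE = new HashMap<>();
    static {
        REVERSE.put('\uFEEE', '\uFEED');
    }

    /** Returns the fallback character, or null when no mapping is known. */
    static Character reverseMapping(char c) {
        return REVERSE.get(c);
    }

    /** Keeps c when the font has it; otherwise tries the fallback. */
    static char pickGlyph(char c, IntPredicate fontHasGlyph) {
        if (fontHasGlyph.test(c)) {
            return c;
        }
        Character alt = reverseMapping(c);
        return alt != null ? alt.charValue() : c;
    }

    public static void main(String[] args) {
        // A pretend font that lacks U+FEEE.
        char shown = pickGlyph('\uFEEE', cp -> cp != 0xFEEE);
        System.out.println(Integer.toHexString(shown)); // prints feed
    }
}
```

Returning null for an unmapped character mirrors the java.lang.Character return type of getReverseMapping, which suggests (but does not prove) that the real method can also return null.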
-
ALEF
private static final char ALEF
- See Also:
- Constant Field Values
-
ALEFHAMZA
private static final char ALEFHAMZA
- See Also:
- Constant Field Values
-
ALEFHAMZABELOW
private static final char ALEFHAMZABELOW
- See Also:
- Constant Field Values
-
ALEFMADDA
private static final char ALEFMADDA
- See Also:
- Constant Field Values
-
LAM
private static final char LAM
- See Also:
- Constant Field Values
-
HAMZA
private static final char HAMZA
- See Also:
- Constant Field Values
-
TATWEEL
private static final char TATWEEL
- See Also:
- Constant Field Values
-
ZWJ
private static final char ZWJ
- See Also:
- Constant Field Values
-
HAMZAABOVE
private static final char HAMZAABOVE
- See Also:
- Constant Field Values
-
HAMZABELOW
private static final char HAMZABELOW
- See Also:
- Constant Field Values
-
WAWHAMZA
private static final char WAWHAMZA
- See Also:
- Constant Field Values
-
YEHHAMZA
private static final char YEHHAMZA
- See Also:
- Constant Field Values
-
WAW
private static final char WAW
- See Also:
- Constant Field Values
-
ALEFMAKSURA
private static final char ALEFMAKSURA
- See Also:
- Constant Field Values
-
YEH
private static final char YEH
- See Also:
- Constant Field Values
-
FARSIYEH
private static final char FARSIYEH
- See Also:
- Constant Field Values
-
SHADDA
private static final char SHADDA
- See Also:
- Constant Field Values
-
KASRA
private static final char KASRA
- See Also:
- Constant Field Values
-
FATHA
private static final char FATHA
- See Also:
- Constant Field Values
-
DAMMA
private static final char DAMMA
- See Also:
- Constant Field Values
-
MADDA
private static final char MADDA
- See Also:
- Constant Field Values
-
LAM_ALEF
private static final char LAM_ALEF
- See Also:
- Constant Field Values
-
LAM_ALEFHAMZA
private static final char LAM_ALEFHAMZA
- See Also:
- Constant Field Values
-
LAM_ALEFHAMZABELOW
private static final char LAM_ALEFHAMZABELOW
- See Also:
- Constant Field Values
-
LAM_ALEFMADDA
private static final char LAM_ALEFMADDA
- See Also:
- Constant Field Values
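The LAM_ALEF* constant names suggest the lam-alef ligatures. As a rough illustration of that kind of substitution, the sketch below replaces LAM followed by an ALEF variant with the corresponding Unicode isolated presentation form. The class and method names are hypothetical, the code points are standard Unicode values rather than values read from this page, and the sketch always emits the isolated form, whereas a real shaper also has to consider whether the LAM joins to the preceding letter.

```java
final class LamAlefSketch {
    static final char LAM = '\u0644'; // U+0644 ARABIC LETTER LAM

    /**
     * Returns the isolated lam-alef presentation form for LAM followed by
     * the given alef variant, or 0 when the character is not an alef variant.
     */
    static char lamAlefLigature(char alef) {
        switch (alef) {
            case '\u0627': return '\uFEFB'; // ALEF
            case '\u0622': return '\uFEF5'; // ALEF WITH MADDA ABOVE
            case '\u0623': return '\uFEF7'; // ALEF WITH HAMZA ABOVE
            case '\u0625': return '\uFEF9'; // ALEF WITH HAMZA BELOW
            default: return 0;
        }
    }

    /** Replaces each LAM + alef pair with a single ligature character. */
    static String ligate(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == LAM && i + 1 < s.length()) {
                char lig = lamAlefLigature(s.charAt(i + 1));
                if (lig != 0) {
                    out.append(lig);
                    i++; // the alef has been consumed by the ligature
                    continue;
                }
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String shaped = ligate("\u0644\u0627");
        System.out.println(Integer.toHexString(shaped.charAt(0))); // prints fefb
    }
}
```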
-
chartable
private static final char[][] chartable
-
ar_nothing
public static final int ar_nothing
- See Also:
- Constant Field Values
-
ar_novowel
public static final int ar_novowel
- See Also:
- Constant Field Values
-
ar_composedtashkeel
public static final int ar_composedtashkeel
- See Also:
- Constant Field Values
-
ar_lig
public static final int ar_lig
- See Also:
- Constant Field Values
-
DIGITS_EN2AN
public static final int DIGITS_EN2AN
Digit shaping option: Replace European digits (U+0030...U+0039) by Arabic-Indic digits.
- See Also:
- Constant Field Values
-
DIGITS_AN2EN
public static final int DIGITS_AN2EN
Digit shaping option: Replace Arabic-Indic digits by European digits (U+0030...U+0039).
- See Also:
- Constant Field Values
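The two unconditional replacements described by DIGITS_EN2AN and DIGITS_AN2EN amount to shifting each digit between two ten-character ranges. The sketch below shows that arithmetic in isolation. It is not the library's processNumbers implementation; the class and method names are hypothetical, and the digit base is passed directly instead of being decoded from an options bit field, because the constant values are not shown on this page.

```java
final class DigitShapingSketch {
    static final char ARABIC_INDIC_ZERO = '\u0660';          // U+0660...U+0669
    static final char EXTENDED_ARABIC_INDIC_ZERO = '\u06F0'; // U+06F0...U+06F9

    /** Replaces European digits in place with digits starting at digitBase. */
    static void europeanToArabic(char[] text, int offset, int length, char digitBase) {
        for (int i = offset; i < offset + length; i++) {
            char c = text[i];
            if (c >= '0' && c <= '9') {
                text[i] = (char) (digitBase + (c - '0'));
            }
        }
    }

    /** Replaces digits starting at digitBase in place with European digits. */
    static void arabicToEuropean(char[] text, int offset, int length, char digitBase) {
        for (int i = offset; i < offset + length; i++) {
            char c = text[i];
            if (c >= digitBase && c <= digitBase + 9) {
                text[i] = (char) ('0' + (c - digitBase));
            }
        }
    }

    public static void main(String[] args) {
        char[] text = "2024".toCharArray();
        europeanToArabic(text, 0, text.length, ARABIC_INDIC_ZERO);
        arabicToEuropean(text, 0, text.length, ARABIC_INDIC_ZERO);
        System.out.println(new String(text)); // prints 2024 after the round trip
    }
}
```

Passing EXTENDED_ARABIC_INDIC_ZERO instead selects the Eastern (Extended) range that DIGIT_TYPE_AN_EXTENDED refers to.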
-
DIGITS_EN2AN_INIT_LR
public static final int DIGITS_EN2AN_INIT_LR
Digit shaping option: Replace European digits (U+0030...U+0039) by Arabic-Indic digits if the most recent strongly directional character is an Arabic letter (its Bidi direction value is RIGHT_TO_LEFT_ARABIC). The initial state at the start of the text is assumed to be not an Arabic letter, so European digits at the start of the text will not change. Compare to DIGITS_EN2AN_INIT_AL.
- See Also:
- Constant Field Values
-
DIGITS_EN2AN_INIT_AL
public static final int DIGITS_EN2AN_INIT_AL
Digit shaping option: Replace European digits (U+0030...U+0039) by Arabic-Indic digits if the most recent strongly directional character is an Arabic letter (its Bidi direction value is RIGHT_TO_LEFT_ARABIC). The initial state at the start of the text is assumed to be an Arabic letter, so European digits at the start of the text will change. Compare to DIGITS_EN2AN_INIT_LR.
- See Also:
- Constant Field Values
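The difference between DIGITS_EN2AN_INIT_LR and DIGITS_EN2AN_INIT_AL is only the assumed state before the first strongly directional character. The sketch below illustrates that rule using java.lang.Character.getDirectionality. Its parameter list mirrors the signature of shapeToArabicDigitsWithContext listed on this page, but the body is an assumption written from the option descriptions, not the library's code, and the class name is hypothetical.

```java
final class ContextualDigitSketch {
    /**
     * Replaces European digits in place, but only while the most recent
     * strongly directional character was an Arabic letter.
     * lastStrongWasAL is the assumed state at the start of the text:
     * false corresponds to the INIT_LR option, true to INIT_AL.
     */
    static void shapeWithContext(char[] text, int start, int length,
                                 char digitBase, boolean lastStrongWasAL) {
        for (int i = start; i < start + length; i++) {
            char c = text[i];
            switch (Character.getDirectionality(c)) {
                case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
                case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
                    lastStrongWasAL = false; // strong, but not an Arabic letter
                    break;
                case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC:
                    lastStrongWasAL = true;
                    break;
                case Character.DIRECTIONALITY_EUROPEAN_NUMBER:
                    if (lastStrongWasAL && c >= '0' && c <= '9') {
                        text[i] = (char) (digitBase + (c - '0'));
                    }
                    break;
                default:
                    break; // neutral characters leave the state unchanged
            }
        }
    }

    public static void main(String[] args) {
        char[] initAl = "12 abc 34".toCharArray();
        shapeWithContext(initAl, 0, initAl.length, '\u0660', true);
        // Only the leading "12" is replaced; "34" follows Latin letters.
        System.out.println(new String(initAl));
    }
}
```

With lastStrongWasAL set to false, the same input is left unchanged, which matches the statement that European digits at the start of the text will not change under DIGITS_EN2AN_INIT_LR.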
-
DIGITS_RESERVED
private static final int DIGITS_RESERVED
Not a valid option value.
- See Also:
- Constant Field Values
-
DIGITS_MASK
public static final int DIGITS_MASK
Bit mask for digit shaping options.
- See Also:
- Constant Field Values
-
DIGIT_TYPE_AN
public static final int DIGIT_TYPE_AN
Digit type option: Use Arabic-Indic digits (U+0660...U+0669).
- See Also:
- Constant Field Values
-
DIGIT_TYPE_AN_EXTENDED
public static final int DIGIT_TYPE_AN_EXTENDED
Digit type option: Use Eastern (Extended) Arabic-Indic digits (U+06F0...U+06F9).
- See Also:
- Constant Field Values
-
DIGIT_TYPE_MASK
public static final int DIGIT_TYPE_MASK
Bit mask for digit type options.
- See Also:
- Constant Field Values
-
options
protected int options
-
runDirection
protected int runDirection
-
-
Method Detail
-
isVowel
static boolean isVowel(char s)
-
charshape
static char charshape(char s, int which)
-
shapecount
static int shapecount(char s)
-
ligature
static int ligature(char newchar, ArabicLigaturizer.charstruct oldchar)
-
copycstostring
static void copycstostring(java.lang.StringBuffer string, ArabicLigaturizer.charstruct s, int level)
-
doublelig
static void doublelig(java.lang.StringBuffer string, int level)
-
connects_to_left
static boolean connects_to_left(ArabicLigaturizer.charstruct a)
-
shape
static void shape(char[] text, java.lang.StringBuffer string, int level)
-
arabic_shape
public static int arabic_shape(char[] src, int srcoffset, int srclength, char[] dest, int destoffset, int destlength, int level)
-
processNumbers
public static void processNumbers(char[] text, int offset, int length, int options)
-
shapeToArabicDigitsWithContext
static void shapeToArabicDigitsWithContext(char[] dest, int start, int length, char digitBase, boolean lastStrongWasAL)
-
getReverseMapping
public static java.lang.Character getReverseMapping(char c)
-
process
public java.lang.String process(java.lang.String s)
Description copied from interface: LanguageProcessor
Processes a String.
- Specified by:
process in interface LanguageProcessor
- Parameters:
s - the original String
- Returns:
- the processed String
-
isRTL
public boolean isRTL()
Arabic is written from right to left.
- Specified by:
isRTL in interface LanguageProcessor
- Returns:
- true
- See Also:
LanguageProcessor.isRTL()
-
-