Class ArabicNormalizer

java.lang.Object
org.apache.lucene.analysis.ar.ArabicNormalizer

class ArabicNormalizer extends Object
Normalizer for Arabic.

Normalization is done in-place for efficiency, operating on a term buffer.
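
As a rough illustration of what in-place operation on a buffer can look like, the sketch below removes a character by shifting the tail of the array left and returning the new logical length, so no second array is allocated. The class name `InPlaceBufferSketch` and the `delete` helper are hypothetical names introduced for this example; they are not part of this class's documented API.

```java
public class InPlaceBufferSketch {

    // Hypothetical helper (an assumption for this sketch): removes the char at
    // index pos from the first len chars of s by shifting the tail left, and
    // returns the new logical length. The array itself is never reallocated.
    public static int delete(char[] s, int pos, int len) {
        if (pos < 0 || pos >= len) {
            throw new IllegalArgumentException("pos out of range: " + pos);
        }
        if (pos < len - 1) {
            // System.arraycopy handles overlapping source and destination.
            System.arraycopy(s, pos + 1, s, pos, len - pos - 1);
        }
        return len - 1;
    }

    public static void main(String[] args) {
        char[] buffer = "abXcd".toCharArray();
        int len = buffer.length;
        len = delete(buffer, 2, len);
        // Only the first len chars are meaningful; anything after is stale.
        System.out.println(new String(buffer, 0, len)); // prints "abcd"
    }
}
```

The caller has to carry the returned length forward, because the array keeps its original capacity and the trailing slots still hold old data.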

Normalization is defined as:

  • Normalization of hamza with alef seat to a bare alef.
  • Normalization of teh marbuta to heh.
  • Normalization of dotless yeh (alef maksura) to yeh.
  • Removal of Arabic diacritics (the harakat).
  • Removal of tatweel (stretching character).
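
The rules above can be sketched as a single pass over a `char[]` with separate read and write positions. This is a minimal standalone illustration, not the source of this class: the class name `ArabicNormalizationSketch`, the constant names, and the exact set of code points (including treating alef with madda as an alef variant, and U+064B through U+0652 as the diacritics) are assumptions made for the example.

```java
public class ArabicNormalizationSketch {

    // Code points are assumptions chosen for this sketch.
    static final char ALEF = '\u0627';
    static final char ALEF_MADDA = '\u0622';
    static final char ALEF_HAMZA_ABOVE = '\u0623';
    static final char ALEF_HAMZA_BELOW = '\u0625';
    static final char YEH = '\u064A';
    static final char DOTLESS_YEH = '\u0649';
    static final char TEH_MARBUTA = '\u0629';
    static final char HEH = '\u0647';
    static final char TATWEEL = '\u0640';
    static final char FATHATAN = '\u064B'; // first diacritic in the assumed range
    static final char SUKUN = '\u0652';    // last diacritic in the assumed range

    // Normalizes the first len chars of s in place and returns the new length.
    public static int normalize(char[] s, int len) {
        int out = 0; // write index, never ahead of the read index i
        for (int i = 0; i < len; i++) {
            char c = s[i];
            switch (c) {
                case ALEF_MADDA:
                case ALEF_HAMZA_ABOVE:
                case ALEF_HAMZA_BELOW:
                    s[out++] = ALEF;
                    break;
                case DOTLESS_YEH:
                    s[out++] = YEH;
                    break;
                case TEH_MARBUTA:
                    s[out++] = HEH;
                    break;
                case TATWEEL:
                    break; // dropped
                default:
                    if (c < FATHATAN || c > SUKUN) {
                        s[out++] = c; // keep everything that is not a diacritic
                    }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Alef with hamza above, fatha, hah, sukun, meem, fatha, dal.
        char[] buf = "\u0623\u064E\u062D\u0652\u0645\u064E\u062F".toCharArray();
        int newLen = normalize(buf, buf.length);
        System.out.println(newLen); // prints 4
    }
}
```

Because every rule either replaces one character with another or drops it, the output never grows, which is what makes a single in-place pass possible.
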
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
    (package private) static final char
     
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    (package private) int
    normalize(char[] s, int len)
    Normalize an input buffer of Arabic text.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait