Class TokenInfoMorphData
java.lang.Object
org.apache.lucene.analysis.ja.dict.TokenInfoMorphData
- All Implemented Interfaces:
JaMorphData, MorphData
- Direct Known Subclasses:
UnknownMorphData
Morphological information for the system dictionary.
-
Field Summary
Fields
Modifier and Type | Field | Description
private final ByteBuffer | buffer |
static final int | HAS_BASEFORM | flag that the entry has baseform data.
static final int | HAS_PRONUNCIATION | flag that the entry has pronunciation data.
static final int | HAS_READING | flag that the entry has reading data.
private final String[] | inflFormDict |
private final String[] | inflTypeDict |
private final String[] | posDict |
-
Constructor Summary
Constructors
TokenInfoMorphData(ByteBuffer buffer, IOSupplier<InputStream> posResource)
-
Method Summary
Modifier and Type | Method | Description
private static int | baseFormOffset(int wordId) |
String | getBaseForm(int morphId, char[] surfaceForm, int off, int len) | Get base form of word
String | getInflectionForm(int wordId) | Get inflection form of tokens
String | getInflectionType(int morphId) | Get inflection type of tokens
int | getLeftId(int morphId) | Get left id of specified word
String | getPartOfSpeech(int morphId) | Get Part-Of-Speech of tokens
String | getPronunciation(int morphId, char[] surface, int off, int len) | Get pronunciation of tokens
String | getReading(int morphId, char[] surface, int off, int len) | Get reading of tokens
int | getRightId(int morphId) | Get right id of specified word
int | getWordCost(int morphId) | Get word cost of specified word
private boolean | hasBaseFormData(int wordId) |
private boolean | hasPronunciationData(int wordId) |
private boolean | hasReadingData(int wordId) |
private static void | populatePosDict(DataInput in, int posSize, String[] posDict, String[] inflTypeDict, String[] inflFormDict) |
private int | pronunciationOffset(int wordId) |
private int | readingOffset(int wordId) |
private String | readString(int offset, int length, boolean kana) |
-
Field Details
-
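buffer
private final ByteBuffer buffer
-
posDict
private final String[] posDict
-
inflTypeDict
private final String[] inflTypeDict
-
inflFormDict
private final String[] inflFormDict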
-
HAS_BASEFORM
public static final int HAS_BASEFORM
Flag that the entry has baseform data. Otherwise the word is not inflected (the base form is the same as the surface form).
-
HAS_READING
public static final int HAS_READING
Flag that the entry has reading data. Otherwise the reading is the surface form converted to katakana.
-
HAS_PRONUNCIATION
public static final int HAS_PRONUNCIATION
Flag that the entry has pronunciation data. Otherwise the pronunciation is the reading.
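The fallback rules described by the three flags can be sketched as follows. This is a minimal illustration, not the class's actual implementation: the class name MorphFlagSketch, the helper methods, and the bit values 1, 2 and 4 are assumptions made for the example, and the katakana conversion assumes the usual fixed offset of 0x60 between the hiragana and katakana Unicode blocks.

```java
final class MorphFlagSketch {
  // Illustrative bit values only (an assumption of this sketch);
  // the real constants are defined on TokenInfoMorphData.
  static final int HAS_BASEFORM = 1;
  static final int HAS_READING = 2;
  static final int HAS_PRONUNCIATION = 4;

  /** Shifts hiragana characters (U+3041..U+3096) into the katakana block. */
  static String toKatakana(String surface) {
    char[] out = surface.toCharArray();
    for (int i = 0; i < out.length; i++) {
      if (out[i] >= 0x3041 && out[i] <= 0x3096) {
        out[i] += 0x60;
      }
    }
    return new String(out);
  }

  /** No base form data: the word is not inflected, reported here as null. */
  static String baseForm(int flags, String storedBaseForm) {
    return (flags & HAS_BASEFORM) != 0 ? storedBaseForm : null;
  }

  /** No reading data: fall back to the surface form converted to katakana. */
  static String reading(int flags, String surface, String storedReading) {
    return (flags & HAS_READING) != 0 ? storedReading : toKatakana(surface);
  }

  /** No pronunciation data: fall back to the reading. */
  static String pronunciation(
      int flags, String surface, String storedReading, String storedPronunciation) {
    return (flags & HAS_PRONUNCIATION) != 0
        ? storedPronunciation
        : reading(flags, surface, storedReading);
  }
}
```

With no flags set, the pronunciation falls through to the reading, which in turn falls through to the katakana form of the surface text.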
-
-
Constructor Details
-
TokenInfoMorphData
TokenInfoMorphData(ByteBuffer buffer, IOSupplier<InputStream> posResource) throws IOException
- Throws:
IOException
-
-
Method Details
-
populatePosDict
private static void populatePosDict(DataInput in, int posSize, String[] posDict, String[] inflTypeDict, String[] inflFormDict) throws IOException
- Throws:
IOException
-
getLeftId
public int getLeftId(int morphId)
Description copied from interface: MorphData
Get left id of specified word
-
getRightId
public int getRightId(int morphId)
Description copied from interface: MorphData
Get right id of specified word
- Specified by:
getRightId in interface MorphData
- Returns:
right id
-
getWordCost
public int getWordCost(int morphId)
Description copied from interface: MorphData
Get word cost of specified word
- Specified by:
getWordCost in interface MorphData
- Returns:
word's cost
-
getBaseForm
public String getBaseForm(int morphId, char[] surfaceForm, int off, int len)
Description copied from interface: JaMorphData
Get base form of word
- Specified by:
getBaseForm in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
Base form (only different for inflected words, otherwise null)
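Because getBaseForm returns null for words that are not inflected, callers typically need a fallback to the surface text. The sketch below shows one way to do that. BaseFormSource is a hypothetical stand-in interface that only mirrors the getBaseForm signature listed above, and baseFormOrSurface is a helper invented for this example; neither is part of the documented API.

```java
final class BaseFormFallbackSketch {
  /** Hypothetical stand-in that mirrors only the getBaseForm signature. */
  interface BaseFormSource {
    String getBaseForm(int morphId, char[] surfaceForm, int off, int len);
  }

  /**
   * Returns the base form when one is available, otherwise the surface
   * slice text[off .. off + len) as a String.
   */
  static String baseFormOrSurface(
      BaseFormSource data, int morphId, char[] text, int off, int len) {
    String base = data.getBaseForm(morphId, text, off, len);
    return base != null ? base : new String(text, off, len);
  }
}
```

The off and len arguments select the token's surface form inside a larger character buffer, so the fallback copies only that slice.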
-
getReading
public String getReading(int morphId, char[] surface, int off, int len)
Description copied from interface: JaMorphData
Get reading of tokens
- Specified by:
getReading in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
Reading of the token
-
getPartOfSpeech
public String getPartOfSpeech(int morphId)
Description copied from interface: JaMorphData
Get Part-Of-Speech of tokens
- Specified by:
getPartOfSpeech in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
Part-Of-Speech of the token
-
getPronunciation
public String getPronunciation(int morphId, char[] surface, int off, int len)
Description copied from interface: JaMorphData
Get pronunciation of tokens
- Specified by:
getPronunciation in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
Pronunciation of the token
-
getInflectionType
public String getInflectionType(int morphId)
Description copied from interface: JaMorphData
Get inflection type of tokens
- Specified by:
getInflectionType in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
inflection type, or null
-
getInflectionForm
public String getInflectionForm(int wordId)
Description copied from interface: JaMorphData
Get inflection form of tokens
- Specified by:
getInflectionForm in interface JaMorphData
- Parameters:
wordId - word ID of token
- Returns:
inflection form, or null
-
readingOffset
private int readingOffset(int wordId)
-
pronunciationOffset
private int pronunciationOffset(int wordId)
-
baseFormOffset
private static int baseFormOffset(int wordId)
-
hasBaseFormData
private boolean hasBaseFormData(int wordId)
-
hasReadingData
private boolean hasReadingData(int wordId)
-
hasPronunciationData
private boolean hasPronunciationData(int wordId)
-
readString
private String readString(int offset, int length, boolean kana)
-