public abstract class AnalyzerWrapper extends Analyzer

Extension to Analyzer suitable for Analyzers which wrap other Analyzers.

getWrappedAnalyzer(String) allows the Analyzer to wrap multiple Analyzers which are selected on a per-field basis.

wrapComponents(String, Analyzer.TokenStreamComponents) allows the TokenStreamComponents of the wrapped Analyzer to then be wrapped (such as adding a new TokenFilter) to form new TokenStreamComponents.

wrapReader(String, Reader) allows the Reader of the wrapped Analyzer to then be wrapped (such as adding a new CharFilter).
Important: If you do not want to wrap the TokenStream using wrapComponents(String, Analyzer.TokenStreamComponents) or the Reader using wrapReader(String, Reader), and just want to delegate to other Analyzers (e.g., by field name), use DelegatingAnalyzerWrapper as the superclass!

See Also: DelegatingAnalyzerWrapper
Nested classes inherited from class Analyzer: Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents

Fields inherited from class Analyzer: GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
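The hooks above can be sketched in a small subclass. The following is an illustrative example, not part of Lucene: a hypothetical PerFieldLowercasingAnalyzer that selects the wrapped Analyzer per field via getWrappedAnalyzer and appends a LowerCaseFilter via wrapComponents (the field name "id" and both analyzer choices are assumptions for the sketch).

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;

// Hypothetical example class, not part of Lucene.
public class PerFieldLowercasingAnalyzer extends AnalyzerWrapper {
  private final Analyzer defaultAnalyzer = new WhitespaceAnalyzer();
  private final Analyzer idAnalyzer = new KeywordAnalyzer();

  public PerFieldLowercasingAnalyzer() {
    // Different wrapped analyzers per field, so use the per-field strategy.
    super(Analyzer.PER_FIELD_REUSE_STRATEGY);
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    // Select the wrapped Analyzer on a per-field basis.
    return "id".equals(fieldName) ? idAnalyzer : defaultAnalyzer;
  }

  @Override
  protected TokenStreamComponents wrapComponents(
      String fieldName, TokenStreamComponents components) {
    // Wrap the delegate's chain with an extra TokenFilter.
    return new TokenStreamComponents(
        components.getSource(),
        new LowerCaseFilter(components.getTokenStream()));
  }
}
```

With this sketch, analyzing "Hello WORLD" in a non-"id" field goes through the WhitespaceAnalyzer and is then lowercased by the added filter.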
Modifier | Constructor and Description |
---|---|
protected | AnalyzerWrapper(Analyzer.ReuseStrategy reuseStrategy) Creates a new AnalyzerWrapper with the given reuse strategy. |
Modifier and Type | Method and Description |
---|---|
protected AttributeFactory | attributeFactory(String fieldName) Return the AttributeFactory to be used for analysis and normalization on the given FieldName. |
protected Analyzer.TokenStreamComponents | createComponents(String fieldName) Creates a new Analyzer.TokenStreamComponents instance for this analyzer. |
int | getOffsetGap(String fieldName) Just like Analyzer.getPositionIncrementGap(java.lang.String), except for Token offsets instead. |
int | getPositionIncrementGap(String fieldName) Invoked before indexing an IndexableField instance if terms have already been added to that field. |
protected abstract Analyzer | getWrappedAnalyzer(String fieldName) Retrieves the wrapped Analyzer appropriate for analyzing the field with the given name. |
Reader | initReader(String fieldName, Reader reader) Override this if you want to add a CharFilter chain. |
protected Reader | initReaderForNormalization(String fieldName, Reader reader) Wrap the given Reader with CharFilters that make sense for normalization. |
protected TokenStream | normalize(String fieldName, TokenStream in) Wrap the given TokenStream in order to apply normalization filters. |
protected Analyzer.TokenStreamComponents | wrapComponents(String fieldName, Analyzer.TokenStreamComponents components) Wraps / alters the given TokenStreamComponents, taken from the wrapped Analyzer, to form new components. |
protected Reader | wrapReader(String fieldName, Reader reader) Wraps / alters the given Reader. |
protected Reader | wrapReaderForNormalization(String fieldName, Reader reader) Wraps / alters the given Reader for normalization. |
protected TokenStream | wrapTokenStreamForNormalization(String fieldName, TokenStream in) Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form new components. |
Methods inherited from class Analyzer: close, getReuseStrategy, getVersion, normalize, setVersion, tokenStream, tokenStream
protected AnalyzerWrapper(Analyzer.ReuseStrategy reuseStrategy)

Creates a new AnalyzerWrapper with the given reuse strategy. If you want to wrap a single delegate Analyzer you can probably reuse its strategy when instantiating this subclass: super(delegate.getReuseStrategy());. If you choose different analyzers per field, use Analyzer.PER_FIELD_REUSE_STRATEGY.

See Also: Analyzer.getReuseStrategy()
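The single-delegate case the constructor docs describe can be sketched as follows. The class and its token cap are hypothetical, not Lucene API; only AnalyzerWrapper, LimitTokenCountFilter, and the reuse-strategy call are real Lucene names.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;
import org.apache.lucene.analysis.miscellaneous.LimitTokenCountFilter;

// Hypothetical example class, not part of Lucene.
public class LimitingAnalyzer extends AnalyzerWrapper {
  private final Analyzer delegate;
  private final int maxTokens;

  public LimitingAnalyzer(Analyzer delegate, int maxTokens) {
    // Single delegate: reuse its strategy, as the constructor docs suggest.
    super(delegate.getReuseStrategy());
    this.delegate = delegate;
    this.maxTokens = maxTokens;
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return delegate;
  }

  @Override
  protected TokenStreamComponents wrapComponents(
      String fieldName, TokenStreamComponents components) {
    // Drop all tokens past the configured limit.
    return new TokenStreamComponents(
        components.getSource(),
        new LimitTokenCountFilter(components.getTokenStream(), maxTokens));
  }
}
```

Note that because this sketch also wraps the token stream, AnalyzerWrapper is the right superclass; a wrapper that only delegated would use DelegatingAnalyzerWrapper instead, as the class docs advise.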
protected abstract Analyzer getWrappedAnalyzer(String fieldName)

Retrieves the wrapped Analyzer appropriate for analyzing the field with the given name.

Parameters:
fieldName - Name of the field which is to be analyzed

protected Analyzer.TokenStreamComponents wrapComponents(String fieldName, Analyzer.TokenStreamComponents components)

Wraps / alters the given TokenStreamComponents, taken from the wrapped Analyzer, to form new components.

Parameters:
fieldName - Name of the field which is to be analyzed
components - TokenStreamComponents taken from the wrapped Analyzer

protected TokenStream wrapTokenStreamForNormalization(String fieldName, TokenStream in)

Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form new components.

Parameters:
fieldName - Name of the field which is to be analyzed
in - TokenStream taken from the wrapped Analyzer

protected Reader wrapReader(String fieldName, Reader reader)

Wraps / alters the given Reader. Through this method AnalyzerWrappers can implement initReader(String, Reader). By default, the given reader is returned.

Parameters:
fieldName - name of the field which is to be analyzed
reader - the reader to wrap

protected Reader wrapReaderForNormalization(String fieldName, Reader reader)

Wraps / alters the given Reader for normalization. Through this method AnalyzerWrappers can implement initReaderForNormalization(String, Reader). By default, the given reader is returned.

Parameters:
fieldName - name of the field which is to be analyzed
reader - the reader to wrap

protected final Analyzer.TokenStreamComponents createComponents(String fieldName)

Description copied from class: Analyzer
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.

Specified by:
createComponents in class Analyzer

Parameters:
fieldName - the name of the fields content passed to the Analyzer.TokenStreamComponents sink as a reader

Returns:
the Analyzer.TokenStreamComponents for this analyzer

protected final TokenStream normalize(String fieldName, TokenStream in)

Description copied from class: Analyzer
Wrap the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).

Overrides:
normalize in class Analyzer

public int getPositionIncrementGap(String fieldName)

Description copied from class: Analyzer
Invoked before indexing an IndexableField instance if terms have already been added to that field.

Overrides:
getPositionIncrementGap in class Analyzer

Parameters:
fieldName - IndexableField name being indexed

Returns:
position increment gap, added to the next token emitted from Analyzer.tokenStream(String,Reader). This value must be >= 0.

public int getOffsetGap(String fieldName)

Description copied from class: Analyzer
Just like Analyzer.getPositionIncrementGap(java.lang.String), except for Token offsets instead. By default this returns 1. This method is only called if the field produced at least one token for indexing.

Overrides:
getOffsetGap in class Analyzer

Parameters:
fieldName - the field just indexed

Returns:
offset gap, added to the next token emitted from Analyzer.tokenStream(String,Reader). This value must be >= 0.

public final Reader initReader(String fieldName, Reader reader)

Description copied from class: Analyzer
Override this if you want to add a CharFilter chain. The default implementation returns reader unchanged.

Overrides:
initReader in class Analyzer

Parameters:
fieldName - IndexableField name being indexed
reader - original Reader

protected final Reader initReaderForNormalization(String fieldName, Reader reader)

Description copied from class: Analyzer
Wrap the given Reader with CharFilters that make sense for normalization. This is typically a subset of the CharFilters that are applied in Analyzer.initReader(String, Reader). This is used by Analyzer.normalize(String, String).

Overrides:
initReaderForNormalization in class Analyzer
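The wrapReader hook described above can be sketched with a CharFilter. The wrapper class below is hypothetical, not part of Lucene; HTMLStripCharFilter is a real Lucene CharFilter that removes HTML markup before the wrapped analyzer's tokenizer sees the text.

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;
import org.apache.lucene.analysis.charfilter.HTMLStripCharFilter;

// Hypothetical example class, not part of Lucene.
public class HtmlStrippingAnalyzer extends AnalyzerWrapper {
  private final Analyzer delegate;

  public HtmlStrippingAnalyzer(Analyzer delegate) {
    super(delegate.getReuseStrategy());
    this.delegate = delegate;
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return delegate;
  }

  @Override
  protected Reader wrapReader(String fieldName, Reader reader) {
    // Insert a CharFilter in front of the wrapped analyzer's tokenizer;
    // this is the Reader that initReader(String, Reader) hands to the chain.
    return new HTMLStripCharFilter(reader);
  }
}
```

A parallel override of wrapReaderForNormalization would apply the same stripping when terms are normalized at query time via Analyzer.normalize(String, String).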
protected final AttributeFactory attributeFactory(String fieldName)

Description copied from class: Analyzer
Return the AttributeFactory to be used for analysis and normalization on the given FieldName. The default implementation returns TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY.

Overrides:
attributeFactory in class Analyzer
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.