CustomAnalyzer (Lucene 8.9.0 API)

- org.apache.lucene.analysis.Analyzer
- - org.apache.lucene.analysis.custom.CustomAnalyzer

All Implemented Interfaces:

Closeable, AutoCloseable
```
public final class CustomAnalyzer
extends Analyzer
```
A general-purpose Analyzer that can be created with a builder-style API. Under the hood it uses the factory classes TokenizerFactory, TokenFilterFactory, and CharFilterFactory.
You can create an instance of this Analyzer using the builder by passing the SPI names (as defined by the ServiceLoader interface) to it:
```
 Analyzer ana = CustomAnalyzer.builder(Paths.get("/path/to/config/dir"))
   .withTokenizer(StandardTokenizerFactory.NAME)
   .addTokenFilter(LowerCaseFilterFactory.NAME)
   .addTokenFilter(StopFilterFactory.NAME, "ignoreCase", "false", "words", "stopwords.txt", "format", "wordset")
   .build();
 
```
The parameters passed to components are also used by Apache Solr and are documented on their corresponding factory classes. Refer to documentation of subclasses of TokenizerFactory, TokenFilterFactory, and CharFilterFactory.
This is the same as the above:
```
 Analyzer ana = CustomAnalyzer.builder(Paths.get("/path/to/config/dir"))
   .withTokenizer("standard")
   .addTokenFilter("lowercase")
   .addTokenFilter("stop", "ignoreCase", "false", "words", "stopwords.txt", "format", "wordset")
   .build();
 
```
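To see how the configuration directory is used, the following sketch writes a throwaway `stopwords.txt` into a temporary directory and builds the same analyzer chain against it. It assumes Lucene 8.9.0 with `lucene-core` and `lucene-analyzers-common` on the classpath; the class name `ConfigDirStopwordsExample`, the field name `"body"`, and the stop word list are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.custom.CustomAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class ConfigDirStopwordsExample {

  /** Builds an analyzer whose stop filter reads stopwords.txt from configDir. */
  static Analyzer buildAnalyzer(Path configDir) throws IOException {
    return CustomAnalyzer.builder(configDir)
        .withTokenizer("standard")
        .addTokenFilter("lowercase")
        .addTokenFilter("stop", "ignoreCase", "false", "words", "stopwords.txt", "format", "wordset")
        .build();
  }

  /** Runs the analyzer over the text and collects the emitted terms. */
  static List<String> analyze(Analyzer analyzer, String text) throws IOException {
    List<String> terms = new ArrayList<>();
    // "body" is an arbitrary field name; CustomAnalyzer uses the same chain for every field.
    try (TokenStream ts = analyzer.tokenStream("body", text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();                      // mandatory before the first incrementToken()
      while (ts.incrementToken()) {
        terms.add(term.toString());
      }
      ts.end();                        // finalizes offsets and end-of-stream state
    }
    return terms;
  }

  /** Writes a small stop word list (one word per line), then analyzes the text with it. */
  static List<String> demo(String text) throws IOException {
    Path configDir = Files.createTempDirectory("analyzer-config");
    Files.write(configDir.resolve("stopwords.txt"), Arrays.asList("the", "a"), StandardCharsets.UTF_8);
    try (Analyzer analyzer = buildAnalyzer(configDir)) {
      return analyze(analyzer, text);
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(demo("The quick brown fox"));
  }
}
```

Because the lowercase filter runs before the stop filter, `"The"` is matched against the lower-case entry `the` even though `ignoreCase` is `false`.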
The list of names to be used for components can be looked up through: TokenizerFactory.availableTokenizers(), TokenFilterFactory.availableTokenFilters(), and CharFilterFactory.availableCharFilters().
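As a minimal sketch of that lookup, the class below collects the registered names into sorted sets and prints them. It assumes Lucene 8.9.0, where the factory base classes live in `org.apache.lucene.analysis.util` (they moved to a different package in later major versions), and that `lucene-analyzers-common` is on the classpath; the class name `ListAnalysisComponents` is hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

import org.apache.lucene.analysis.util.CharFilterFactory;
import org.apache.lucene.analysis.util.TokenFilterFactory;
import org.apache.lucene.analysis.util.TokenizerFactory;

public class ListAnalysisComponents {

  /** SPI names of all tokenizer factories found on the classpath, sorted. */
  static Set<String> tokenizerNames() {
    return new TreeSet<>(TokenizerFactory.availableTokenizers());
  }

  /** SPI names of all token filter factories found on the classpath, sorted. */
  static Set<String> tokenFilterNames() {
    return new TreeSet<>(TokenFilterFactory.availableTokenFilters());
  }

  /** SPI names of all char filter factories found on the classpath, sorted. */
  static Set<String> charFilterNames() {
    return new TreeSet<>(CharFilterFactory.availableCharFilters());
  }

  public static void main(String[] args) {
    System.out.println("Tokenizers:    " + tokenizerNames());
    System.out.println("Token filters: " + tokenFilterNames());
    System.out.println("Char filters:  " + charFilterNames());
  }
}
```

The exact contents depend on which analysis modules are on the classpath, so the printed sets will differ between installations.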
You can create conditional branches in the analyzer by using CustomAnalyzer.Builder.when(String, String...) and CustomAnalyzer.Builder.whenTerm(Predicate):
```
 Analyzer ana = CustomAnalyzer.builder()
    .withTokenizer("standard")
    .addTokenFilter("lowercase")
    .whenTerm(t -> t.length() > 10)
      .addTokenFilter("reversestring")
    .endwhen()
    .build();
 
```
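To observe the effect of the conditional branch, the sketch below builds the same chain and consumes its token stream. It assumes Lucene 8.9.0 with `lucene-core` and `lucene-analyzers-common` on the classpath; the class name `ConditionalReverseExample`, the field name `"body"`, and the sample text are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.custom.CustomAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class ConditionalReverseExample {

  /** Lowercases every term, then reverses only terms longer than 10 characters. */
  static Analyzer buildAnalyzer() throws IOException {
    return CustomAnalyzer.builder()
        .withTokenizer("standard")
        .addTokenFilter("lowercase")
        .whenTerm(t -> t.length() > 10)
          .addTokenFilter("reversestring")
        .endwhen()
        .build();
  }

  /** Runs the analyzer over the text and collects the emitted terms. */
  static List<String> analyze(String text) throws IOException {
    List<String> terms = new ArrayList<>();
    try (Analyzer analyzer = buildAnalyzer();
         TokenStream ts = analyzer.tokenStream("body", text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        terms.add(term.toString());
      }
      ts.end();
    }
    return terms;
  }

  public static void main(String[] args) throws IOException {
    // Only the 20-character first term satisfies the predicate; "is" and "fun" pass through.
    System.out.println(analyze("Internationalization is fun"));
  }
}
```

Filters added between `whenTerm(...)` and `endwhen()` apply only to terms for which the predicate returns true; filters added after `endwhen()` apply to every term again.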
Since:

5.0.0

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`CustomAnalyzer.Builder` Builder for `CustomAnalyzer`.
`static class`	`CustomAnalyzer.ConditionBuilder` Factory class for a `ConditionalTokenFilter`

Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents

Field Summary
- Fields inherited from class org.apache.lucene.analysis.Analyzer
  GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY

Method Summary

All Methods Static Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`static CustomAnalyzer.Builder`	`builder()` Returns a builder for custom analyzers that loads all resources from Lucene's classloader.
`static CustomAnalyzer.Builder`	`builder(Path configDir)` Returns a builder for custom analyzers that loads all resources from the given file system base directory.
`static CustomAnalyzer.Builder`	`builder(ResourceLoader loader)` Returns a builder for custom analyzers that loads all resources using the given `ResourceLoader`.
`protected Analyzer.TokenStreamComponents`	`createComponents(String fieldName)`
`List<CharFilterFactory>`	`getCharFilterFactories()` Returns the list of char filters that are used in this analyzer.
`int`	`getOffsetGap(String fieldName)`
`int`	`getPositionIncrementGap(String fieldName)`
`List<TokenFilterFactory>`	`getTokenFilterFactories()` Returns the list of token filters that are used in this analyzer.
`TokenizerFactory`	`getTokenizerFactory()` Returns the tokenizer that is used in this analyzer.
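`protected Reader`	`initReader(String fieldName, Reader reader)`
`protected Reader`	`initReaderForNormalization(String fieldName, Reader reader)`
`protected TokenStream`	`normalize(String fieldName, TokenStream in)`
`String`	`toString()`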

Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getReuseStrategy, getVersion, normalize, setVersion, tokenStream, tokenStream

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Method Detail
  - builder
```
public static CustomAnalyzer.Builder builder()
```
    Returns a builder for custom analyzers that loads all resources from Lucene's classloader. All path names given must be absolute with package prefixes.
  - builder
```
public static CustomAnalyzer.Builder builder(Path configDir)
```
    Returns a builder for custom analyzers that loads all resources from the given file system base directory. Place, e.g., stop word files there. Files that are not in the given directory are loaded from Lucene's classloader.
  - builder
```
public static CustomAnalyzer.Builder builder(ResourceLoader loader)
```
    Returns a builder for custom analyzers that loads all resources using the given ResourceLoader.
  - initReader
```
protected Reader initReader(String fieldName,
                            Reader reader)
```
    Overrides:
    
    initReader in class Analyzer
  - initReaderForNormalization
```
protected Reader initReaderForNormalization(String fieldName,
                                            Reader reader)
```
    Overrides:
    
    initReaderForNormalization in class Analyzer
  - createComponents
```
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
```
    Specified by:
    
    createComponents in class Analyzer
  - normalize
```
protected TokenStream normalize(String fieldName,
                                TokenStream in)
```
    Overrides:
    
    normalize in class Analyzer
  - getPositionIncrementGap
```
public int getPositionIncrementGap(String fieldName)
```
    Overrides:
    
    getPositionIncrementGap in class Analyzer
  - getOffsetGap
```
public int getOffsetGap(String fieldName)
```
    Overrides:
    
    getOffsetGap in class Analyzer
  - getCharFilterFactories
```
public List<CharFilterFactory> getCharFilterFactories()
```
    Returns the list of char filters that are used in this analyzer.
  - getTokenizerFactory
```
public TokenizerFactory getTokenizerFactory()
```
    Returns the tokenizer that is used in this analyzer.
  - getTokenFilterFactories
```
public List<TokenFilterFactory> getTokenFilterFactories()
```
    Returns the list of token filters that are used in this analyzer.
  - toString
```
public String toString()
```
    Overrides:
    
    toString in class Object
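
The accessors `getCharFilterFactories()`, `getTokenizerFactory()`, and `getTokenFilterFactories()` can be used to inspect an analyzer after it has been built. The sketch below lists the factory class names in the order the chain applies them. It assumes Lucene 8.9.0 with `lucene-core` and `lucene-analyzers-common` on the classpath; the class name `InspectAnalyzerExample` and the chosen components are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.custom.CustomAnalyzer;
import org.apache.lucene.analysis.util.CharFilterFactory;
import org.apache.lucene.analysis.util.TokenFilterFactory;

public class InspectAnalyzerExample {

  /** Builds a small chain: HTML stripping, standard tokenization, lowercasing. */
  static CustomAnalyzer buildAnalyzer() throws IOException {
    return CustomAnalyzer.builder()
        .addCharFilter("htmlstrip")
        .withTokenizer("standard")
        .addTokenFilter("lowercase")
        .build();
  }

  /** Lists factory class names: char filters first, then the tokenizer, then token filters. */
  static List<String> describe(CustomAnalyzer analyzer) {
    List<String> names = new ArrayList<>();
    for (CharFilterFactory factory : analyzer.getCharFilterFactories()) {
      names.add(factory.getClass().getSimpleName());
    }
    names.add(analyzer.getTokenizerFactory().getClass().getSimpleName());
    for (TokenFilterFactory factory : analyzer.getTokenFilterFactories()) {
      names.add(factory.getClass().getSimpleName());
    }
    return names;
  }

  public static void main(String[] args) throws IOException {
    try (CustomAnalyzer analyzer = buildAnalyzer()) {
      System.out.println(describe(analyzer));
      // toString() is overridden by CustomAnalyzer; printed here without assuming its format.
      System.out.println(analyzer);
    }
  }
}
```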
