Class SuffixingNGramTokenFilter

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable

    final class SuffixingNGramTokenFilter
    extends TokenFilter
    • Field Detail

      • suffix

        private final java.lang.String suffix
      • maxTokenLength

        private final int maxTokenLength
      • anyToken

        private final java.lang.String anyToken
      • curTermBuffer

        private char[] curTermBuffer
      • curTermLength

        private int curTermLength
      • curCodePointCount

        private int curCodePointCount
      • curGramSize

        private int curGramSize
      • curPos

        private int curPos
      • curPosInc

        private int curPosInc
      • curPosLen

        private int curPosLen
      • tokStart

        private int tokStart
      • tokEnd

        private int tokEnd
    • Constructor Detail

      • SuffixingNGramTokenFilter

        public SuffixingNGramTokenFilter(TokenStream input,
                                         java.lang.String suffix,
                                         java.lang.String wildcardToken,
                                         int maxTokenLength)
        Creates a SuffixingNGramTokenFilter.
        Parameters:
        input - TokenStream holding the input to be tokenized
        suffix - a string appended to every n-gram
        wildcardToken - a token to emit in place of n-grams when the input token is longer than maxTokenLength
        maxTokenLength - tokens longer than this are not split into n-grams
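The parameter descriptions above can be illustrated with a small standalone sketch. It does not use Lucene and is not the filter's actual implementation: the class SuffixingNGramSketch and the method suffixedNGrams are hypothetical names, and the sketch assumes that every contiguous substring of a token counts as an n-gram, that length is measured in code points, and that the wildcard token is emitted without the suffix.

```java
import java.util.ArrayList;
import java.util.List;

public class SuffixingNGramSketch {

    // Hypothetical helper, not part of the documented class.
    // Assumption: an n-gram is any contiguous run of code points in the token.
    static List<String> suffixedNGrams(String token, String suffix,
                                       String wildcardToken, int maxTokenLength) {
        List<String> out = new ArrayList<>();
        int codePoints = token.codePointCount(0, token.length());

        // Tokens longer than maxTokenLength are not n-grammed;
        // the wildcard token is emitted instead (assumed unsuffixed).
        if (codePoints > maxTokenLength) {
            out.add(wildcardToken);
            return out;
        }

        for (int start = 0; start < codePoints; start++) {
            // Convert the code point index to a char index so surrogate pairs stay intact.
            int from = token.offsetByCodePoints(0, start);
            for (int size = 1; start + size <= codePoints; size++) {
                int to = token.offsetByCodePoints(from, size);
                out.add(token.substring(from, to) + suffix);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [cXX, caXX, catXX, aXX, atXX, tXX]
        System.out.println(suffixedNGrams("cat", "XX", "__WILDCARD__", 10));
        // prints [__WILDCARD__]
        System.out.println(suffixedNGrams("elephant", "XX", "__WILDCARD__", 5));
    }
}
```

The order in which the real filter emits n-grams, and whether it also emits the wildcard token for short tokens, is not stated on this page, so the ordering here is only one plausible choice.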
    • Method Detail

      • incrementToken

        public final boolean incrementToken()
                                     throws java.io.IOException
        Advances the stream to the next token. Returns false at end of stream (EOS).
        Specified by:
        incrementToken in class TokenStream
        Returns:
        false for end of stream; true otherwise
        Throws:
        java.io.IOException
      • reset

        public void reset()
                   throws java.io.IOException
        Description copied from class: TokenFilter
        This method is called by a consumer before it begins consumption using TokenStream.incrementToken().

        Resets this stream to a clean state. Stateful implementations must implement this method so that they can be reused, just as if they had been created fresh.

        If you override this method, always call super.reset(), otherwise some internal state will not be correctly reset (e.g., Tokenizer will throw IllegalStateException on further usage).

        NOTE: The default implementation chains the call to the input TokenStream, so be sure to call super.reset() when overriding this method.

        Overrides:
        reset in class TokenFilter
        Throws:
        java.io.IOException
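The reset() and incrementToken() contract described above can be modelled with a toy example. The classes below (ToySource, ToySuffixFilter) are hypothetical stand-ins written for illustration; they are not Lucene's TokenStream or TokenFilter, and they only mimic the documented behaviour: the consumer calls reset() before consuming, incrementToken() returns false at end of stream, and a wrapping filter must chain reset() to its input.

```java
import java.util.ArrayList;
import java.util.List;

public class ResetContractSketch {

    /** Toy stand-in for a token source; not Lucene's TokenStream. */
    static class ToySource {
        private final String[] tokens;
        private int next = -1; // -1 means reset() has not been called yet
        private String term;

        ToySource(String... tokens) {
            this.tokens = tokens;
        }

        boolean incrementToken() {
            if (next < 0) {
                throw new IllegalStateException("reset() was not called");
            }
            if (next >= tokens.length) {
                return false; // end of stream
            }
            term = tokens[next++];
            return true;
        }

        String term() {
            return term;
        }

        void reset() {
            next = 0;
        }
    }

    /** Toy filter that wraps a source and appends a suffix to each token. */
    static class ToySuffixFilter {
        private final ToySource input;
        private final String suffix;
        private String term;

        ToySuffixFilter(ToySource input, String suffix) {
            this.input = input;
            this.suffix = suffix;
        }

        boolean incrementToken() {
            if (!input.incrementToken()) {
                return false;
            }
            term = input.term() + suffix;
            return true;
        }

        String term() {
            return term;
        }

        void reset() {
            input.reset(); // plays the role of super.reset() chaining to the input
            term = null;   // clear this filter's own per-token state
        }
    }

    /** Consumer workflow: reset first, then loop until incrementToken() returns false. */
    static List<String> consume(ToySuffixFilter filter) {
        List<String> out = new ArrayList<>();
        filter.reset();
        while (filter.incrementToken()) {
            out.add(filter.term());
        }
        return out;
    }

    public static void main(String[] args) {
        ToySuffixFilter filter = new ToySuffixFilter(new ToySource("a", "b"), "XX");
        System.out.println(consume(filter)); // prints [aXX, bXX]
        System.out.println(consume(filter)); // reuse after reset: prints [aXX, bXX]
    }
}
```

If ToySuffixFilter.reset() omitted the call to input.reset(), the second pass would return an empty list, because the source would still be positioned at the end of its tokens.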