Interface Lemmatizer

  • All Known Implementing Classes:
    ENLemmatizer

    public interface Lemmatizer

    Defines the interface for lemmatizing tokens.

    Author:
    David B. Bracewell
    • Method Detail

      • lemmatize

        default String lemmatize(@NonNull String string)
        Determines the best lemma for a string
        Parameters:
        string - the string to lemmatize
        Returns:
        the lemmatized version of the string
      • lemmatize

        default String lemmatize(@NonNull String string,
                                 @NonNull PartOfSpeech partOfSpeech)
        Determines the best lemma for a string given a part of speech
        Parameters:
        string - the string to lemmatize
        partOfSpeech - the part of speech
        Returns:
        the lemmatized version of the string
      • allPossibleLemmas

        List<String> allPossibleLemmas(String string,
                                       PartOfSpeech partOfSpeech)
        Gets all possible lemmas for a string given a part of speech.
        Parameters:
        string - the string to lemmatize
        partOfSpeech - the part of speech
        Returns:
        all possible lemmas of the string
      • allPossibleLemmasAndPrefixes

        Trie<String> allPossibleLemmasAndPrefixes(String string,
                                                  PartOfSpeech partOfSpeech)
        Gets all possible lemmas and prefixes for a string given a part of speech, as a trie.
        Parameters:
        string - the string to lemmatize
        partOfSpeech - the part of speech
        Returns:
        a trie of the possible lemmas and prefixes
      • canLemmatize

        boolean canLemmatize(String input,
                             PartOfSpeech partOfSpeech)
        Determines whether the input can be lemmatized given a part of speech.
        Parameters:
        input - the string to check
        partOfSpeech - the part of speech
        Returns:
        true if the input can be lemmatized, false otherwise
      • lemmatize

        default String lemmatize(@NonNull HString fragment)
        Lemmatizes a token.
        Parameters:
        fragment - the fragment to lemmatize
        Returns:
        the lemmatized version of the token
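
The relationship between the abstract and default methods listed above can be illustrated with a small, self-contained sketch. It does not use the library itself: ToyLemmatizer, ToyPartOfSpeech, and DictionaryLemmatizer are hypothetical stand-ins, the HString and Trie overloads are left out, and the rule that the "best" lemma is the first candidate (with the lower-cased input as a fallback) is an assumption made for illustration, not the documented behaviour of Lemmatizer or ENLemmatizer.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the library's PartOfSpeech type.
enum ToyPartOfSpeech { NOUN, VERB }

// Hypothetical, simplified mirror of the documented interface.
interface ToyLemmatizer {
    List<String> allPossibleLemmas(String string, ToyPartOfSpeech partOfSpeech);

    boolean canLemmatize(String input, ToyPartOfSpeech partOfSpeech);

    // Assumption: the "best" lemma is the first candidate; if there is
    // none, fall back to the lower-cased input.
    default String lemmatize(String string, ToyPartOfSpeech partOfSpeech) {
        List<String> lemmas = allPossibleLemmas(string, partOfSpeech);
        return lemmas.isEmpty() ? string.toLowerCase(Locale.ROOT) : lemmas.get(0);
    }
}

// Toy implementation backed by two tiny hand-written lookup tables.
final class DictionaryLemmatizer implements ToyLemmatizer {
    private final Map<String, List<String>> verbs = Map.of(
            "running", List.of("run"),
            "saw", List.of("see", "saw"));
    private final Map<String, List<String>> nouns = Map.of(
            "geese", List.of("goose"),
            "saw", List.of("saw"));

    private Map<String, List<String>> table(ToyPartOfSpeech partOfSpeech) {
        return partOfSpeech == ToyPartOfSpeech.VERB ? verbs : nouns;
    }

    @Override
    public List<String> allPossibleLemmas(String string, ToyPartOfSpeech partOfSpeech) {
        return table(partOfSpeech).getOrDefault(string.toLowerCase(Locale.ROOT), List.of());
    }

    @Override
    public boolean canLemmatize(String input, ToyPartOfSpeech partOfSpeech) {
        return table(partOfSpeech).containsKey(input.toLowerCase(Locale.ROOT));
    }
}

class LemmatizerSketch {
    public static void main(String[] args) {
        ToyLemmatizer lemmatizer = new DictionaryLemmatizer();
        // The same surface form can map to different lemmas per part of speech.
        System.out.println(lemmatizer.lemmatize("saw", ToyPartOfSpeech.VERB)); // prints "see"
        System.out.println(lemmatizer.lemmatize("saw", ToyPartOfSpeech.NOUN)); // prints "saw"
        // Guard with canLemmatize before trusting the result for unknown input.
        System.out.println(lemmatizer.canLemmatize("xyzzy", ToyPartOfSpeech.NOUN)); // prints "false"
    }
}
```

In this sketch an implementer only supplies the lookup methods and inherits lemmatize from the interface, which is one plausible reason the documented lemmatize overloads are declared as default methods.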