Class ViterbiAnnotator

  • All Implemented Interfaces:
    Serializable
    Direct Known Subclasses:
    FuzzyLexiconAnnotator

    public abstract class ViterbiAnnotator
    extends SentenceLevelAnnotator

    An abstract base annotator that uses the Viterbi algorithm to find text items in a document. Child classes implement the scoreSpan and createAndAttachAnnotation methods to score individual spans and to attach the resulting annotations to the document. Child implementations may also override combineScore to change how scores are combined; by default, they are multiplied.
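
The description above can be illustrated with a minimal, self-contained sketch of Viterbi-style segmentation over a token list. It is not this library's implementation: the class ViterbiSketch, the toy lexicon, and the toyScore function are hypothetical stand-ins, and only the ideas of a maximum span size, a per-span score, and multiplicative score combination come from the documentation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration only; none of these names belong to the documented API.
final class ViterbiSketch {

    // Toy lexicon (assumption): known spans and their scores.
    private static final Map<String, Double> LEXICON =
            Map.of("new york", 0.9, "new", 0.5, "york", 0.5);

    // Toy span scorer (assumption): lexicon score if known,
    // a small score for unknown single tokens, zero for unknown multi-token spans.
    static double toyScore(List<String> span) {
        Double known = LEXICON.get(String.join(" ", span));
        if (known != null) {
            return known;
        }
        return span.size() == 1 ? 0.1 : 0.0;
    }

    // Mirrors the documented default: scores are multiplied.
    static double combineScore(double currentScore, double spanScore) {
        return currentScore * spanScore;
    }

    // Returns the highest-scoring split of tokens into spans of at most maxSpanSize tokens.
    static List<List<String>> segment(List<String> tokens, int maxSpanSize,
                                      ToDoubleFunction<List<String>> scoreSpan) {
        int n = tokens.size();
        double[] best = new double[n + 1]; // best[i]: best score for tokens[0, i)
        int[] back = new int[n + 1];       // back[i]: start of the last span ending at i
        best[0] = 1.0;                     // identity for multiplication
        for (int end = 1; end <= n; end++) {
            best[end] = Double.NEGATIVE_INFINITY;
            back[end] = end - 1;           // fall back to a single-token span
            for (int start = Math.max(0, end - maxSpanSize); start < end; start++) {
                double spanScore = scoreSpan.applyAsDouble(tokens.subList(start, end));
                double candidate = combineScore(best[start], spanScore);
                if (candidate > best[end]) {
                    best[end] = candidate;
                    back[end] = start;
                }
            }
        }
        // Follow the back-pointers from the end of the sentence to recover the spans.
        List<List<String>> spans = new ArrayList<>();
        for (int end = n; end > 0; end = back[end]) {
            spans.add(new ArrayList<>(tokens.subList(back[end], end)));
        }
        Collections.reverse(spans);
        return spans;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("i", "love", "new", "york");
        // prints [[i], [love], [new, york]]
        System.out.println(segment(tokens, 2, ViterbiSketch::toyScore));
    }
}
```

In this sketch the two-token span "new york" wins over "new" followed by "york" because 0.9 is larger than 0.5 × 0.5. A real subclass would score HString spans and attach annotations rather than return token lists.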

    Author:
    David B. Bracewell
    See Also:
    Serialized Form
    • Constructor Detail

      • ViterbiAnnotator

        protected ViterbiAnnotator​(int maxSpanSize)
        Creates an annotator with the given maximum span size.
        Parameters:
        maxSpanSize - The maximum length of an identified span
    • Method Detail

      • combineScore

        protected double combineScore​(double currentScore,
                                      double spanScore)
        Combines the score of a possible span with that of the spans up to this point to determine the optimal segmentation.
        Parameters:
        currentScore - The score of the sentence so far
        spanScore - The score of the span under consideration
        Returns:
        The combination of the current and span scores
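
One reason a subclass might change how scores are combined is numerical: a long product of small scores can underflow to zero, while a sum of logarithms does not. The sketch below only compares the two combination rules in isolation. The class CombineScoreSketch and its methods are hypothetical, and whether the base class's starting score suits an additive scheme is an assumption that would need checking before overriding combineScore this way.

```java
// Hypothetical illustration only; not part of the documented API.
final class CombineScoreSketch {

    // The documented default rule: multiply the running score by the span score.
    static double multiply(double currentScore, double spanScore) {
        return currentScore * spanScore;
    }

    // An alternative rule (assumption): both arguments are natural-log scores, so they are added.
    static double addLogs(double currentLogScore, double spanLogScore) {
        return currentLogScore + spanLogScore;
    }

    // Combines `count` spans that each have raw score `score` by multiplication, starting from 1.0.
    static double productOf(int count, double score) {
        double current = 1.0;
        for (int i = 0; i < count; i++) {
            current = multiply(current, score);
        }
        return current;
    }

    // Combines the same spans in log space, starting from 0.0 (the log of 1.0).
    static double logSumOf(int count, double score) {
        double current = 0.0;
        for (int i = 0; i < count; i++) {
            current = addLogs(current, Math.log(score));
        }
        return current;
    }

    public static void main(String[] args) {
        // 0.1 multiplied 400 times is below the smallest positive double, so the product is 0.0.
        System.out.println(productOf(400, 0.1));
        // The log-space total stays finite, roughly -921.
        System.out.println(logSumOf(400, 0.1));
    }
}
```

Because the logarithm is monotonic, ranking candidates by summed log scores selects the same segmentation as ranking by the product, as long as all span scores are positive.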
      • createAndAttachAnnotation

        protected abstract void createAndAttachAnnotation​(Document document,
                                                          LexiconMatch span)
        Given a possible span, determines whether an annotation should be created and, if so, creates and attaches it.
        Parameters:
        document - the document
        span - The span to check
      • scoreSpan

        protected abstract LexiconEntry scoreSpan​(HString span)
        Scores the given span.
        Parameters:
        span - The span
        Returns:
        The score of the span