Class NPClusteringKeywordExtractor

  • All Implemented Interfaces:
    Extractor, KeywordExtractor, Serializable

    public class NPClusteringKeywordExtractor
    extends Object
    implements KeywordExtractor

    Implementation of the NP Clustering Keyword Extractor presented in:

       Bracewell, David B., Yan, Jiajun, and Ren, Fuji (2008). Single Document Keyword Extraction For Internet News
     Articles. International Journal of Innovative Computing, Information and Control, 4, 905-913.

    Author:
    David B. Bracewell
    See Also:
    Serialized Form
    • Constructor Detail

      • NPClusteringKeywordExtractor

        public NPClusteringKeywordExtractor()
    • Method Detail

      • extract

        public Extraction extract(@NonNull HString source)
        Description copied from interface: Extractor
        Generate an Extraction from the given HString.
        Specified by:
        extract in interface Extractor
        Parameters:
        source - the source text from which we will generate an Extraction
        Returns:
        the Extraction
      • fit

        public void fit(DocumentCollection corpus)
        Description copied from interface: KeywordExtractor
        Some keyword extractors need to collect corpus-level statistics or build a model of what a good keyword looks like. The fit method lets an implementation perform that work over an entire corpus.
        Specified by:
        fit in interface KeywordExtractor
        Parameters:
        corpus - the corpus to fit the extractor to
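
        The fit-then-extract contract described above can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: FitThenExtractSketch and DocFrequencyExtractor are hypothetical stand-ins that work on plain strings, not the library's KeywordExtractor, HString, DocumentCollection, or Extraction types, and the scoring is a plain TF-IDF ranking rather than the NP clustering method of the cited paper. It only shows the division of labour: fit gathers corpus-level statistics once, and extract uses them for each document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types belong to the library.
class FitThenExtractSketch {

    /** Toy extractor: fit() gathers document frequencies, extract() ranks terms by TF-IDF. */
    public static class DocFrequencyExtractor {
        private final Map<String, Integer> docFreq = new HashMap<>();
        private int numDocs = 0;

        /** Corpus-level pass: count how many documents contain each term. */
        public void fit(List<String> corpus) {
            docFreq.clear();
            numDocs = corpus.size();
            for (String doc : corpus) {
                // A set, so each term is counted at most once per document.
                for (String term : new HashSet<>(tokenize(doc))) {
                    docFreq.merge(term, 1, Integer::sum);
                }
            }
        }

        /** Document-level pass: rank the document's distinct terms, best first. */
        public List<String> extract(String document) {
            Map<String, Integer> termFreq = new HashMap<>();
            for (String term : tokenize(document)) {
                termFreq.merge(term, 1, Integer::sum);
            }
            Map<String, Double> score = new HashMap<>();
            for (Map.Entry<String, Integer> e : termFreq.entrySet()) {
                int df = docFreq.getOrDefault(e.getKey(), 0);
                // Smoothed inverse document frequency; zero for terms seen in every document.
                double idf = Math.log((numDocs + 1.0) / (df + 1.0));
                score.put(e.getKey(), e.getValue() * idf);
            }
            List<String> ranked = new ArrayList<>(score.keySet());
            // Highest score first; ties broken alphabetically so the order is deterministic.
            ranked.sort((a, b) -> {
                int byScore = Double.compare(score.get(b), score.get(a));
                return byScore != 0 ? byScore : a.compareTo(b);
            });
            return ranked;
        }

        /** Lower-cases the text and splits it on non-word characters. */
        private static List<String> tokenize(String text) {
            List<String> tokens = new ArrayList<>();
            for (String t : text.toLowerCase().split("\\W+")) {
                if (!t.isEmpty()) {
                    tokens.add(t);
                }
            }
            return tokens;
        }
    }

    public static void main(String[] args) {
        DocFrequencyExtractor extractor = new DocFrequencyExtractor();
        extractor.fit(List.of("the market fell", "the market rose", "the storm hit the coast"));
        System.out.println(extractor.extract("the storm hit the coast"));
    }
}
```

        In this sketch the word "the" occurs in every fitted document, so its inverse document frequency is zero and it ranks last even though it is the most frequent term in the input. An extractor that needs no corpus statistics could leave its fit method empty.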