Package com.gengoai.hermes.similarity
Class ExtractorBasedSimilarity
- java.lang.Object
  - com.gengoai.hermes.similarity.ExtractorBasedSimilarity
- All Implemented Interfaces:
HStringSimilarity
public class ExtractorBasedSimilarity extends Object implements HStringSimilarity
An implementation of HStringSimilarity that uses an Apollo Similarity measure to determine the similarity between two HStrings, based on the extractions produced by a given Extractor.
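The idea described above (extract terms from each text, then hand the two extractions to a similarity measure) can be illustrated with a minimal, library-free sketch. It does not use the Hermes or Apollo APIs: the class ExtractorSimilaritySketch, its methods, the tiny stop-word list, and the choice of cosine similarity are all assumptions made for illustration, and lemmatization is omitted.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from Hermes or Apollo.
final class ExtractorSimilaritySketch {
    // Assumption: a tiny English stop-word sample standing in for a real list.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "the", "of", "on", "is");

    // Stand-in for an extractor: lowercases, splits on non-letters,
    // drops stop words, and counts the remaining terms.
    static Map<String, Double> extract(String text) {
        Map<String, Double> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                counts.merge(token, 1.0, Double::sum);
            }
        }
        return counts;
    }

    // Stand-in for a similarity measure: cosine similarity of two count vectors.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            normA += e.getValue() * e.getValue();
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        // An empty extraction has no direction, so treat it as dissimilar.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    // Extract from both texts, then apply the measure to the extractions.
    static double calculate(String first, String second) {
        return cosine(extract(first), extract(second));
    }

    public static void main(String[] args) {
        System.out.println(calculate("The cat sat on the mat", "A cat sat"));
    }
}
```

Keeping extraction and measurement as separate steps mirrors the two-argument constructor, where either the measure or the extractor can be swapped without touching the other.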
-
-
Constructor Summary
Constructors
ExtractorBasedSimilarity(@NonNull Similarity measure)
Instantiates a new ExtractorBasedSimilarity using a TermExtractor that ignores stop words and converts HStrings to their lemma form.
ExtractorBasedSimilarity(@NonNull Similarity measure, @NonNull Extractor termExtractor)
Instantiates a new ExtractorBasedSimilarity.
-
Method Summary
double calculate(@NonNull HString first, @NonNull HString second)
Calculates the similarity between the two given HStrings.
void fit(@NonNull DocumentCollection corpus)
In certain cases an HStringSimilarity needs to collect corpus-level statistics to determine similarity.
-
-
-
Constructor Detail
-
ExtractorBasedSimilarity
public ExtractorBasedSimilarity(@NonNull Similarity measure)
Instantiates a new ExtractorBasedSimilarity using a TermExtractor that ignores stop words and converts HStrings to their lemma form.
- Parameters:
measure - the similarity measure to use
-
ExtractorBasedSimilarity
public ExtractorBasedSimilarity(@NonNull Similarity measure, @NonNull Extractor termExtractor)
Instantiates a new ExtractorBasedSimilarity.
- Parameters:
measure - the similarity measure to use
termExtractor - the extractor used to generate the extractions for calculating similarity
-
-
Method Detail
-
calculate
public double calculate(@NonNull HString first, @NonNull HString second)
Description copied from interface: HStringSimilarity
Calculates the similarity between the two given HStrings.
- Specified by:
calculate in interface HStringSimilarity
- Parameters:
first - the first HString
second - the second HString
- Returns:
the similarity between first and second
-
fit
public void fit(@NonNull DocumentCollection corpus)
Description copied from interface: HStringSimilarity
In certain cases an HStringSimilarity needs to collect corpus-level statistics to determine similarity. The fit method allows implementations to perform this logic at the corpus level.
- Specified by:
fit in interface HStringSimilarity
- Parameters:
corpus - the corpus to fit the similarity measure to
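As one illustration of why a similarity might need corpus-level statistics, the following sketch collects document frequencies in a fit step and then uses them as inverse-document-frequency weights when comparing texts. This is an assumption about one possible use of fit, not a description of what ExtractorBasedSimilarity does internally; FittedSimilaritySketch, its methods, and the smoothed IDF formula are hypothetical and do not use the Hermes API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; none of these names come from Hermes or Apollo.
final class FittedSimilaritySketch {
    private final Map<String, Integer> documentFrequency = new HashMap<>();
    private int documentCount = 0;

    // Lowercases, splits on non-letters, and counts terms.
    private static Map<String, Double> counts(String text) {
        Map<String, Double> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1.0, Double::sum);
            }
        }
        return counts;
    }

    // Collects corpus-level statistics: in how many documents each term occurs.
    void fit(List<String> corpus) {
        documentFrequency.clear();
        documentCount = corpus.size();
        for (String document : corpus) {
            for (String term : counts(document).keySet()) {
                documentFrequency.merge(term, 1, Integer::sum);
            }
        }
    }

    // Smoothed inverse document frequency (assumed formula); equals 1.0 before fitting.
    double idf(String term) {
        int df = documentFrequency.getOrDefault(term, 0);
        return Math.log((1.0 + documentCount) / (1.0 + df)) + 1.0;
    }

    // Cosine similarity over IDF-weighted term counts.
    double calculate(String first, String second) {
        Map<String, Double> a = counts(first);
        Map<String, Double> b = counts(second);
        a.replaceAll((term, count) -> count * idf(term));
        b.replaceAll((term, count) -> count * idf(term));
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            normA += e.getValue() * e.getValue();
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }
}
```

In this sketch, fitting on a corpus where "the" appears in every document lowers the score of two texts that share only "the", because the rarer, unshared terms receive more weight.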