Interface Extraction

  • All Superinterfaces:
    Iterable<HString>, Serializable

    public interface Extraction
    extends Serializable, Iterable<HString>

    An extraction is the output generated by an Extractor. Extractions provide access to HString and String representations of the extracted content. Note that how the results are constructed depends on the extraction technique. For example, some extractions provide only fragment (i.e. non-attached) HStrings because of the way extraction is performed.

    Author:
    David B. Bracewell
    • Method Detail

      • fromCounter

        static Extraction fromCounter(@NonNull Counter<String> counter)
        Generates a Counter-based Extraction. Counter-based extractions do not have access to the original HString objects and thus generate fragments when needed.
        Parameters:
        counter - the counts of the extractions, keyed by their String representations.
        Returns:
        the Extraction
      • fromHStringList

        static Extraction fromHStringList(@NonNull List<HString> list,
                                          @NonNull SerializableFunction<HString,String> toString)
        Generates an HString-backed extraction with a provided function for converting each HString into its String representation. Methods on the extraction that return Strings first map each underlying HString using the provided toString function.
        Parameters:
        list - the list of extracted HString
        toString - the function to use to map the extracted HString into String representations.
        Returns:
        the Extraction
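
        The mapping step described above can be illustrated without the library. The following is a minimal sketch, not Hermes code: Span is a hypothetical stand-in for HString, java.util.function.Function stands in for SerializableFunction, and the strings helper is invented for illustration. It only shows the idea that every String result passes through the supplied toString function first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

public class ToStringMappingSketch {

    // Hypothetical stand-in for HString: a piece of text plus its character offsets.
    static final class Span {
        final String text;
        final int start;
        final int end;

        Span(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    // Maps every span through the supplied function before returning Strings
    // (an assumption about the behaviour, based only on the description above).
    static List<String> strings(List<Span> spans, Function<Span, String> toString) {
        List<String> out = new ArrayList<>();
        for (Span span : spans) {
            out.add(toString.apply(span));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(
                new Span("New York", 0, 8),
                new Span("new york", 20, 28));
        // Lower-casing gives both mentions the same String representation.
        System.out.println(strings(spans, s -> s.text.toLowerCase(Locale.ROOT)));
        // prints [new york, new york]
    }
}
```

        Choosing a normalizing function such as lower-casing means that differently cased mentions collapse into one key when the Strings are later counted.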
      • fromHStringList

        static Extraction fromHStringList(@NonNull List<HString> list,
                                          @NonNull SerializableFunction<HString,String> toString,
                                          ValueCalculator calculator)
        Generates an HString-backed extraction with a provided function for converting each HString into its String representation. Methods on the extraction that return Strings first map each underlying HString using the provided toString function. When a counter representation is needed, the provided ValueCalculator is used to transform the values.
        Parameters:
        list - the list of extracted HString
        toString - the function to use to map the extracted HString into String representations.
        calculator - the methodology for counting HString.
        Returns:
        the Extraction
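
        To make the idea of transforming counter values concrete, here is a minimal sketch that does not use Hermes. The relativeFrequency method is a hypothetical stand-in for whatever a ValueCalculator does (the actual calculators offered by the library are not described here), and a plain Map stands in for Counter<String>.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ValueTransformSketch {

    // Raw counts: how many times each String representation occurs.
    static Map<String, Double> rawCounts(List<String> items) {
        Map<String, Double> counts = new LinkedHashMap<>();
        for (String item : items) {
            counts.merge(item, 1.0, Double::sum);
        }
        return counts;
    }

    // Hypothetical value transform: divide each count by the total.
    // This is an illustrative choice, not a documented ValueCalculator.
    static Map<String, Double> relativeFrequency(Map<String, Double> counts) {
        double total = 0.0;
        for (double value : counts.values()) {
            total += value;
        }
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : counts.entrySet()) {
            out.put(entry.getKey(), total == 0.0 ? 0.0 : entry.getValue() / total);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> raw = rawCounts(Arrays.asList("cat", "dog", "cat", "cat"));
        System.out.println(raw);                    // prints {cat=3.0, dog=1.0}
        System.out.println(relativeFrequency(raw)); // prints {cat=0.75, dog=0.25}
    }
}
```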
      • fromHStringList

        static Extraction fromHStringList(@NonNull List<HString> list)
        Generates an HString-backed extraction that uses HString.toString to map to Strings and returns raw counts for counter-based methods.
        Parameters:
        list - the list of extracted HString
        Returns:
        the Extraction
      • fromStringList

        static Extraction fromStringList(@NonNull List<String> list)
        Generates a String-backed extraction. String-based extractions do not have access to the original HString objects and thus generate fragments when needed.
        Parameters:
        list - the list of extracted Strings
        Returns:
        the Extraction
      • fromStringList

        static Extraction fromStringList(@NonNull List<String> list,
                                         ValueCalculator calculator)
        Generates a String-backed extraction. String-based extractions do not have access to the original HString objects and thus generate fragments when needed. When a counter representation is needed, the provided ValueCalculator is used to transform the values.
        Parameters:
        list - the list of extracted Strings
        calculator - the methodology for counting Strings.
        Returns:
        the Extraction
      • count

        Counter<String> count()
        Generates a count over the String representations of the extractions.
        Returns:
        the counter
      • size

        int size()
        The number of items extracted.
        Returns:
        the number of extracted items
      • string

        Iterable<String> string()
        An iterable over the String representations of the extractions.
        Returns:
        the iterable
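
        How count(), size(), and string() relate to one another can be sketched with a small list-backed stand-in. This is not the Hermes implementation: ListBackedExtractionSketch is a hypothetical class, a Map replaces Counter<String>, and the assumption that size() includes duplicates while count() aggregates them holds for this stand-in only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ListBackedExtractionSketch {

    private final List<String> items;

    ListBackedExtractionSketch(List<String> items) {
        this.items = items;
    }

    // One entry per extracted item, duplicates included (stand-in assumption).
    int size() {
        return items.size();
    }

    // Read-only view over the String representations.
    Iterable<String> string() {
        return Collections.unmodifiableList(items);
    }

    // Aggregates duplicates into per-String counts.
    Map<String, Integer> count() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String item : items) {
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        ListBackedExtractionSketch extraction = new ListBackedExtractionSketch(
                Arrays.asList("ice cream", "pizza", "ice cream"));
        System.out.println(extraction.size());   // prints 3
        System.out.println(extraction.count());  // prints {ice cream=2, pizza=1}
        for (String s : extraction.string()) {
            System.out.println(s);
        }
    }
}
```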