Package com.gengoai.hermes.format.conll
Class IOBFieldProcessor
- java.lang.Object
  - com.gengoai.hermes.format.conll.IOBFieldProcessor
- All Implemented Interfaces: CoNLLColumnProcessor
- Direct Known Subclasses: NamedEntityProcessor, PhraseChunkProcessor, SuperSenseProcessor
public abstract class IOBFieldProcessor extends Object implements CoNLLColumnProcessor
Base processor for IOB (Inside, Outside, Beginning) annotations in CoNLL files.
- Author: David B. Bracewell
Constructor Summary
Constructors
- IOBFieldProcessor(AnnotationType annotationType, AttributeType<?> attributeType)
  Instantiates a new IOBFieldProcessor.
-
Method Summary
Methods
- protected String normalizeTag(String tag)
  Normalize tag string.
- void processInput(Document document, List<CoNLLRow> rows, Map<Tuple2<Integer,Integer>,Long> sentenceIndexToAnnotationId)
  Processes a set of CoNLL rows making up a document.
- String processOutput(HString document, Annotation token, int index)
  Generates output data in CoNLL format.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.gengoai.hermes.format.CoNLLColumnProcessor
getFieldName, updateRow
Constructor Detail
-
IOBFieldProcessor
public IOBFieldProcessor(AnnotationType annotationType, AttributeType<?> attributeType)
Instantiates a new IOBFieldProcessor.
- Parameters:
annotationType - the annotation type
attributeType - the attribute type
-
-
Method Detail
-
normalizeTag
protected String normalizeTag(String tag)
Normalizes the tag string.
- Parameters:
tag - the tag
- Returns:
the normalized tag
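This page does not say what normalization involves. Because the method is protected and the class has several known subclasses, one plausible reading is that it acts as an override hook. The sketch below illustrates that pattern with hypothetical stand-in classes (`BaseProcessor`, `UpperCasingProcessor`, and `typeOf` are assumptions, not part of Hermes), and the upper-casing rule is an example only.

```java
import java.util.Locale;

// Hypothetical illustration of a protected normalization hook.
// None of these classes belong to Hermes; they only mirror the method name.
final class NormalizeHookSketch {

    // Stand-in for a base processor with an overridable normalizeTag.
    static class BaseProcessor {
        // Default behavior assumed here: return the tag unchanged.
        protected String normalizeTag(String tag) {
            return tag;
        }

        // Strips a leading "B-" or "I-" prefix, then applies the hook.
        final String typeOf(String label) {
            boolean prefixed = label.startsWith("B-") || label.startsWith("I-");
            String raw = prefixed ? label.substring(2) : label;
            return normalizeTag(raw);
        }
    }

    // Example subclass: trims and upper-cases tags (an assumed rule).
    static class UpperCasingProcessor extends BaseProcessor {
        @Override
        protected String normalizeTag(String tag) {
            return tag.trim().toUpperCase(Locale.ROOT);
        }
    }

    public static void main(String[] args) {
        System.out.println(new BaseProcessor().typeOf("I-Loc"));        // prints Loc
        System.out.println(new UpperCasingProcessor().typeOf("B-per")); // prints PER
    }
}
```

Keeping the rule in one overridable method lets each subclass adapt tag spellings without touching the row-processing logic.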
-
processInput
public void processInput(Document document, List<CoNLLRow> rows, Map<Tuple2<Integer,Integer>,Long> sentenceIndexToAnnotationId)
Description copied from interface: CoNLLColumnProcessor
Processes a set of CoNLL rows making up a document.
- Specified by:
processInput in interface CoNLLColumnProcessor
- Parameters:
document - the document
rows - the CoNLL rows making up the document
sentenceIndexToAnnotationId - a map from the index of a token in its sentence to the annotation id
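To make the IOB scheme concrete, the following minimal sketch turns a column of per-token labels into labeled spans. It is not the Hermes implementation: `IobDecodeSketch`, `Span`, and `decode` are hypothetical names, labels are plain strings instead of `CoNLLRow` objects, and a stray `I-` label with no open span is leniently treated as the start of a new span.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of IOB decoding; not taken from Hermes.
final class IobDecodeSketch {

    // A labeled span over token indexes [start, end).
    static final class Span {
        final int start;
        final int end;
        final String tag;

        Span(int start, int end, String tag) {
            this.start = start;
            this.end = end;
            this.tag = tag;
        }

        @Override
        public String toString() {
            return tag + "[" + start + "," + end + ")";
        }
    }

    // Converts one label per token ("B-X", "I-X", or "O") into spans.
    static List<Span> decode(List<String> labels) {
        List<Span> spans = new ArrayList<>();
        int start = -1;
        String current = null; // tag of the open span, or null if none
        for (int i = 0; i < labels.size(); i++) {
            String label = labels.get(i);
            boolean begins = label.startsWith("B-");
            boolean inside = label.startsWith("I-");
            String tag = (begins || inside) ? label.substring(2) : null;
            // An I- label extends the open span only if the tags match.
            boolean continues = inside && current != null && current.equals(tag);
            if (!continues) {
                if (current != null) {
                    spans.add(new Span(start, i, current)); // close open span
                }
                current = tag;                 // null for "O"
                start = (tag != null) ? i : -1; // open a new span if tagged
            }
        }
        if (current != null) {
            spans.add(new Span(start, labels.size(), current));
        }
        return spans;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("B-PER", "I-PER", "O", "B-LOC");
        System.out.println(decode(labels)); // prints [PER[0,2), LOC[3,4)]
    }
}
```

In a real processor each decoded span would presumably become an annotation of the configured annotation type, with the tag stored under the configured attribute type.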
-
processOutput
public String processOutput(HString document, Annotation token, int index)
Description copied from interface: CoNLLColumnProcessor
Generates output data in CoNLL format.
- Specified by:
processOutput in interface CoNLLColumnProcessor
- Parameters:
document - the HString representing the document
token - the token
index - the index
- Returns:
the string
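The reverse direction, producing one IOB label per token, can be sketched as below. This is an illustration under stated assumptions, not the Hermes code: `IobEncodeSketch`, `Span`, and `labelFor` are hypothetical names, and spans are given as token index ranges instead of `Annotation` objects.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of IOB encoding; not taken from Hermes.
final class IobEncodeSketch {

    // A labeled span over token indexes [start, end).
    static final class Span {
        final int start;
        final int end;
        final String tag;

        Span(int start, int end, String tag) {
            this.start = start;
            this.end = end;
            this.tag = tag;
        }
    }

    // Returns the IOB label for the token at tokenIndex.
    static String labelFor(int tokenIndex, List<Span> spans) {
        for (Span s : spans) {
            if (tokenIndex == s.start) {
                return "B-" + s.tag; // first token of the span
            }
            if (tokenIndex > s.start && tokenIndex < s.end) {
                return "I-" + s.tag; // later token of the span
            }
        }
        return "O"; // token is outside every span
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(new Span(0, 2, "PER"), new Span(3, 4, "LOC"));
        // Prints, one per line: 0 B-PER, 1 I-PER, 2 O, 3 B-LOC (tab-separated).
        for (int i = 0; i < 4; i++) {
            System.out.println(i + "\t" + labelFor(i, spans));
        }
    }
}
```

Emitting the label per token matches the shape of `processOutput`, which receives a single token and its index and returns a single string.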