capreolus.tokenizer
Submodules
Package Contents
Classes
class capreolus.tokenizer.Tokenizer(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.ModuleBase
Base class for Tokenizer modules. The purpose of a Tokenizer is to tokenize strings of text (e.g., as required by an Extractor).

Modules should provide:
- a tokenize(strings) method that takes a list of strings and returns tokenized versions
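
The tokenize(strings) contract described above can be illustrated with a minimal sketch. The class below is a hypothetical stand-in written for this example: it does not subclass capreolus.tokenizer.Tokenizer, and the name WhitespaceTokenizer, the lowercasing step, and the list-of-token-lists return shape are all assumptions rather than documented capreolus behavior.

```python
from typing import List


class WhitespaceTokenizer:
    """Hypothetical example class; not part of capreolus."""

    def tokenize(self, strings: List[str]) -> List[List[str]]:
        # Assumed behavior: lowercase each input string and split it on
        # runs of whitespace, returning one token list per input string.
        return [s.lower().split() for s in strings]


tokenizer = WhitespaceTokenizer()
tokens = tokenizer.tokenize(["Hello World", "capreolus tokenizer"])
```

A real module would presumably inherit from the Tokenizer base class and be configured through the constructor arguments shown in the signature; the sketch only mirrors the method shape (list of strings in, tokenized versions out).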