capreolus.tokenizer
Package Contents
Classes
Tokenizer: Base class for Tokenizer modules. The purpose of a Tokenizer is to tokenize strings of text (e.g., as required by an Extractor).
class capreolus.tokenizer.Tokenizer(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.ModuleBase
Base class for Tokenizer modules. The purpose of a Tokenizer is to tokenize strings of text (e.g., as required by an Extractor).
Modules should provide:
- a tokenize(strings) method that takes a list of strings and returns tokenized versions
- a