capreolus.extractor

Package Contents

Classes

Extractor()
EmbedText()
BertText()
capreolus.extractor.logger
class capreolus.extractor.Extractor

Bases: profane.ModuleBase

module_type = extractor
cache_state(self, qids, docids)
load_state(self, qids, docids)
get_state_cache_file_path(self, qids, docids)

Returns the path to the cache file used to store the extractor state, whether or not that file exists.

is_state_cached(self, qids, docids)

Returns a boolean indicating whether the state corresponding to the given qids and docids has already been cached.
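
The relationship between get_state_cache_file_path and is_state_cached can be illustrated with a minimal sketch. It is not taken from the capreolus source: the standalone functions, the cache_dir argument, and the choice of an MD5 hash over the sorted ids are all assumptions made for illustration. The point is only that the path is computed deterministically from the ids, and the cached check reduces to testing whether that path exists.

```python
import hashlib
import tempfile
from pathlib import Path


def state_cache_path(cache_dir, qids, docids):
    """Hypothetical helper: derive a cache file path from the ids alone."""
    # Sort the ids so the same set of ids always maps to the same file name.
    key = str(sorted(qids) + sorted(docids)).encode("utf-8")
    return Path(cache_dir) / hashlib.md5(key).hexdigest()


def is_state_cached(cache_dir, qids, docids):
    """Hypothetical helper: the state is cached if the derived path exists."""
    return state_cache_path(cache_dir, qids, docids).exists()


with tempfile.TemporaryDirectory() as tmp:
    path = state_cache_path(tmp, ["q2", "q1"], ["d1"])
    before = is_state_cached(tmp, ["q1", "q2"], ["d1"])  # nothing written yet
    path.write_text("state")                             # stand-in for cache_state
    after = is_state_cached(tmp, ["q1", "q2"], ["d1"])   # same ids, any order
```

Because the path depends only on the ids, it can be computed before anything has been written, which matches the documented behaviour of returning a path regardless of whether the file exists.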

build_from_benchmark(self, *args, **kwargs)
class capreolus.extractor.EmbedText

Bases: capreolus.extractor.Extractor

module_name = embedtext
requires_random_seed = True
dependencies
config_spec
pad = 0
pad_tok = <pad>
embed_paths
load_state(self, qids, docids)
cache_state(self, qids, docids)
get_tf_feature_description(self)
create_tf_feature(self, sample)

sample - output from self.id2vec()
return - a TensorFlow feature

parse_tf_example(self, example_proto)
exist(self)
preprocess(self, qids, docids, topics)
id2vec(self, qid, posid, negid=None)
class capreolus.extractor.BertText

Bases: capreolus.extractor.Extractor

module_name = berttext
dependencies
config_spec
pad = 0
pad_tok = <pad>
static config()
load_state(self, qids, docids)
cache_state(self, qids, docids)
get_tf_feature_description(self)
create_tf_feature(self, sample)

sample - output from self.id2vec()
return - a TensorFlow feature

parse_tf_example(self, example_proto)
exist(self)
preprocess(self, qids, docids, topics)
id2vec(self, qid, posid, negid=None)
get_mask(self, doc, to_len)

Returns a mask that is 1 for actual tokens and 0 for pad tokens.
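
As an illustration of what such a mask looks like, here is a minimal standalone sketch. It is not the library's implementation: the function name sketch_mask is hypothetical, and it assumes that doc is a list of tokens without padding and that to_len is the padded length, so documents longer than to_len are truncated.

```python
def sketch_mask(doc, to_len):
    """Hypothetical helper: 1 for each real token, 0 for each pad position."""
    # Assumption: tokens beyond to_len are dropped, so at most to_len are real.
    n_real = min(len(doc), to_len)
    # Real tokens come first, followed by padding up to to_len.
    return [1] * n_real + [0] * (to_len - n_real)


short_mask = sketch_mask(["what", "is", "bm25"], 5)
long_mask = sketch_mask(["a", "b", "c", "d", "e", "f"], 4)
```

Under these assumptions, short_mask has three ones followed by two zeros, and long_mask is all ones because the document fills the whole padded length.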