`capreolus.extractor.pooled_bertpassage`¶

Module Contents¶

Classes¶

PooledBertPassage: Extracts passages from a document to be consumed later by a BERT-based model.

capreolus.extractor.pooled_bertpassage.logger[source]¶

class capreolus.extractor.pooled_bertpassage.PooledBertPassage(config=None, provide=None, share_dependency_objects=False, build=True)[source]¶

Bases: capreolus.extractor.bertpassage.BertPassage

Extracts passages from a document to be consumed later by a BERT-based model. Unlike BertPassage, all passages from a document "stick together" during training: the resulting features always have the shape (batch, num_passages, maxseqlen). This allows the reranker to pool over passages from the same document during training.
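Because passages stay grouped per document, a reranker can reduce per-passage scores to one score per document. The following is a minimal sketch of that pooling step in plain Python, not the Capreolus implementation: the function name `pool_passage_scores`, the `mode` argument, and the example scores are all hypothetical and chosen only for illustration.

```python
from typing import List


def pool_passage_scores(passage_scores: List[List[float]], mode: str = "max") -> List[float]:
    """Reduce scores of shape (batch, num_passages) to shape (batch,).

    Hypothetical helper: each inner list holds the scores of all
    passages that belong to one document.
    """
    if mode == "max":
        return [max(doc_scores) for doc_scores in passage_scores]
    if mode == "mean":
        return [sum(doc_scores) / len(doc_scores) for doc_scores in passage_scores]
    raise ValueError(f"unknown pooling mode: {mode}")


# Two documents (batch = 2), three passages each (num_passages = 3).
scores = [[0.1, 0.7, 0.3], [0.4, 0.2, 0.0]]
max_pooled = pool_passage_scores(scores, mode="max")
mean_pooled = pool_passage_scores(scores, mode="mean")
```

Pooling of this kind is only possible when every passage of a document arrives in the same training example, which is what the fixed (batch, num_passages, maxseqlen) layout provides.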

module_name = pooledbertpassage[source]¶

dependencies[source]¶

pad = 0[source]¶

pad_tok = [PAD][source]¶

config_spec[source]¶

create_tf_train_feature(self, sample)[source]¶

Returns a set of features from a document. Only a subset of the num_passages passages in the document is used.

Parameters: sample - a dict in which each entry has the shape [batch_size, num_passages, maxseqlen].

Returns a list of features. Each feature is a dict, and each value in the dict has the shape [batch_size, maxseqlen]. The output shape differs from the input shape because passages are sampled.
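The shape change described above can be sketched with nested lists. This is an illustration under stated assumptions rather than the actual method body: the helper name `sample_passage_features`, the `num_to_keep` and `seed` parameters, the uniform random selection rule, and the `input_ids` field name are all hypothetical.

```python
import random


def sample_passage_features(sample, num_to_keep, seed=0):
    """Turn one dict of [batch_size, num_passages, maxseqlen] values into a
    list of dicts whose values have the shape [batch_size, maxseqlen].

    Assumption: passages are chosen uniformly at random; the real
    selection rule may differ.
    """
    rng = random.Random(seed)
    first_value = next(iter(sample.values()))
    num_passages = len(first_value[0])
    chosen = sorted(rng.sample(range(num_passages), num_to_keep))

    features = []
    for passage_idx in chosen:
        # Slice the same passage position out of every field and every document.
        feature = {
            name: [doc[passage_idx] for doc in value]
            for name, value in sample.items()
        }
        features.append(feature)
    return features


# batch_size = 2, num_passages = 3, maxseqlen = 2
sample = {
    "input_ids": [
        [[1, 2], [3, 4], [5, 6]],
        [[7, 8], [9, 10], [11, 12]],
    ]
}
features = sample_passage_features(sample, num_to_keep=2)
```

Each returned dict drops the num_passages axis, so the list length equals the number of passages kept.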

create_tf_dev_feature(self, sample)[source]¶

Unlike the train feature, the dev feature uses all passages. Both the input and the output are dicts whose values have the shape [batch_size, num_passages, maxseqlen].

parse_tf_train_example(self, example_proto)[source]¶

parse_tf_dev_example(self, example_proto)[source]¶

id2vec(self, qid, posid, negid=None, label=None)[source]¶

See the parent class, BertPassage, for the docstring.
