capreolus.extractor.bertpassage
Module Contents
Classes
BertPassage | Extracts passages from the document to be later consumed by a BERT-based model.
Attributes
- class capreolus.extractor.bertpassage.BertPassage(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.extractor.Extractor, capreolus.extractor.common.SingleTrainingPassagesMixin
Extracts passages from the document to be later consumed by a BERT-based model. Does NOT use all the passages: the first passage is always used, and the prob config controls the probability of each other passage being selected.
Gotcha: In TensorFlow, the train TFRecords have shape (batch_size, maxseqlen), while the dev TFRecords have shape (batch_size, num_passages, maxseqlen). This is because, during inference, we want to pool over the scores of the passages belonging to a document.
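The behaviour described above can be illustrated with a minimal sketch in plain Python. It is not the Capreolus implementation: the helper names select_passages and pool_doc_scores are hypothetical, the sampling scheme is only one plausible reading of the prob config, and max pooling is an assumption, since the docstring does not say which pooling function is applied.

```python
import random


def select_passages(passages, prob, rng):
    """Keep the first passage always; keep each later passage with probability `prob`.

    Hypothetical helper: one plausible reading of the `prob` config.
    """
    if not passages:
        return []
    kept = [passages[0]]  # the first passage is always used
    for passage in passages[1:]:
        if rng.random() < prob:
            kept.append(passage)
    return kept


def pool_doc_scores(passage_scores, pool=max):
    """Collapse per-passage scores of shape (batch_size, num_passages)
    into one score per document, shape (batch_size,).

    Assumption: max pooling; the real model may pool differently.
    """
    return [pool(doc_scores) for doc_scores in passage_scores]


# Training side: passages are sampled, so each example is a single
# (maxseqlen,) passage and a batch has shape (batch_size, maxseqlen).
passages = ["p0", "p1", "p2", "p3"]
selected = select_passages(passages, prob=0.5, rng=random.Random(0))

# Inference side: every document keeps all its passages, so the model
# produces (batch_size, num_passages) scores that are pooled per document.
dev_scores = [[0.1, 0.7, 0.3], [0.9, 0.2, 0.4]]
doc_scores = pool_doc_scores(dev_scores)
```

Here dev_scores stands in for the per-passage scores of two documents with three passages each; pooling reduces them to one score per document, which is why the dev records keep the extra num_passages dimension.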