capreolus.sampler
Package Contents
Classes
Sampler | Base class for profane modules.
TrainTripletSampler | Samples training data triplets. Each sample is of the form (query, relevant doc, non-relevant doc).
TrainPairSampler | Samples training data pairs. Each sample is of the form (query, doc).
PredSampler | Creates a Dataset for evaluation (test) data to be used with a PyTorch DataLoader.
class capreolus.sampler.Sampler(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.ModuleBase
Base class for profane modules. Module construction proceeds as follows:
1) Any config options not present in config are filled in with their default values. Config options and their defaults are specified in the config_spec class attribute.
2) Any dependencies declared in the dependencies class attribute are recursively instantiated. If the dependency object is present in provide, this object will be used instead of instantiating a new object for the dependency.
3) The module object’s config variable is updated to reflect the configs of its dependencies and then frozen.
After construction is complete, the module’s dependencies are available as instance variables: self.`dependency key`.
Parameters:
- config – dictionary containing a config to apply to this module and its dependencies
- provide – dictionary mapping dependency keys to module objects
- share_dependency_objects – if true, dependencies will be cached in the registry based on their configs and reused. See the share_objects argument of ModuleBase.create.
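The three construction steps above can be sketched in plain Python. This is only an illustration under stated assumptions: ToyModule, ToyExtractor, and ToySampler are hypothetical stand-ins, config_spec and dependencies are simplified to plain dicts, and the option names (seed, maxdoclen) are made up. The real ModuleBase may represent and validate these differently.

```python
from types import MappingProxyType


class ToyModule:
    """Hypothetical stand-in for a module base class (not the real ModuleBase)."""

    # Assumed simplification: option name -> default value.
    config_spec = {}
    # Assumed simplification: dependency key -> module class.
    dependencies = {}

    def __init__(self, config=None, provide=None):
        config = dict(config or {})
        provide = provide or {}

        # Step 1: fill in defaults for any option missing from config.
        merged = {name: config.get(name, default) for name, default in self.config_spec.items()}

        # Step 2: instantiate each dependency unless an object was provided.
        for key, cls in self.dependencies.items():
            dep = provide.get(key)
            if dep is None:
                dep = cls(config=config.get(key), provide=provide)
            # Dependencies become instance variables named after their key.
            setattr(self, key, dep)
            # Step 3 (first half): reflect the dependency's config.
            merged[key] = dep.config

        # Step 3 (second half): freeze the config with a read-only view.
        self.config = MappingProxyType(merged)


class ToyExtractor(ToyModule):
    config_spec = {"maxdoclen": 800}


class ToySampler(ToyModule):
    config_spec = {"seed": 42}
    dependencies = {"extractor": ToyExtractor}


# The dependency is created automatically with its default config.
sampler = ToySampler(config={"seed": 7})

# A pre-built dependency passed through provide is reused as-is.
shared = ToyExtractor(config={"maxdoclen": 256})
sampler2 = ToySampler(provide={"extractor": shared})
```

In this sketch, sampler.extractor is a freshly built ToyExtractor, while sampler2.extractor is the shared object, and assigning to either config raises a TypeError because the mapping is read-only.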
prepare(self, qid_to_docids, qrels, extractor, relevance_level=1, **kwargs)
Parameters:
- qid_to_docids – a dict of the form {qid: [list of docids to rank]}
- qrels – a dict of the form {qid: {docid: label}}
- extractor – an Extractor instance (e.g., EmbedText)
- relevance_level – threshold label; documents whose label is below this value are considered non-relevant
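To make the expected input shapes and the effect of relevance_level concrete, here is a minimal sketch. The helper split_by_relevance is hypothetical and not part of capreolus, and treating unjudged documents as label 0 is an assumption made for illustration.

```python
def split_by_relevance(qid_to_docids, qrels, relevance_level=1):
    """Hypothetical helper: split each query's candidates by relevance label."""
    qid_to_reldocs, qid_to_negdocs = {}, {}
    for qid, docids in qid_to_docids.items():
        labels = qrels.get(qid, {})
        # Assumption: a docid with no judgment is treated as label 0.
        qid_to_reldocs[qid] = [d for d in docids if labels.get(d, 0) >= relevance_level]
        qid_to_negdocs[qid] = [d for d in docids if labels.get(d, 0) < relevance_level]
    return qid_to_reldocs, qid_to_negdocs


# {qid: [list of docids to rank]}
qid_to_docids = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
# {qid: {docid: label}}
qrels = {"q1": {"d1": 2, "d2": 0}, "q2": {"d4": 1}}

reldocs, negdocs = split_by_relevance(qid_to_docids, qrels, relevance_level=1)
strict_reldocs, strict_negdocs = split_by_relevance(qid_to_docids, qrels, relevance_level=2)
```

Raising relevance_level to 2 moves d4 (label 1) from the relevant list to the non-relevant list for q2.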
class capreolus.sampler.TrainTripletSampler(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.sampler.Sampler, torch.utils.data.IterableDataset
Samples training data triplets. Each sample is of the form (query, relevant doc, non-relevant doc).
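One way such triplets might be generated is sketched below as a plain Python generator. The function generate_triplets, its arguments, and the uniform random choice of documents are assumptions for illustration. The actual class may use a different strategy, and it yields extractor features, not raw IDs.

```python
import itertools
import random


def generate_triplets(qid_to_reldocs, qid_to_negdocs, seed=0):
    """Hypothetical generator of (qid, relevant docid, non-relevant docid)."""
    rng = random.Random(seed)
    # Only queries with at least one document of each kind can form a triplet.
    qids = sorted(q for q in qid_to_reldocs if qid_to_reldocs[q] and qid_to_negdocs.get(q))
    if not qids:
        raise ValueError("no query has both a relevant and a non-relevant document")
    # Endless stream, in the spirit of an iterable-style dataset.
    while True:
        rng.shuffle(qids)
        for qid in qids:
            yield qid, rng.choice(qid_to_reldocs[qid]), rng.choice(qid_to_negdocs[qid])


triplet_reldocs = {"q1": ["d1"], "q2": ["d4"], "q3": ["d6"]}
triplet_negdocs = {"q1": ["d2", "d3"], "q2": ["d5"], "q3": []}

# Take a finite number of samples from the endless generator.
triplets = list(itertools.islice(generate_triplets(triplet_reldocs, triplet_negdocs), 6))
```

Here q3 never appears in a triplet because it has no non-relevant document to pair with its relevant one.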
class capreolus.sampler.TrainPairSampler(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.sampler.Sampler, torch.utils.data.IterableDataset
Samples training data pairs. Each sample is of the form (query, doc). The numbers of generated positive and negative samples are the same.
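A simple way to keep positive and negative samples balanced is to emit them alternately, as in the sketch below. The function generate_pairs and the explicit 0/1 label attached to each pair are assumptions for illustration, not the class's actual output format.

```python
import itertools
import random


def generate_pairs(qid_to_reldocs, qid_to_negdocs, seed=0):
    """Hypothetical generator of (qid, docid, label) with balanced labels."""
    rng = random.Random(seed)
    # Only queries with at least one document of each kind are usable.
    qids = sorted(q for q in qid_to_reldocs if qid_to_reldocs[q] and qid_to_negdocs.get(q))
    if not qids:
        raise ValueError("no query has both a relevant and a non-relevant document")
    while True:
        rng.shuffle(qids)
        for qid in qids:
            # One positive followed by one negative keeps the counts equal.
            yield qid, rng.choice(qid_to_reldocs[qid]), 1
            yield qid, rng.choice(qid_to_negdocs[qid]), 0


pair_reldocs = {"q1": ["d1"], "q2": ["d4"]}
pair_negdocs = {"q1": ["d2", "d3"], "q2": ["d5"]}

pairs = list(itertools.islice(generate_pairs(pair_reldocs, pair_negdocs), 8))
num_pos = sum(1 for _, _, label in pairs if label == 1)
num_neg = sum(1 for _, _, label in pairs if label == 0)
```

Because samples are emitted in positive/negative pairs, any even number of draws contains equally many of each label.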
class capreolus.sampler.PredSampler(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.sampler.Sampler, torch.utils.data.IterableDataset
Creates a Dataset for evaluation (test) data to be used with a PyTorch DataLoader.
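Unlike the training samplers, an evaluation dataset needs to visit every candidate (query, doc) pair exactly once. The sketch below shows that enumeration in plain Python. The function iter_candidates and the dict layout of each item are hypothetical, and a real implementation would yield extractor features from the `__iter__` method of an IterableDataset.

```python
def iter_candidates(qid_to_docids):
    """Hypothetical enumeration of every (qid, docid) pair, once each."""
    # Sorting the qids makes the iteration order reproducible.
    for qid in sorted(qid_to_docids):
        for docid in qid_to_docids[qid]:
            yield {"qid": qid, "docid": docid}


candidates = list(iter_candidates({"q2": ["d4", "d5"], "q1": ["d1"]}))
```

Keeping the qid and docid in each item lets predicted scores be mapped back to the document they belong to when building a ranking.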