:mod:`capreolus.extractor`
==========================

.. py:module:: capreolus.extractor


Submodules
----------
.. toctree::
   :titlesonly:
   :maxdepth: 1

   bagofwords/index.rst
   deeptileextractor/index.rst


Package Contents
----------------

Classes
~~~~~~~

.. autoapisummary::

   capreolus.extractor.Extractor
   capreolus.extractor.EmbedText
   capreolus.extractor.BertText


.. data:: logger


.. py:class:: Extractor

   Bases: :class:`profane.ModuleBase`

   .. attribute:: module_type
      :annotation: = extractor

   .. method:: cache_state(self, qids, docids)

   .. method:: load_state(self, qids, docids)

   .. method:: get_state_cache_file_path(self, qids, docids)

      Returns the path to the cache file used to store the extractor state, whether or not that file exists.

   .. method:: is_state_cached(self, qids, docids)

      Returns a boolean indicating whether the state corresponding to the given qids and docids has already been cached.

   .. method:: build_from_benchmark(self, *args, **kwargs)


.. py:class:: EmbedText

   Bases: :class:`capreolus.extractor.Extractor`

   .. attribute:: module_name
      :annotation: = embedtext

   .. attribute:: requires_random_seed
      :annotation: = True

   .. attribute:: dependencies

   .. attribute:: config_spec

   .. attribute:: pad
      :annotation: = 0

   .. attribute:: pad_tok
      :annotation: =

   .. attribute:: embed_paths

   .. method:: load_state(self, qids, docids)

   .. method:: cache_state(self, qids, docids)

   .. method:: get_tf_feature_description(self)

   .. method:: create_tf_feature(self, sample)

      :param sample: output from ``self.id2vec()``
      :returns: a TensorFlow feature

   .. method:: parse_tf_example(self, example_proto)

   .. method:: exist(self)

   .. method:: preprocess(self, qids, docids, topics)

   .. method:: id2vec(self, qid, posid, negid=None)


.. py:class:: BertText

   Bases: :class:`capreolus.extractor.Extractor`

   .. attribute:: module_name
      :annotation: = berttext

   .. attribute:: dependencies

   .. attribute:: config_spec

   .. attribute:: pad
      :annotation: = 0

   .. attribute:: pad_tok
      :annotation: =

   .. staticmethod:: config()

   .. method:: load_state(self, qids, docids)

   .. method:: cache_state(self, qids, docids)

   .. method:: get_tf_feature_description(self)

   .. method:: create_tf_feature(self, sample)

      :param sample: output from ``self.id2vec()``
      :returns: a TensorFlow feature

   .. method:: parse_tf_example(self, example_proto)

   .. method:: exist(self)

   .. method:: preprocess(self, qids, docids, topics)

   .. method:: id2vec(self, qid, posid, negid=None)

   .. method:: get_mask(self, doc, to_len)

      Returns a mask that is 1 for actual tokens and 0 for padding tokens.
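
The mask described for ``BertText.get_mask`` can be illustrated with a small standalone sketch. The function name ``sketch_padding_mask`` is hypothetical and is not part of capreolus, and the truncation of documents longer than ``to_len`` is an assumption made for the example; the library's actual behavior may differ.

.. code-block:: python

   from typing import List, Sequence


   def sketch_padding_mask(doc: Sequence[str], to_len: int) -> List[int]:
       """Hypothetical stand-in for a get_mask-style helper (not the capreolus code)."""
       if to_len < 0:
           raise ValueError("to_len must be non-negative")
       # Assumption: a document longer than to_len is truncated to to_len tokens.
       kept = min(len(doc), to_len)
       # 1 marks a position holding a real token, 0 marks a padding position.
       return [1] * kept + [0] * (to_len - kept)


   mask = sketch_padding_mask(["neural", "ranking", "models"], to_len=5)
   print(mask)  # [1, 1, 1, 0, 0]

A mask of this shape is typically passed alongside the padded token ids so that a model can ignore the padding positions.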