Installation
This section covers installing Capreolus via its pip package or from source.
Prerequisites
Capreolus requires both Python 3.7+ and Java 11.
The easiest way to install these dependencies is by using the Conda package manager as described in this guide to installing Miniconda and Python 3.
We recommend installing Capreolus into its own Conda environment using the provided environment.yml file:
wget https://raw.githubusercontent.com/capreolus-ir/capreolus/master/environment.yml
conda env create --name MyCapreolus -f environment.yml
conda activate MyCapreolus
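After activating the environment, a quick version check can confirm that both prerequisites are available. The following is a minimal sketch and not part of Capreolus; it assumes the interpreter is reachable as python3 and the Java runtime as java, and it only reports what it finds.

```shell
# Python: exit status 0 only if the interpreter is version 3.7 or newer.
if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 7) else 1)' 2>/dev/null; then
    python_status="ok"
else
    python_status="missing or older than 3.7"
fi
echo "Python 3.7+: $python_status"

# Java: "java -version" writes to stderr, so redirect it to read the first line.
if command -v java >/dev/null 2>&1; then
    java_version="$(java -version 2>&1 | head -n 1)"
else
    java_version="not found on PATH"
fi
echo "Java (expected: 11): $java_version"
```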
Installing Capreolus via pip
- Activate the appropriate environment (if using conda): conda activate MyCapreolus
- Install the package: pip install capreolus
- You can now use Capreolus on the command line via the capreolus command
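If the command is not found after installation, the wrong environment is often active. As a small sketch (it uses only the standard shell builtin command -v, and the variable name capreolus_path is purely illustrative), the following checks whether the capreolus executable is on PATH:

```shell
# Look up the executable; "|| true" keeps the lookup from aborting strict shells.
capreolus_path="$(command -v capreolus || true)"

if [ -n "$capreolus_path" ]; then
    echo "capreolus found at: $capreolus_path"
else
    echo "capreolus is not on PATH; check that the intended environment is active"
fi
```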
Configuring Capreolus
Capreolus uses environment variables to indicate where outputs should be stored and where document inputs can be found. Consult the list below to determine which variables should be set. Set these environment variables either on the fly (export CAPREOLUS_RESULTS=...) before running Capreolus or by editing your shell’s initialization files (e.g., ~/.bashrc or ~/.zshrc).
- CAPREOLUS_RESULTS: directory where results are stored (default: ~/.capreolus/results/)
- CAPREOLUS_CACHE: directory where cache files are stored (default: ~/.capreolus/cache/)
- CAPREOLUS_LOGGING: indicates the logging level: DEBUG, INFO (default), WARN or ERROR
- CUDA_VISIBLE_DEVICES: indicates GPUs available to PyTorch, starting from 0. For example, setting to ‘1’ will use the system’s 2nd GPU (as numbered by nvidia-smi). Set to “” (an empty string) to force CPU.
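As a minimal sketch of setting the variables listed above for the current shell session: the base directory $HOME/capreolus-data is a hypothetical location chosen for illustration, and DEBUG is an arbitrary pick from the documented logging levels.

```shell
# Hypothetical base directory; pick a location on a disk with enough free space.
capreolus_base="$HOME/capreolus-data"

export CAPREOLUS_RESULTS="$capreolus_base/results"
export CAPREOLUS_CACHE="$capreolus_base/cache"
export CAPREOLUS_LOGGING="DEBUG"   # one of DEBUG, INFO, WARN, ERROR

# Print the values that subsequently launched commands will inherit.
env | grep '^CAPREOLUS_'
```

Appending the same export lines to ~/.bashrc or ~/.zshrc makes the settings apply to every new shell.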
To avoid confusion and failed experiments due to limited disk space, we recommend always setting CAPREOLUS_RESULTS and CAPREOLUS_CACHE rather than relying on the default behavior. Typically, CUDA_VISIBLE_DEVICES is set immediately before running an experiment (e.g., to run several separate experiments on different GPUs in parallel).
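Setting CUDA_VISIBLE_DEVICES immediately before a command relies on a standard shell feature: an assignment written in front of a command applies to that command only. The sketch below illustrates the pattern with sh -c as a stand-in for an experiment, since the actual capreolus invocation depends on your setup; the job labels are illustrative.

```shell
# Each assignment is scoped to the single command that follows it.
# "sh -c ..." stands in for an experiment; substitute your own command.
CUDA_VISIBLE_DEVICES=0 sh -c 'echo "job A sees CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"' &
CUDA_VISIBLE_DEVICES=1 sh -c 'echo "job B sees CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"' &

# Block until both background jobs have finished.
wait
```

Because the two jobs run in parallel, the order of their output lines may vary between runs.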
You’re now ready to run capreolus.
Alternate installation approaches
This section describes alternate ways to install Capreolus. We strongly recommend installing via pip when possible (as described above).
Installing Capreolus from source
- Clone the Capreolus repository: git clone https://github.com/capreolus-ir/capreolus
- You should now have a capreolus folder that contains various files as well as another capreolus folder, which contains the actual capreolus Python package. This is a common layout for Python packages; the inside folder (i.e., capreolus/capreolus) corresponds to the Python package.
- Change into the outer folder: cd capreolus
- Install PyTorch
- Install the dependencies listed in requirements.txt: pip install -r requirements.txt
- You can now use Capreolus on the command line via the scripts/capreolus command. Note that this only works from the outer capreolus directory; to run it from elsewhere, you will need to adjust PYTHONPATH.
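One way to adjust PYTHONPATH is to prepend the outer capreolus directory, so that Python can locate the inner package from any working directory. This is a sketch under assumptions: the clone location $HOME/src/capreolus is hypothetical, and whether scripts/capreolus needs anything beyond this is not covered here.

```shell
# Hypothetical clone location; replace with the path to your outer capreolus directory.
CAPREOLUS_REPO="$HOME/src/capreolus"

# Prepend the outer directory. The ${VAR:+...} form adds the ":" separator
# only when PYTHONPATH already has a value, avoiding a stray trailing colon.
export PYTHONPATH="$CAPREOLUS_REPO${PYTHONPATH:+:$PYTHONPATH}"

echo "PYTHONPATH=$PYTHONPATH"
```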