Python Environment
To run your job, you need a reliable way to set up your Python environment. First decide on the right Python version, then pick a package management system to install dependencies.
Python version
As a rule of thumb, find the latest version that has been out for at least a quarter of a year, then pick the version before it. This helps you avoid most compatibility and stability problems. At the time of writing (October 12, 2021), this would be 3.8: the latest version is 3.10, but it was released only about a week earlier (October 4, 2021), so the latest version that qualifies is 3.9, and one version behind that is 3.8.
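Once you have settled on a version, it can help to confirm that the interpreter your session provides matches it. The following is a minimal sketch, not part of the original setup: the function names python_minor and check_python_minor are hypothetical, and it assumes the version string has the usual "Python X.Y.Z" form.

```shell
# Hypothetical helper: reduce a string such as "Python 3.8.7" to "3.8".
python_minor() {
    # Drop the leading "Python " label, then keep the first two dot-separated fields.
    printf '%s\n' "${1#Python }" | cut -d. -f1,2
}

# Hypothetical helper: succeed only if the reported version matches the wanted minor version.
check_python_minor() {
    wanted="$1"
    reported="$2"
    actual="$(python_minor "$reported")"
    if [ "$actual" = "$wanted" ]; then
        echo "ok: Python $actual"
    else
        echo "mismatch: wanted Python $wanted, found $actual" >&2
        return 1
    fi
}
```

One way to use it is to pass the output of the interpreter itself, for example `check_python_minor 3.8 "$(python --version 2>&1)"`.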
Package management system
There are two major package management systems for Python: pip and conda. There are some crucial differences between them, but both do their jobs well, so the choice comes down to personal preference. You can read more about the similarities and differences between them in this blog post.
How to set it up
pip
- `ssh` into your login machine.
- Clone your repository to your preferred location (somewhere in your home directory, `/scratch`, etc.).
- Load your Python version using `module load`. `module avail` shows all the available modules.
  - For example, at the time of writing, `python/3.8.7` is available, so you'd issue `module load python/3.8.7`.
- Create a virtual environment by issuing `virtualenv venv`. This creates a directory called `venv` that holds your virtual environment.
- Activate your virtual environment by issuing `source venv/bin/activate`.
  - You can deactivate it by issuing `deactivate`.
- While your virtual environment is activated, use `pip install` commands to install your desired packages.
Then, your job script should include the following commands before your main command:
module load python/3.8.7
source venv/bin/activate
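If the activation step fails silently, the job may run against the wrong interpreter. As a hedged sketch, a job script could check right after the activation lines above that a virtual environment is really active. The helper name require_venv is hypothetical; it relies on the VIRTUAL_ENV variable that the virtualenv activate script sets.

```shell
# Hypothetical helper: succeed only if a virtualenv is active in this shell.
# The activate script exports VIRTUAL_ENV with the environment's path.
require_venv() {
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "error: no virtual environment is active" >&2
        return 1
    fi
    echo "using virtual environment: $VIRTUAL_ENV"
}
```

In a job script you might call `require_venv || exit 1` before your main command, so that the job stops early instead of running with the system Python.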
conda
- `ssh` into your login machine.
- Clone your repository to your preferred location (somewhere in your home directory, `/scratch`, etc.).
- Load your Python version using `module load`. `module avail` shows all the available modules.
  - For example, at the time of writing, `python3.8-anaconda/2021.05` is available, so you'd issue `module load python3.8-anaconda/2021.05`.
- Create a virtual environment by issuing `conda create -n <env-name>`.
  - If you want a different Python version, don't load a different conda module. Load the same conda module and create a virtual environment with the Python version you need, for example `conda create -n <env-name> python=3.6`.
- Activate your virtual environment by issuing `conda activate <env-name>`.
  - You can deactivate it by issuing `conda deactivate`.
- While your virtual environment is activated, you can use either `conda install` or `pip install` to install your desired packages.
  - Be careful about conflicts if you use both `conda` and `pip`.
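To keep an eye on the conflict risk mentioned above, it can be useful to see which packages in a conda environment were installed by pip. The sketch below is one possible approach, not part of the original instructions: the function name pip_installed_in_conda_env is hypothetical, and it assumes that `conda list` marks pip-installed packages with `pypi` in its last (channel) column.

```shell
# Hypothetical helper: read `conda list` output on stdin and print the names
# of packages whose last column is "pypi" (assumed to mean installed by pip).
pip_installed_in_conda_env() {
    # Skip the "#" header lines, then match on the final column.
    awk '!/^#/ && $NF == "pypi" { print $1 }'
}
```

With an environment activated, you would run `conda list | pip_installed_in_conda_env` and review the printed packages before installing a conda build of the same package.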
Troubleshooting
If you are having trouble activating your environment through the Slurm .sh script, try adding these lines:
For python3.10-anaconda
module load python3.10-anaconda
source activate base
conda activate <env-name>
For python3.8-anaconda/2021.05
module load python3.8-anaconda/2021.05
# we need to source this script due to this issue: https://github.com/conda/conda/issues/7980
source /sw/arcts/centos7/python3.8-anaconda/2021.05/etc/profile.d/conda.sh
conda activate <env-name>
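After either set of lines, a job script can confirm that the intended environment is the active one before the main command runs. This is a minimal sketch under stated assumptions: the helper name require_conda_env is hypothetical, and it relies on the CONDA_DEFAULT_ENV variable, which `conda activate` sets to the environment name for environments created with `-n`.

```shell
# Hypothetical helper: succeed only if the named conda environment is active.
require_conda_env() {
    wanted="$1"
    if [ "${CONDA_DEFAULT_ENV:-}" != "$wanted" ]; then
        echo "error: expected conda environment '$wanted', found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
    echo "conda environment active: $wanted"
}
```

Calling `require_conda_env <env-name> || exit 1` right after `conda activate <env-name>` makes an activation failure show up in the job log immediately.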