[slurm-users] lmod and slurm

Loris Bennett loris.bennett at fu-berlin.de
Tue Dec 19 06:11:52 MST 2017


Hi Yair,

Yair Yarom <irush at cs.huji.ac.il> writes:

> Hi list,
>
> We use Lmod[1] here for software/version management. We have encountered
> two issues so far:
>
> 1. The submission node can have different software than the execution
>    nodes - different CPUs, different GPUs (if any), InfiniBand, etc. When
>    a user runs 'module load something' on the submission node, the
>    wrong environment is passed to the task on the execution node,
>    e.g. "module load tensorflow" can load a different version
>    depending on the node.
>
> 2. There are some modules we want to load by default, and again this can
>    be different between nodes (we do this by source'ing /etc/lmod/lmodrc
>    and ~/.lmodrc).
>
> For issue 1, we instruct users to run the "module load" in their batch
> script and not before running sbatch, but issue 2 is more problematic.
>
> My current solution is to write a TaskProlog script that runs "module
> purge" and "module load" and exports/unsets the changed environment
> variables. I was wondering whether anyone has encountered this issue and
> has a less cumbersome solution.
>
> Thanks in advance,
>     Yair.
>
> [1] https://www.tacc.utexas.edu/research-development/tacc-projects/lmod

I don't fully understand your use case but, assuming you can divide
your nodes up by some feature, could you define one module per feature
which just loads the specific modules needed for that category? In the
batch file you would then have, e.g.,

   #SBATCH --constraint=shiny_and_new

   module add ${SLURM_CONSTRAINT}

and would have a module file 'shiny_and_new', with contents like, say,

  module add tensorflow/2.0
  module add cuda/9.0

whereas the module 'rusty_and_old' would contain

  module add tensorflow/0.1
  module add cuda/0.2
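
A rough sketch of how the batch script might pick the meta-module: rather
than relying on ${SLURM_CONSTRAINT} being set inside the job, which is an
assumption in the example above, it asks sinfo for the features of the
node running the script and loads the first one for which a meta-module
is known. The function name pick_feature_module is hypothetical, and the
feature names are the illustrative ones used above.

```shell
#!/bin/bash
#SBATCH --constraint=shiny_and_new

# Print the first entry of a comma-separated feature list ($1) that matches
# one of the meta-module names given as the remaining arguments.
# Returns non-zero if nothing matches.
pick_feature_module() {
    local features=$1 feature known
    shift
    local IFS=,
    for feature in $features; do
        for known in "$@"; do
            if [ "$feature" = "$known" ]; then
                printf '%s\n' "$feature"
                return 0
            fi
        done
    done
    return 1
}

# Only act inside a Slurm job where the module command is available.
if [ -n "${SLURM_JOB_ID:-}" ] && type module >/dev/null 2>&1; then
    # %f prints the features of the node; SLURMD_NODENAME is the node
    # running the batch script.
    node_features=$(sinfo -h -n "$SLURMD_NODENAME" -o '%f')
    if meta=$(pick_feature_module "$node_features" shiny_and_new rusty_and_old); then
        module add "$meta"
    else
        echo "no meta-module for features: $node_features" >&2
    fi
fi
```

Looking the features up on the execution node means the same batch script
works unchanged if the constraint is later widened to several node types.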

Would that help?

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de


