[slurm-users] I just had a "conversation" with ChatGPT about working DMTCP, OpenMPI and SLURM. Here are the results

Analabha Roy hariseldon99 at gmail.com
Sun Feb 19 20:42:44 UTC 2023


Hi,

Thanks for the advice. I already tried out MANA, but at present it only
works with MPICH, not OpenMPI, which is what I've set up via Ubuntu.


AR


On Sun, 19 Feb 2023, 02:10 Christopher Samuel, <chris at csamuel.org> wrote:

> On 2/10/23 11:06 am, Analabha Roy wrote:
>
> > I'm having some complex issues coordinating OpenMPI, SLURM, and DMTCP in
> > my cluster.
>
> If you're looking to try checkpointing MPI applications you may want to
> experiment with the MANA ("MPI-Agnostic, Network-Agnostic MPI") plugin
> for DMTCP here: https://github.com/mpickpt/mana
>
> We (NERSC) are collaborating with the developers and it is installed on
> Cori (our older Cray system) for people to experiment with. The
> documentation for it may be useful to others who'd like to try it out -
> it's got a nice description of how it works too which even I as a
> non-programmer can understand.
> https://docs.nersc.gov/development/checkpoint-restart/mana/
>
> Pay special attention to the caveats in our docs though!
>
> I've not used it myself, though I'm peripherally involved to give advice
> on system-related issues.
>
> All the best,
> Chris
> --
> Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
>
>
>