[slurm-users] Multinode MPI job
Thomas M. Payerle
payerle at umd.edu
Wed Mar 27 18:01:39 UTC 2019
As partition CLUSTER is not in your /etc/slurm/parts file, it was likely
added via the scontrol command.
Presumably you or a colleague created a CLUSTER partition, whether
intentionally or not.
Run
scontrol show partition CLUSTER
to view it.
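A minimal sketch of inspecting and, if it really was created by accident, removing such a partition with scontrol. The partition name CLUSTER comes from this thread; the delete step requires Slurm administrator privileges, and this is only a sketch of the usual workflow, not a prescription:

```shell
# Show the full definition of the suspect partition (nodes, limits, state)
scontrol show partition CLUSTER

# List all partitions currently known to slurmctld
scontrol show partition | grep PartitionName

# An admin can remove a partition created by mistake
# (jobs pending or running in it must be handled first)
scontrol delete PartitionName=CLUSTER
```

Note that partitions created on the fly with scontrol are not written back to slurm.conf, so they also vanish on a slurmctld restart unless someone adds them to the configuration file.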
On Wed, Mar 27, 2019 at 1:44 PM Mahmood Naderan <mahmood.nt at gmail.com> wrote:
> So, it seems that it is not an easy thing at the moment!
> >Partitions are defined by the systems administrators, you'd need to
> >speak with them about their reasoning for those.
> It's me :)
> I haven't defined a partition named CLUSTER.
> On Wed, Mar 27, 2019 at 8:42 PM Christopher Samuel <chris at csamuel.org> wrote:
>> On 3/27/19 8:39 AM, Mahmood Naderan wrote:
>> > mpirun pw.x -i mos2.rlx.in
>> You will need to read the documentation for this:
>> Especially note both of these:
>> IMPORTANT: The ability to execute a single application across more than
>> one job allocation does not work with all MPI implementations or Slurm
>> MPI plugins. Slurm's ability to execute such an application can be
>> disabled on the entire cluster by adding "disable_hetero_steps" to
>> Slurm's SchedulerParameters configuration parameter.
>> IMPORTANT: While the srun command can be used to launch heterogeneous
>> job steps, mpirun would require substantial modification to support
>> heterogeneous applications. We are aware of no such mpirun development
>> efforts at this time.
>> So at the very least you'll need to use srun, not mpirun, and confirm
>> that the MPI implementation you are using supports this Slurm feature.
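A sketch of what an srun-launched heterogeneous job can look like. The program pw.x and input file mos2.rlx.in come from this thread; PART1 and PART2 are placeholder partition names, and the task counts are arbitrary. The syntax is version-dependent: current Slurm releases use "hetjob" and --het-group, while releases from the time of this thread (18.08) used "packjob" and --pack-group instead:

```shell
#!/bin/bash
# First component of the heterogeneous job
#SBATCH --ntasks=8 --partition=PART1
#SBATCH hetjob
# Second component, potentially on different hardware
#SBATCH --ntasks=16 --partition=PART2

# Launch a single MPI application spanning both components.
# Whether the ranks form one MPI_COMM_WORLD depends on the MPI
# implementation and the Slurm MPI plugin in use.
srun --het-group=0,1 pw.x -i mos2.rlx.in
```

As the quoted documentation says, this only works with MPI implementations that support it, and mpirun itself cannot launch across heterogeneous components.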
>> > Also, the partition names are weird. We have these entries:
>> Partitions are defined by the systems administrators, you'd need to
>> speak with them about their reasoning for those.
>> All the best,
>> Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
DIT-ACIGS/Mid-Atlantic Crossroads payerle at umd.edu
5825 University Research Park (301) 405-6135
University of Maryland
College Park, MD 20740-3831