[slurm-users] introduce short delay starting multiple parallel jobs with srun

Fri Nov 10 02:02:25 MST 2017

Thank you for the suggestions, John.
I am only a user, though, so I can't install anything.
The first problem is not even access to the data, but access to a "job list" file. When I started working on these data a month or so ago, I got access to a system with several compute nodes (and eventually also an HPC). So I thought of a way for several compute nodes to work on the same pool of data: create a job list file that each node reads at the start; the node takes the next available data set, flags it as taken, works on it, flags it as done when finished, takes the next available one, and so on. This was the only way I could figure out to make several nodes work together.
Then I got access to an HPC, so I just use the same script and run it 20 times in parallel. Previously, I started the jobs manually on each node, so I didn't have the problem with simultaneous access. Now SLURM starts all tasks simultaneously, they all try to access that job list file, and problems occur. Sometimes a task can't edit it, so it crashes. Sometimes different tasks take the same data set, and then they crash too.
Maybe there is a better way? I don't know how else to make it work on, let's say, 20 data sets at the same time out of 100 total. Job arrays would let me submit a lot of tasks but run only a certain number of them simultaneously, but they are not available on our SLURM.
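One possible way around the collisions on the job list, sketched here as an alternative to editing a shared file: each task claims a data set by creating a marker directory for it. This is only a sketch under assumptions. The function name claim_next, the one-name-per-line list format and the claims directory are all hypothetical, and it assumes that mkdir is atomic on the shared file system and that data set names contain no slashes.

```shell
#!/bin/sh
# Hypothetical helper: claim the next unclaimed data set from a job list.
# Usage: claim_next JOBLIST CLAIM_DIR
# JOBLIST is assumed to hold one data set name per line (no slashes).
# Prints the claimed name and returns 0, or returns 1 if nothing is left.
claim_next() {
    joblist=$1
    claims=$2
    mkdir -p "$claims"
    while IFS= read -r name; do
        [ -n "$name" ] || continue      # skip empty lines
        # mkdir fails if the directory already exists, so only one task
        # can succeed in creating the marker for a given data set.
        if mkdir "$claims/$name" 2>/dev/null; then
            printf '%s\n' "$name"
            return 0
        fi
    done < "$joblist"
    return 1
}

# Possible use inside each task (process_data_set is a placeholder):
#   while ds=$(claim_next joblist.txt claims); do
#       process_data_set "$ds"
#   done
```

Because the job list itself is never modified, tasks that start at the same moment do not need to edit the same file; a second marker (for example a "done" file inside the claim directory) could record completion.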
I think I found a solution yesterday. I can use the PID of the shell. I tried using the PID of the process, but running ps 20+ times is a mess. The shell PID, though, is available as the shell variable $$, so I don't need to run any command; I just use its last two digits as the delay!
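A minimal sketch of that idea, assuming a POSIX shell and a delay measured in seconds (the function name stagger_delay is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: derive a start delay (in seconds) from a numeric ID.
# Modulo 100 keeps the last two digits, giving a value from 0 to 99.
stagger_delay() {
    echo $(( $1 % 100 ))
}

# $$ is the PID of the current shell; no external command is needed.
delay=$(stagger_delay "$$")
echo "shell PID $$ would wait ${delay}s before reading the job list"
# sleep "$delay"    # uncomment in the real job script
```

Two tasks can still end up with the same last two digits, so this makes collisions less likely rather than impossible. If the tasks are started with srun, the SLURM_PROCID variable (the task rank) might be a more predictable source for the delay.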

Best,
Renat.

________________________________________
From: slurm-users [slurm-users-bounces at lists.schedmd.com] On Behalf Of John Hearns [hearnsj at gmail.com]
Sent: Thursday, November 09, 2017 4:39 PM
To: Slurm User Community List
Subject: Re: [slurm-users] introduce short delay starting multiple parallel jobs with srun

Renat,
   I know that this is not going to be helpful.  I can understand that perhaps if you are using NFS storage then 20(*) processes might not be able to open files at the same time.
I would consider the following:

a) looking at your storage. This is why HPC systems have high performance and parallel storage systems.
    You could consider installing a high performance storage system

b) if there is no option to get better storage, then I ask: how is this data being accessed?
    If you have multiple compute nodes, and the data is being read only, then consider copying the data across to TMPDIR on each compute node as a pre-job or at the start of the job.
If the speed of access to the data is critical then you might even consider creating a ramdisk for TMPDIR - then you might even see noticeably better performance.
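As a rough sketch of the copy-to-TMPDIR idea in b), assuming the data is read only and small enough to fit in node-local scratch (the function name stage_data and the fallback to /tmp are illustrative assumptions, not anything your site necessarily provides):

```shell
#!/bin/sh
# Hypothetical helper: copy a read-only data directory to node-local scratch.
# Usage: stage_data SOURCE_DIR
# Prints the path of the local copy on success.
stage_data() {
    src=$1
    scratch=${TMPDIR:-/tmp}                 # fall back to /tmp if TMPDIR is unset
    dest="$scratch/$(basename "$src").$$"   # $$ keeps copies from different jobs apart
    cp -R "$src" "$dest" && printf '%s\n' "$dest"
}

# Possible use at the start of a job (the path is a placeholder):
#   local_data=$(stage_data /path/to/shared/data) || exit 1
#   ... read from "$local_data" instead of the shared storage ...
#   rm -rf "$local_data"
```

The copy itself still reads from the shared storage once per node, so this helps mainly when the same data is read repeatedly during the job.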

(*) 20 - err, that does sound a bit low...

On 9 November 2017 at 15:55, Gennaro Oliva <oliva.g at na.icar.cnr.it> wrote:
Hi Renat,

On Thu, Nov 09, 2017 at 03:46:23PM +0100, Yakupov, Renat /DZNE wrote:
> I tried that. It doesn't even queue the job; it fails with an error:
> sbatch: unrecognized option '--array=1-24'
> sbatch: error: Try help for more information.

What version of Slurm are you using?
Regards
--
Gennaro Oliva