[slurm-users] Job real running time contains EIO time for srun --time

Chenyang Yan memory.yancy at gmail.com
Wed Jan 4 11:24:45 UTC 2023


Hi, all, I have met a problem about job running time. My job running
time test script:
```
[root at slurmctl tmp]# cat test.sh
#!/bin/bash
cleanup()
{
  local now=$(date '+%s')
  echo "now: $(date -d "@$now")"
  echo "difference(start_time-now): $((now - start_time))"
}
trap cleanup EXIT
start_time=$(date '+%s')
echo "sleep 90 second ... start_time: $start_time $(date -d "@$start_time")"
sleep 90
```

Submit job with srun(21.08.6 version):
```
[root at slurmctl tmp]# sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug        up 5-00:00:00      2   idle c[3-4]
normal*      up 5-00:00:00      2   idle c[1-2]
# Expected difference is 60, but getting different value
[root at slurmctl tmp]# srun  --time '0-00:01:00' --nodelist c1 bash test.sh
sleep 90 second ... start_time: 1672823271 Wed Jan  4 17:07:51 CST 2023
srun: Force Terminated job 4
srun: Job step aborted: Waiting up to 5 seconds for job step to finish.
slurmstepd: error: *** STEP 4.0 ON c1 CANCELLED AT 2023-01-04T17:09:00
DUE TO TIME LIMIT ***
test.sh: line 6: 26797 Terminated              sleep 90
now: Wed Jan  4 17:09:00 CST 2023
difference(start_time-now): 69           <==========
srun: error: c1: task 0: Terminated

[root at slurmctl tmp]# srun  --time '0-00:01:00' --nodelist c1 bash test.sh
sleep 90 second ... start_time: 1672823359 Wed Jan  4 17:09:19 CST 2023
srun: Force Terminated job 5
srun: Job step aborted: Waiting up to 5 seconds for job step to finish.
slurmstepd: error: *** STEP 5.0 ON c1 CANCELLED AT 2023-01-04T17:10:30
DUE TO TIME LIMIT ***
test.sh: line 6: 26879 Terminated              sleep 90
now: Wed Jan  4 17:10:30 CST 2023
difference(start_time-now): 71    <==========
srun: error: c1: task 0: Terminated
```

I add `-vvvvvvv` to see debug message for srun, and I look through
some code about extra waiting time explanation and have added debug
message for explanation what to do. After compiling and make install,
I have saw that `poll()` is waiting IO(since timeout is -1 and means
an infinite timeout)

```
[root at slurmctl slurm-repo]# git branch -vvvv
* (no branch) 45e8cd46f2 Update META for v21.08.6 release.

[root at slurmctl slurm-repo]# git diff --unified=1
diff --git a/src/common/eio.c b/src/common/eio.c
index f50d01bc9f..ceb425ae31 100644
--- a/src/common/eio.c
+++ b/src/common/eio.c
@@ -313,2 +313,3 @@ int eio_handle_mainloop(eio_handle_t *eio)
                shutdown_time = eio->shutdown_time;
+        debug4("---------------- debug shutdown_time: %d || %d and
then start poll", shutdown_time, eio->shutdown_wait);
                slurm_mutex_unlock(&eio->shutdown_mutex);
@@ -316,2 +317,3 @@ int eio_handle_mainloop(eio_handle_t *eio)
                        goto error;
+        debug4("---------------- debug shutdown_time: %d || %d and
then finish poll", shutdown_time, eio->shutdown_wait);
```

https://github.com/SchedMD/slurm/blob/slurm-20-11-8-1/src/api/step_io.c
... ...
cio->eio = eio_handle_create(slurm_conf.eio_timeout);
... ...

https://github.com/SchedMD/slurm/blob/slurm-20-11-8-1/src/common/eio.c
... ...
/* Get shutdown_time to pass to _poll_internal */
slurm_mutex_lock(&eio->shutdown_mutex);
shutdown_time = eio->shutdown_time;
slurm_mutex_unlock(&eio->shutdown_mutex);
if (_poll_internal(pollfds, nfds, shutdown_time) < 0)
goto error;
... ...

```
[root at slurmctl slurm-repo]# scontrol show config | grep -E -i
"killwait|eiotimeout"
EioTimeout              = 1
KillWait                = 3 sec

[root at slurmctl tmp]# srun -vvvvvvv  --time '0-00:01:00' --nodelist c1
bash test.sh
... ...
srun: debug:  Entering _msg_thr_create()
srun: debug4: eio: handling events for 2 objects
srun: debug3: eio_message_socket_readable: shutdown 0 fd 11
srun: debug3: eio_message_socket_readable: shutdown 0 fd 7
srun: debug4: ---------------- debug shutdown_time: 0 || 1 and then start poll
srun: debug:  initialized stdio listening socket, port 34940
srun: debug:  Started IO server thread (140320727525120)
srun: debug:  Entering _launch_tasks
srun: debug3: IO thread pid = 3164
... ...

srun: debug2: Entering _file_write
sleep 90 second ... start_time: 1672823800 Wed Jan  4 17:16:40 CST 2023
srun: debug3:   wrote 72 bytes
srun: debug2: Leaving  _file_write
... ....
srun: debug3: received timeout message
srun: job time limit to be reached at Wed Jan 04 17:17:40 2023   <===========
srun: debug4: eio: handling events for 1 objects
... ...
srun: debug4: ---------------- debug shutdown_time: 0 || 1 and then start poll
                                                               srun:
debug4: ---------------- debug shutdown_time: 0 || 1 and then finish
poll    <===== waiting for a moment
srun: debug3: eio_message_socket_accept: start
srun: debug2: eio_message_socket_accept: got message connection from
172.17.0.2:33268 16
srun: debug3: received timeout message
srun: debug4: eio: handling events for 1 objects
srun: debug3: eio_message_socket_readable: shutdown 0 fd 4
srun: debug4: ---------------- debug shutdown_time: 0 || 1 and then start poll
srun: debug4: ---------------- debug shutdown_time: 0 || 1 and then finish poll
srun: debug3: eio_message_socket_accept: start
srun: debug2: eio_message_socket_accept: got message connection from
172.17.0.2:39052 16
srun: debug2: received job step timeout message
srun: debug4: eio: handling events for 2 objects
srun: debug3: eio_message_socket_readable: shutdown 0 fd 11
srun: debug3: eio_message_socket_readable: shutdown 0 fd 7
srun: debug4: ---------------- debug shutdown_time: 0 || 1 and then start poll
                                                  srun: debug4:
---------------- debug shutdown_time: 0 || 1 and then finish poll
<===== waiting for a moment
... ...
```

So:
* How to set an non-infinite timeout for _poll_internal()? Support
configuration?
* Or how to avoid extra EIO event time cost for job running time for
--time, there is a bug for --time option?

Thanks!



More information about the slurm-users mailing list