[2020-10-23T14:17:35.519] task_p_slurmd_batch_request: 2547451
[2020-10-23T14:17:35.519] task/affinity: job 2547451 CPU input mask for node: 0x0000000000000003
[2020-10-23T14:17:35.519] task/affinity: job 2547451 CPU final HW mask for node: 0x0000000100000001
[2020-10-23T14:17:35.525] _run_prolog: run job script took usec=6094
[2020-10-23T14:17:35.525] _run_prolog: prolog with lock for job 2547451 ran for 0 seconds
[2020-10-23T14:17:35.565] [2547451.extern] task affinity plugin loaded with CPU mask 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffffffffffff
[2020-10-23T14:17:35.569] [2547451.extern] Munge cryptographic signature plugin loaded
[2020-10-23T14:17:35.594] [2547451.extern] task/cgroup: /slurm/uid_3176402/job_2547451: alloc=65536MB mem.limit=65536MB memsw.limit=65536MB
[2020-10-23T14:17:35.594] [2547451.extern] task/cgroup: /slurm/uid_3176402/job_2547451/step_extern: alloc=65536MB mem.limit=65536MB memsw.limit=65536MB
[2020-10-23T14:17:35.595] Launching batch job 2547451 for UID 3176402
[2020-10-23T14:17:35.611] [2547451.batch] task affinity plugin loaded with CPU mask 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffffffffffff
[2020-10-23T14:17:35.612] [2547451.batch] Munge cryptographic signature plugin loaded
[2020-10-23T14:17:35.621] [2547451.batch] task/cgroup: /slurm/uid_3176402/job_2547451: alloc=65536MB mem.limit=65536MB memsw.limit=65536MB
[2020-10-23T14:17:35.621] [2547451.batch] task/cgroup: /slurm/uid_3176402/job_2547451/step_batch: alloc=65536MB mem.limit=65536MB memsw.limit=65536MB
[2020-10-23T14:17:35.776] [2547451.batch] debug level = 2
[2020-10-23T14:17:35.776] [2547451.batch] starting 1 tasks
[2020-10-23T14:17:35.777] [2547451.batch] task 0 (19889) started 2020-10-23T14:17:35
[2020-10-23T14:17:35.778] [2547451.batch] task_p_pre_launch: Using sched_affinity for tasks
[2020-10-23T14:17:35.822] sbcast req_uid=3176402 job_id=2547451 fname=/data/gpfs/home/shubhamp/resDB/ifconfig.txt block_no=1
[2020-10-23T14:17:40.925] launch task 2547451.0 request from UID:3176402 GID:3176402 HOST:172.19.4.4 PORT:18832
[2020-10-23T14:17:40.925] lllp_distribution jobid [2547451] implicit auto binding: cores, dist 1
[2020-10-23T14:17:40.925] _task_layout_lllp_cyclic
[2020-10-23T14:17:40.925] _lllp_generate_cpu_bind jobid [2547451]: mask_cpu, 0x0000000100000001
[2020-10-23T14:17:40.941] [2547451.0] task affinity plugin loaded with CPU mask 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffffffffffff
[2020-10-23T14:17:40.942] [2547451.0] Munge cryptographic signature plugin loaded
[2020-10-23T14:17:40.951] [2547451.0] task/cgroup: /slurm/uid_3176402/job_2547451: alloc=65536MB mem.limit=65536MB memsw.limit=65536MB
[2020-10-23T14:17:40.952] [2547451.0] task/cgroup: /slurm/uid_3176402/job_2547451/step_0: alloc=65536MB mem.limit=65536MB memsw.limit=65536MB
[2020-10-23T14:17:40.965] [2547451.0] debug level = 2
[2020-10-23T14:17:40.965] [2547451.0] starting 1 tasks
[2020-10-23T14:17:40.965] [2547451.0] task 0 (19934) started 2020-10-23T14:17:40
[2020-10-23T14:17:40.967] [2547451.0] task_p_pre_launch: Using sched_affinity for tasks
[2020-10-23T14:17:42.969] [2547451.0] task 0 (19934) exited with exit code 0.
[2020-10-23T14:17:42.980] [2547451.0] done with job
[2020-10-23T14:17:42.998] launch task 2547451.1 request from UID:3176402 GID:3176402 HOST:172.19.4.4 PORT:22416
[2020-10-23T14:17:42.998] lllp_distribution jobid [2547451] auto binding off: mask_cpu
[2020-10-23T14:17:43.016] [2547451.1] task affinity plugin loaded with CPU mask 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffffffffffff
[2020-10-23T14:17:43.018] [2547451.1] Munge cryptographic signature plugin loaded
[2020-10-23T14:17:43.025] [2547451.1] task/cgroup: /slurm/uid_3176402/job_2547451: alloc=65536MB mem.limit=65536MB memsw.limit=65536MB
[2020-10-23T14:17:43.026] [2547451.1] task/cgroup: /slurm/uid_3176402/job_2547451/step_1: alloc=65536MB mem.limit=65536MB memsw.limit=65536MB
[2020-10-23T14:17:43.038] [2547451.1] debug level = 2
[2020-10-23T14:17:43.038] [2547451.1] starting 1 tasks
[2020-10-23T14:17:43.038] [2547451.1] task 0 (19956) started 2020-10-23T14:17:43
[2020-10-23T14:17:43.039] [2547451.1] task_p_pre_launch: Using sched_affinity for tasks
[2020-10-23T14:30:22.573] [2547451.1] Sent signal 18 to 2547451.1
[2020-10-23T14:30:22.610] [2547451.batch] Sent signal 18 to 2547451.4294967294
[2020-10-23T14:30:22.654] [2547451.extern] Sent signal 18 to 2547451.4294967295
[2020-10-23T14:30:22.694] [2547451.1] error: *** STEP 2547451.1 ON cpu-3 CANCELLED AT 2020-10-23T14:30:22 ***
[2020-10-23T14:30:22.694] [2547451.1] Sent signal 15 to 2547451.1
[2020-10-23T14:30:22.700] [2547451.batch] error: *** JOB 2547451 ON cpu-3 CANCELLED AT 2020-10-23T14:30:22 ***
[2020-10-23T14:30:22.700] [2547451.batch] Sent signal 15 to 2547451.4294967294
[2020-10-23T14:30:22.701] [2547451.batch] task 0 (19889) exited. Killed by signal 15.
[2020-10-23T14:30:22.703] [2547451.extern] Sent signal 15 to 2547451.4294967295
[2020-10-23T14:30:23.787] [2547451.extern] _oom_event_monitor: oom-kill event count: 1
[2020-10-23T14:30:23.791] [2547451.extern] done with job
[2020-10-23T14:30:23.800] [2547451.batch] job 2547451 completed with slurm_rc = 0, job_rc = 15
[2020-10-23T14:30:23.800] [2547451.batch] sending REQUEST_COMPLETE_BATCH_SCRIPT, error:0 status 15
[2020-10-23T14:30:23.802] [2547451.batch] done with job
[2020-10-23T14:30:52.384] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:30:52.392] [2547451.1] task 0 (19956) exited. Killed by signal 9.
[2020-10-23T14:30:54.058] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:30:55.059] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:30:56.060] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:30:57.061] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:30:58.061] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:30:59.062] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:31:00.063] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:31:01.064] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:31:02.065] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:31:03.066] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:31:13.067] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:31:23.068] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:31:33.069] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:31:43.070] [2547451.1] Sent SIGKILL signal to 2547451.1
[2020-10-23T14:31:53.000] [2547451.1] error: *** STEP 2547451.1 STEPD TERMINATED ON cpu-3 AT 2020-10-23T14:31:52 DUE TO JOB NOT ENDING WITH SIGNALS ***
[2020-10-23T14:31:58.000] [2547451.1] done with job