[slurm-users] Slurm 18.08.5 slurmctl error messages

Christopher Benjamin Coffey Chris.Coffey at nau.edu
Thu Jan 31 16:12:06 UTC 2019


Hi All,

This seems to be related to jobs that can't start, in our case because of these pending reasons:

AssocGrpMemRunMinutes and AssocGrpCPURunMinutesLimit

It seems this must be a bug related to the GrpTRESRunMins limit.
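
For anyone who wants to check the same correlation on their own cluster, here is a minimal Python sketch. It assumes the error text matches the log lines quoted below and that `scontrol show job <id>` prints a `Reason=` field as in the quoted output. The helper names (`job_ids_from_log`, `pending_reason`, `tally_reasons`, `scontrol_lookup`) are made up for illustration and are not part of Slurm.

```python
import re
import subprocess
from collections import Counter

# Matches the slurmctld error quoted in this thread and captures the job ID.
ERROR_RE = re.compile(
    r"error: select_nodes: calling _get_req_features\(\) "
    r"for JobId=(\d+) with not NULL job resources"
)
# Matches the Reason= field in `scontrol show job` output.
REASON_RE = re.compile(r"\bReason=(\S+)")


def job_ids_from_log(lines):
    """Return the unique job IDs named in the error, in first-seen order."""
    seen = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def pending_reason(scontrol_text):
    """Extract the Reason= value from scontrol output, or None if absent."""
    match = REASON_RE.search(scontrol_text)
    return match.group(1) if match else None


def scontrol_lookup(job_id):
    """Run `scontrol show job <id>` and return its stdout (needs Slurm)."""
    result = subprocess.run(
        ["scontrol", "show", "job", job_id],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def tally_reasons(job_ids, lookup=scontrol_lookup):
    """Count pending reasons across the given jobs."""
    return Counter(pending_reason(lookup(job_id)) for job_id in job_ids)
```

On the controller host, something like `tally_reasons(job_ids_from_log(open(path)))`, with `path` pointing at your slurmctld log, would show whether the affected jobs all share the run-minutes reasons.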

Best,
Chris

—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
 

On 1/31/19, 8:30 AM, "slurm-users on behalf of Christopher Benjamin Coffey" <slurm-users-bounces at lists.schedmd.com on behalf of Chris.Coffey at nau.edu> wrote:

    Hi, we upgraded to 18.08.5 this morning and are seeing odd errors in the slurmctld logs:
    
    [2019-01-31T08:24:13.684] error: select_nodes: calling _get_req_features() for JobId=16599048 with not NULL job resources
    [2019-01-31T08:24:13.685] error: select_nodes: calling _get_req_features() for JobId=16597556 with not NULL job resources
    [2019-01-31T08:24:13.685] error: select_nodes: calling _get_req_features() for JobId=16597557 with not NULL job resources
    
    Any ideas what this is about? It doesn't make sense to me. This is how job 16597557 looks:
    
    JobId=16597576 JobName=cred10_5_5_ci2_eu3_eu4_ciK_a
       UserId=abc123(3760) GroupId=cluster(3301) MCS_label=N/A
       Priority=123577 Nice=0 Account=afghah QOS=prof1
       JobState=PENDING Reason=AssocGrpMemRunMinutes Dependency=(null)
       Requeue=1 Restarts=1 BatchFlag=1 Reboot=0 ExitCode=0:0
       RunTime=00:00:00 TimeLimit=7-12:00:00 TimeMin=N/A
       SubmitTime=2019-01-30T17:17:53 EligibleTime=2019-01-30T17:19:54
       AccrueTime=2019-01-30T17:19:54
       StartTime=Unknown EndTime=Unknown Deadline=N/A
       PreemptTime=None SuspendTime=None SecsPreSuspend=0
       LastSchedEval=2019-01-31T08:26:00
       Partition=all AllocNode:Sid=wind:43691
       ReqNodeList=(null) ExcNodeList=(null)
       NodeList=(null)
       BatchHost=cn14
       NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
       TRES=cpu=1,mem=18400M,node=1,billing=1
       Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
       MinCPUsNode=1 MinMemoryNode=18400M MinTmpDiskNode=0
       Features=(null) DelayBoot=00:00:00
       OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
       Command=/scratch/abc123/class/credit_V2_ch_12/10_5_5_ci2_eu3_eu4_ciK_a.sh
       WorkDir=/scratch/abc123/class/credit_V2_12/10_5_5_ci2_eu3_eu4_ciK_a/
       StdErr=/scratch/abc123/class/credit_V2_12/10_5_5_ci2_eu3_eu4_ciK_a/output.txt
       StdIn=/dev/null
       StdOut=/scratch/abc123/class/credit_V2_12/10_5_5_ci2_eu3_eu4_ciK_a/output.txt
       Power=
    
    Best,
    Chris
    —
    Christopher Coffey
    High-Performance Computing
    Northern Arizona University
    928-523-1167
     
    
    


