The log files use many different strings to identify a job, including some messages with no job ID at all. The closest I can get with grep is:

```
NUMBER=$SLURM_JOBID
egrep ".<$NUMBER>] |<$NUMBER>.batch|jobid <$NUMBER>|JobId=<$NUMBER>|job id <$NUMBER>|job.<$NUMBER>|job <$NUMBER>|jobid [<$NUMBER>]|task_p_slurmd_batch_request: <$NUMBER>" /var/log/slurm*
```
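Wrapped as a reusable helper it looks like this (just a sketch; the function name is my own invention, the pattern list is surely incomplete, and the log glob will differ per site):

```bash
#!/bin/bash
# slurm_job_logs: grep the Slurm daemon logs for lines mentioning one job.
# The pattern list mirrors the egrep above and is illustrative, not complete.
slurm_job_logs() {
    local n="$1"   # job ID to search for
    egrep ".<$n>] |<$n>.batch|jobid <$n>|JobId=<$n>|job id <$n>|job.<$n>|job <$n>|jobid [<$n>]|task_p_slurmd_batch_request: <$n>" \
        /var/log/slurm*
}

slurm_job_logs "$SLURM_JOBID"
```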
Even that misses crucial data that does not even contain the job ID, for example:
```
[2024-02-03T11:50:33.052] _get_user_env: get env for user jsu here
[2024-02-03T11:52:33.152] timeout waiting for /bin/su to complete
[2024-02-03T11:52:34.152] error: Failed to load current user environment variables
[2024-02-03T11:52:34.153] error: _get_user_env: Unable to get user's local environment, running only with passed environment
```
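The only workaround I can think of for those is to pull every log line inside the job's time window, whether or not it names the job. A sketch only, assuming sacct is available, the job has finished, and the log timestamps are ISO-8601 as in the excerpt above (so plain string comparison orders them):

```bash
#!/bin/bash
# Print all daemon log lines that fall inside a job's run window.
job="$1"
# sacct -P emits pipe-delimited fields: Start|End
read -r start end < <(sacct -j "$job" -X -n -P -o Start,End | tr '|' ' ')
# Field 1 of each log line is "[TIMESTAMP]"; compare as strings.
awk -v s="[$start" -v e="[$end" '$1 >= s && $1 <= e' /var/log/slurm*
```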
It would be very useful if all messages related to a job carried a consistent string for grepping the log files; even better would be a command like "scontrol show jobid=NNNN log_messages".
But I could not find what I wanted: an easy way to find all daemon log messages related to a specific job. It would be particularly useful if there were a way to automatically append such information to the job's stdout at termination, so users would automatically get information about job failures or warnings.
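The closest I can imagine is an Epilog script along these lines, rather than a built-in feature. A rough sketch only: it assumes the epilog runs on the node where the slurmd log lives, that StdOut can be parsed out of scontrol, and the bare-job-ID grep will certainly over-match:

```bash
#!/bin/bash
# Epilog sketch: append daemon log lines mentioning this job to its stdout.
# Configured in slurm.conf via Epilog=/etc/slurm/append_job_log.sh;
# SLURM_JOB_ID is set in the epilog environment.
stdout_file=$(scontrol show job "$SLURM_JOB_ID" | awk -F= '/StdOut=/ {print $2}')
if [ -n "$stdout_file" ] && [ -w "$stdout_file" ]; then
    {
        echo "=== daemon log lines mentioning job $SLURM_JOB_ID ==="
        grep -h "$SLURM_JOB_ID" /var/log/slurm* 2>/dev/null
    } >> "$stdout_file"
fi
exit 0
```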
Is there such a feature available that I have missed?