Hello everyone,
I've grepped the manual pages and crawled the 'net, but couldn't find any answer to the following problem:
I can see that slurmctld keeps a record of each job below /var/spool/slurm for as long as the job is running or waiting (and shown by "squeue"), and that this record stores the environment, which contains SLURM_SUBMIT_HOST. But this information seems to be lost when the job finishes.
Is there a way to find out what the value of SLURM_SUBMIT_HOST was? I'd be interested in a few more env variables, but this one should be sufficient for a start...
Is "sacct" just lacking a job field, or is this info indeed dropped and not stored in the DB?
Thanks, Steffen
That looks to be the case from my glance at sacct. Not everything in scontrol show job ends up in sacct, which is a bit frustrating at times.
-Paul Edmon-
Hi Steffen,
not sure if this is what you are looking for, but with `AccountingStoreFlags=job_env` set in slurm.conf, the batch job environment will be stored in the accounting database and can later be retrieved with the `sacct -j <jobid> --env-vars` command.
We find this quite useful for debugging purposes.
Best regards Jürgen
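As a rough sketch of how a single variable could be pulled out of that output: the helper below assumes that `sacct --env-vars` prints the stored environment as one NAME=value entry per line (possibly after some header lines). The function name `get_env_var` is made up for illustration and is not part of Slurm.

```shell
# Hypothetical helper (not part of Slurm): print the value of one variable
# from NAME=value lines read on stdin. Lines that do not start with
# "NAME=" (headers, other variables) are ignored.
get_env_var() {
    # $1: variable name; only the first match is printed
    awk -v name="$1" 'index($0, name "=") == 1 {
        print substr($0, length(name) + 2)
        exit
    }'
}

# Possible use on a cluster where job_env storage is enabled:
#   sacct -j <jobid> --env-vars | get_env_var SLURM_SUBMIT_HOST
```

This only works for jobs submitted after job_env was switched on, since earlier jobs have no stored environment to query.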
-- Steffen Grunewald, Cluster Administrator Max Planck Institute for Gravitational Physics (Albert Einstein Institute) Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
Fon: +49-331-567 7274 Mail: steffen.grunewald(at)aei.mpg.de
-- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-leave@lists.schedmd.com
A warning on that one: it can eat up a ton of database space (depending on the size of the environment, the uniqueness of the environment between jobs, and the number of jobs). We had it on and it nearly ran us out of space on our database host. That said, the data can be really useful depending on the situation.
-Paul Edmon-
Hi Paul,
agreed. Actually, we even have
AccountingStoreFlags=job_comment,job_script,job_env
set in slurm.conf. Although we haven't observed any issues so far in regular operation with about 10000 jobs per day, this clearly did affect database conversion time during major Slurm version updates.
Therefore, as always, one must weigh the pros and cons. There is no such thing as a free lunch ...
Best regards Jürgen
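To get a feeling for the space question before enabling job_env, a back-of-the-envelope estimate might look like the sketch below. It simply multiplies the byte size of one environment by a job count, so it is an upper bound under the assumption that every job stores a full copy; since uniqueness of environments between jobs matters, the real figure may well be lower. Both function names are hypothetical.

```shell
# Hypothetical helpers for a rough upper-bound estimate.

# Size in bytes of the current shell's environment (run this in a
# typical user session or job script to get a representative number).
env_size_bytes() {
    env | wc -c | tr -d ' '
}

# Bytes per day if every job stored its own full copy.
# $1: environment size in bytes, $2: jobs per day
bytes_per_day() {
    echo $(( $1 * $2 ))
}

# Example: a 4096-byte environment at 10000 jobs per day
#   bytes_per_day 4096 10000
```

The estimate ignores database overhead and the job_script and job_comment flags, so treat it as an order-of-magnitude check only.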
Hello Steffen,
On 8/7/24 2:08 PM, Steffen Grunewald via slurm-users wrote:
Is there a way to find out what the value of SLURM_SUBMIT_HOST was? I'd be interested in a few more env variables, but this one should be sufficient for a start...
What you're looking for might be doable simply by setting the AccountingStoreFlags parameter in slurm.conf. [1]
Be aware, though, that job_env has sometimes been reported to grow quite large.
[1] https://slurm.schedmd.com/slurm.conf.html#OPT_AccountingStoreFlags
Best,
I see, I cannot have my cake and eat it too. Given the size of our users' typical env, I'm dropping the idea for now - maybe this will come up again in the not-too-distant future. (Maybe it's worth a feature request?)
Thanks everyone!
- Steffen
If you need it, you could add a step to either the prolog or the epilog that stores the info somewhere.
I do that for the scripts themselves and keep the past two weeks backed up so we can debug if/when there is an issue.
Brian Andrus
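One way such a script backup with a two-week retention could be sketched is shown below. This is not necessarily how Brian does it: the archive directory layout, the file naming and both function names are assumptions, and only `scontrol write batch_script` is an actual Slurm command.

```shell
# Hypothetical helpers for a prolog-style script archive.

# Save the batch script of a job into an archive directory.
# $1: job id, $2: archive directory (path is an assumption)
archive_job_script() {
    mkdir -p "$2" && scontrol write batch_script "$1" "$2/job_$1.sh"
}

# Delete archived scripts older than a number of days (default 14).
# $1: archive directory, $2: retention in days
prune_job_scripts() {
    find "$1" -type f -name 'job_*.sh' -mtime +"${2:-14}" -delete
}

# Possible use from a prolog, followed by a daily cleanup via cron:
#   archive_job_script "$SLURM_JOB_ID" /var/spool/slurm-scripts
#   prune_job_scripts /var/spool/slurm-scripts 14
```

The pruning step is kept separate so it can run once a day rather than on every job start.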
Maybe a somewhat 'hacky' idea - couldn't you put a line in the epilog script that logs the interesting entries (e.g. SLURM_SUBMIT_HOST) to some logfile at job completion? Of course that would only be feasible if the number of jobs completing per unit of time isn't super high; otherwise you'd need to watch out for race conditions in concurrent writes, etc.
- René Sitt
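A minimal sketch of that idea, with the concurrent-write concern handled by an exclusive lock, might look like this. The function name and log format are made up, and whether SLURM_SUBMIT_HOST is actually set depends on which prolog or epilog type runs the script; that is an assumption to check against the Slurm documentation, so the sketch falls back to "unknown" for unset variables.

```shell
# Hypothetical helper: append one line per job to a log file.
# $1: log file path (an assumption; pick a location that suits the site)
log_job_env() {
    logfile="$1"
    # Unset variables are recorded as "unknown" rather than failing.
    line="$(date +%Y-%m-%dT%H:%M:%S) job=${SLURM_JOB_ID:-unknown} submit_host=${SLURM_SUBMIT_HOST:-unknown}"
    if command -v flock >/dev/null 2>&1; then
        # Take an exclusive lock on fd 9 so concurrent writers
        # cannot interleave their lines.
        (
            flock -x 9
            printf '%s\n' "$line" >&9
        ) 9>>"$logfile"
    else
        # Without flock, rely on a single short append.
        printf '%s\n' "$line" >>"$logfile"
    fi
}

# Possible use inside an epilog-type script:
#   log_job_env /var/log/slurm/job_env.log
```

Writing to a node-local file avoids locking across a shared filesystem, at the cost of having to collect the logs from each node afterwards.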
I think this would be a good feature request. At least to me, everything you can get from scontrol show job should be available in sacct in some form.
-Paul Edmon-