<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>You will probably need to.</p>
<p>The way we handle it is that we add users when they first submit a
job, via the job_submit.lua script. This way the database
autopopulates with active users.</p>
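<p>For a site that would rather pre-populate the database from a
script, a minimal Python sketch of the same step could look like the
following. It only builds (and optionally runs) the sacctmgr call;
the account name default_acct and the helper names are hypothetical
and are not taken from the job_submit.lua setup described above.</p>
<pre>
```python
import subprocess


def build_add_user_cmd(user, account="default_acct"):
    """Return the sacctmgr command that adds a user to an account.

    "default_acct" is a hypothetical account name and must already exist.
    """
    # -i (--immediate) commits the change without asking for confirmation.
    return ["sacctmgr", "-i", "add", "user", "name=" + user, "account=" + account]


def add_users(users, account="default_acct", dry_run=True):
    """Build one command per user and run them only when dry_run is False."""
    cmds = [build_add_user_cmd(user, account) for user in users]
    if not dry_run:
        for cmd in cmds:
            # check=False: a non-zero exit does not raise; inspect the result if needed.
            subprocess.run(cmd, check=False)
    return cmds
```
</pre>
<p>With the default dry_run=True, add_users(["alice", "bob"]) only
returns the command lists, so they can be reviewed before anything
touches the accounting database.</p>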
<p>-Paul Edmon-<br>
</p>
<div class="moz-cite-prefix">On 10/3/23 9:01 AM, Davide DelVento
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAAX1q8aKRtLFSsiRF7fBVEYrdRJFj=1=xB7Ffez_KT0LAQWTng@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div>By increasing the slurmdbd verbosity level, I got
additional information, namely the following:</div>
<div><br class="gmail-Apple-interchange-newline">
slurmdbd: error: couldn't get information for this user
(null)(xxxxxx)<br>
</div>
slurmdbd: debug: accounting_storage/as_mysql:
as_mysql_jobacct_process_get_jobs: User
xxxxxx has no associations, and is not admin, so not returning
any jobs.<br>
<div><br>
</div>
<div>again, where xxxxxx is the POSIX ID of the user running
the query; the messages appear in the slurmdbd logs.<br>
</div>
<div><br>
</div>
<div>I suspect this is because our userbase is
small enough (we are a department HPC) that we don't need to
use allocations and the like, so I have not configured any
associations (and have not even studied their configuration, since
when I was at another place which did use associations,
someone else took care of Slurm administration).</div>
<div><br>
</div>
<div>Anyway, I read the fantastic document by our own member at
<a
href="https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_accounting/#associations"
moz-do-not-send="true" class="moz-txt-link-freetext">https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_accounting/#associations</a>
and in fact I have not even configured slurm users:</div>
<div><br>
</div>
<div># sacctmgr show user<br>
User Def Acct Admin<br>
---------- ---------- ---------<br>
root root Administ+<br>
</div>
<div>#</div>
<div><br>
</div>
<div>So is that the issue? Should I just add all users? Any
suggestions on the minimal (but robust) way to do that?</div>
<div><br>
</div>
<div>Thanks!</div>
<div><br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Oct 2, 2023 at 9:20 AM
Davide DelVento &lt;<a href="mailto:davide.quantum@gmail.com"
moz-do-not-send="true" class="moz-txt-link-freetext">davide.quantum@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Thanks Paul, this helps.
<div><br>
<div>I don't have any PrivateData line in either config
file. According to the docs, "By default, all
information is visible to all users" so this should not
be an issue. I tried to add a line with
"PrivateData=jobs" to the conf files, just in case, but
that didn't change the behavior.</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Oct 2, 2023 at
9:10 AM Paul Edmon &lt;<a
href="mailto:pedmon@cfa.harvard.edu" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">pedmon@cfa.harvard.edu</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>At least in our setup, users can see their own
scripts by doing sacct -B -j JOBID</p>
<p>I would make sure that the scripts are being stored
and check how you have PrivateData set.</p>
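<p>A small Python sketch of that lookup, for anyone who wants to
wrap it in a tool. It assumes sacct is on the PATH; the function
names are hypothetical.</p>
<pre>
```python
import subprocess


def batch_script_cmd(job_id):
    """Return the sacct call that prints the stored batch script of a job."""
    # -B is the short form of --batch-script; -j selects the job.
    return ["sacct", "-B", "-j", str(job_id)]


def fetch_batch_script(job_id):
    """Run sacct and return its output, or None if the command fails."""
    try:
        result = subprocess.run(
            batch_script_cmd(job_id), capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # sacct is missing or exited with an error.
        return None
    return result.stdout
```
</pre>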
<p>-Paul Edmon-<br>
</p>
<div>On 10/2/2023 10:57 AM, Davide DelVento wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">I deployed the job_script archival and
it is working, however it can be queried only by
root.
<div><br>
</div>
<div>A regular user can run sacct -lj against any
job (even those of other users, and that's okay
in our setup) with no problem. However, if they run
sacct -j job_id --batch-script, even against a job
they own themselves, nothing is returned and I get
a</div>
<div>
<div><br>
</div>
<div>slurmdbd: error: couldn't get information for
this user (null)(xxxxxx)</div>
<div><br>
</div>
<div>in the slurmdbd logs, where xxxxxx is the POSIX ID of the
user running the query.</div>
<div><br>
</div>
<div>Neither config file (slurmdbd.conf
or slurm.conf) has any "permission"
setting. FWIW, we use LDAP.</div>
<div><br>
</div>
<div>Is that the expected behavior, in that by
default only root can see the job scripts? I was
assuming the users themselves should be able to
debug their own jobs... Any hint on what could
be changed to achieve this?</div>
<div><br>
</div>
<div>Thanks!<br>
<div><br>
</div>
</div>
<div><br>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Sep 29,
2023 at 5:48 AM Davide DelVento &lt;<a
href="mailto:davide.quantum@gmail.com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">davide.quantum@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Fantastic, this is really helpful,
thanks!</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Thu, Sep
28, 2023 at 12:05 PM Paul Edmon &lt;<a
href="mailto:pedmon@cfa.harvard.edu"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">pedmon@cfa.harvard.edu</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Yes, it was later than that. If you are
on 23.02 you are good. We've been running
with job_script storage turned on for years at
this point, and that part of the database
only uses 8.4G. Our entire database
takes up 29G on disk, so it's about 1/3 of
the database. We also have database
compression, which helps with the on-disk
size. Raw and uncompressed, our database is
about 90G. We keep 6 months of data in
our active database.<br>
</p>
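<p>As a quick check of those figures in Python, assuming the 8.4G
and 29G numbers are both on-disk (compressed) sizes: the job scripts
come to roughly 0.29 of the database, and the compression ratio is
about 3.1.</p>
<pre>
```python
script_gb = 8.4   # job_script portion of the database, on disk
total_gb = 29.0   # whole database, on disk (compressed)
raw_gb = 90.0     # whole database, raw and uncompressed

# Fraction of the on-disk database taken by job scripts.
script_share = script_gb / total_gb
# Raw size divided by on-disk size.
compression_ratio = raw_gb / total_gb

print(round(script_share, 2), round(compression_ratio, 1))
```
</pre>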
<p>-Paul Edmon-<br>
</p>
<div>On 9/28/2023 1:57 PM, Ryan Novosielski
wrote:<br>
</div>
<blockquote type="cite"> Sorry for the
duplicate e-mail in a short time: do you
(or anyone) know when the hashing was
added? We were planning to enable this on
21.08, but we then had to delay our
upgrade to it. I’m assuming the hashing came later than
that, as I believe 21.08 is when the feature itself
was added.
<div><br>
<blockquote type="cite">
<div>On Sep 28, 2023, at 13:55, Ryan
Novosielski <a
href="mailto:novosirj@rutgers.edu"
target="_blank"
moz-do-not-send="true">&lt;novosirj@rutgers.edu&gt;</a>
wrote:</div>
<br>
<div>
<div> Thank you; we’ll put in a
feature request for improvements
in that area, and also thanks for
the warning. I thought of that in
passing, but the real-world
experience is really useful. I
could easily see wanting that
stuff to be retained for a shorter time
than the main records, which is
what I’d ask for.
<div><br>
</div>
<div>I assume that archiving, in
general, would also remove this
stuff, since old jobs themselves
will be removed?</div>
<div><br>
<div>
<div>
<div dir="auto"
style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<div dir="auto"
style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<div dir="auto"
style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<div dir="auto"
style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<div
style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
--<br>
#BlackLivesMatter</div>
<div
style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
____<br>
|| \\UTGERS,
|---------------------------*O*---------------------------<br>
||_// the State<span
style="white-space:pre-wrap"> </span> | Ryan Novosielski - <a
href="mailto:novosirj@rutgers.edu" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">novosirj@rutgers.edu</a><br>
|| \\ University |
Sr. Technologist
- 973/972.0922
(2x0922) ~*~
RBHS Campus<br>
|| \\ of NJ<span
style="white-space:pre-wrap"> </span> | Office of Advanced Research
Computing - MSB
A555B, Newark<br>
`'</div>
</div>
</div>
</div>
</div>
</div>
<div><br>
<blockquote type="cite">
<div>On Sep 28, 2023, at
13:48, Paul Edmon <a
href="mailto:pedmon@cfa.harvard.edu" target="_blank"
moz-do-not-send="true">&lt;pedmon@cfa.harvard.edu&gt;</a>
wrote:</div>
<br>
<div>
<div>Slurm should take
care of it when you
add it.<br>
<br>
As far as horror
stories go, under
previous versions our
database size
ballooned to be so
massive that it
actually prevented us
from upgrading and we
had to drop the
columns containing the
job_script and
job_env. This was
back before slurm
started hashing the
scripts so that it
would only store one
copy of duplicate
scripts. After this
point we found that
the job_script
database stayed at a
fairly reasonable size
as most users use
functionally the same
script each time.
However the job_env
continued to grow like
crazy as there are
variables in our
environment that
change fairly
consistently depending
on where the user is.
Thus job_envs ended up
being too massive to
keep around and so we
had to drop them.
Frankly we never
really used them for
debugging. The
job_scripts though are
super useful and not
that much overhead.<br>
<br>
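<p>A toy Python sketch of why the two columns behave so
differently once duplicates are stored only once. It is an
illustration only: SHA-256 and the sample script and environment
strings are assumptions, not Slurm's actual hashing scheme or data.</p>
<pre>
```python
import hashlib


def stored_bytes(blobs):
    """Bytes kept when each distinct blob is stored once, keyed by its hash."""
    unique = {}
    for blob in blobs:
        data = blob.encode("utf-8")
        key = hashlib.sha256(data).hexdigest()
        unique[key] = len(data)
    return sum(unique.values())


script = "#!/bin/bash\nsrun ./my_app\n"
# 1000 submissions of the same script, each with a slightly different environment.
scripts = [script] * 1000
envs = ["HOME=/home/u\nTMPDIR=/tmp/job_%d\n" % i for i in range(1000)]

script_total = stored_bytes(scripts)  # a single copy is kept
env_total = stored_bytes(envs)        # every environment is distinct, so all are kept
print(script_total, env_total)
```
</pre>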
In summary my
recommendation is to
only store
job_scripts. job_envs
add too much storage
for little gain,
unless your job_envs
are basically the same
for each user in each
location.<br>
<br>
Also it should be
noted that there is no
way to prune out
job_scripts or
job_envs right now. So
the only way to get
rid of them if they
get large is to zero out
the column in the
table. You can ask
SchedMD for the MySQL
command to do this, as
we had to do it here
for our job_envs.<br>
<br>
-Paul Edmon-<br>
<br>
On 9/28/2023 1:40 PM,
Davide DelVento wrote:<br>
<blockquote
type="cite">In my
current slurm
installation,
(recently upgraded
to slurm v23.02.3),
I only have<br>
<br>
AccountingStoreFlags=job_comment<br>
<br>
I now intend to add
both<br>
<br>
AccountingStoreFlags=job_script<br>
AccountingStoreFlags=job_env<br>
<br>
leaving the default
4MB value
for max_script_size<br>
<br>
Do I need to do
anything on the DB
myself, or will
slurm take care of
the additional
tables if needed?<br>
<br>
Any
comments/suggestions/gotcha/pitfalls/horror_stories
to share? I know
about the additional
disk space and
potentially extra load
needed, and with our
resources and
typical workload I
should be okay with
that.<br>
<br>
Thanks!<br>
</blockquote>
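<p>One note on the syntax quoted above: the slurm.conf man page
describes AccountingStoreFlags as a comma-separated list, so the
flags would normally go on one line rather than on separate
AccountingStoreFlags lines. A small Python sketch that builds such a
line (the helper name is hypothetical):</p>
<pre>
```python
def accounting_store_flags_line(*flags):
    """Join flag names into one AccountingStoreFlags line, dropping duplicates."""
    seen = []
    for flag in flags:
        if flag not in seen:
            seen.append(flag)
    return "AccountingStoreFlags=" + ",".join(seen)


line = accounting_store_flags_line("job_comment", "job_script", "job_env")
print(line)
```
</pre>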
<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</body>
</html>