Hi Davide,
Did you already check out what the slurmacct script can do for you? See
https://github.com/OleHolmNielsen/Slurm_tools/blob/master/slurmacct/slurmacct
What you're asking for seems like a fairly heavy task in terms of system
resources and Slurm database requests. Surely you don't intend this to run
every time a user starts a login shell? Some users might run "bash -l"
inside jobs to emulate a login session, causing a heavy load on your servers.
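
One way to limit that load is sketched below in Python. This is only a
minimal illustration, not part of slurmacct or any other tool mentioned in
this thread. It assumes the login hook skips shells where SLURM_JOB_ID is
set (Slurm sets that variable inside jobs) and reuses a per-user cache file
instead of querying the database on every login. The function names, the
cache path and the "generate" callable are hypothetical placeholders.

```python
import time
from pathlib import Path


def should_print_stats(env, is_interactive):
    """Decide whether a login hook should print usage stats at all.

    env is a mapping such as os.environ. SLURM_JOB_ID is set by Slurm
    inside jobs, so a "bash -l" started within a job is skipped.
    """
    if "SLURM_JOB_ID" in env:
        return False
    return is_interactive


def cached_report(cache_file, max_age_seconds, generate, now=None):
    """Return the cached report if it is fresh enough, else regenerate it.

    generate is a hypothetical callable that returns the report text,
    for example by running sreport or sacct once and formatting the result.
    """
    now = time.time() if now is None else now
    try:
        if now - cache_file.stat().st_mtime < max_age_seconds:
            return cache_file.read_text()
    except FileNotFoundError:
        pass  # no cache yet, fall through and build one
    report = generate()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(report)
    return report
```

A profile script could call should_print_stats(os.environ, sys.stdin.isatty())
and, if it returns True, print cached_report(Path.home() / ".cache" /
"slurm_stats.txt", 3600, generate). With a one-hour lifetime the database is
queried at most once per user per hour, however many login shells are opened.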
/Ole
On 8/21/24 01:13, Davide DelVento via slurm-users wrote:
> Thanks Kevin and Simon,
>
> The full thing that you do is indeed overkill, however I was able to learn
> how to collect/parse some of the information I need.
>
> What I am still unable to get is:
>
> - utilization by queue (or list of node names), to track actual use of
> expensive resources such as GPUs, high memory nodes, etc
> - statistics about wait-in-queue for jobs, due to unavailable resources
>
> ideally both in an sreport-like format, by user and for the overall system
>
> I suspect this information is available in sacct, but needs some
> massaging/consolidation to become useful for what I am looking for.
> Perhaps either (or both) of your scripts already do that in some place
> that I did not find? That would be terrific, and I'd appreciate it if you
> can point me to its place.
>
> Thanks again!
>
> On Tue, Aug 20, 2024 at 9:09 AM Kevin Broch via slurm-users
> <slurm-users@lists.schedmd.com> wrote:
>
> Heavyweight solution (although a little less so if you already have
> Grafana and Prometheus running):
> https://github.com/rivosinc/prometheus-slurm-exporter
>
> On Tue, Aug 20, 2024 at 12:40 AM Simon Andrews via slurm-users
> <slurm-users@lists.schedmd.com> wrote:
>
> Possibly a bit more elaborate than you want, but I wrote a web-based
> monitoring system for our cluster. It mostly uses standard
> Slurm commands for job monitoring, but I've also added storage
> monitoring, which requires a separate cron job to run every night.
> It was written for our cluster, but it probably wouldn't take much
> work to adapt it to another cluster with a similar structure.
>
> You can see the code and some screenshots at:
>
> https://github.com/s-andrews/capstone_monitor
>
> ...and there's a video walkthrough at:
>
> https://vimeo.com/982985174
>
> We've also got friendlier scripts for monitoring current and
> past jobs on the command line. These are in a private repository
> because some of the other information there is more sensitive, but I'm
> happy to share those scripts. You can see the scripts being used
> in https://vimeo.com/982986202
>
> Simon.
>
> -----Original Message-----
> From: Paul Edmon via slurm-users <slurm-users@lists.schedmd.com>
> Sent: 09 August 2024 16:12
> To: slurm-users@lists.schedmd.com
> Subject: [slurm-users] Print Slurm Stats on Login
>
> We are working to make our users more aware of their usage. One of
> the ideas we came up with was to have some basic usage stats
> printed at login (usage over past day, fairshare, job efficiency,
> etc). Does anyone have any scripts or methods that they use to do
> this? Before baking my own I was curious what other sites do and
> if they would be willing to share their scripts and methodology.
>
> -Paul Edmon-
>
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-leave@lists.schedmd.com