SLURM ACCOUNTING (sacct)
========================

Created: April 2019<br>
Updated: November 2019

**CAVEAT:**
This document was originally developed by referencing SLURM
18.08.1 used on Turing.
I also tried to consult the newer version (master branch
around July 2019).
Newer versions may introduce additional features, or features
incompatible with this version.
Please take this document with a grain of salt, and always consult the
manual pages, source code, etc. in case of doubt.

*Update 2019-11-06*:
The SLURM man page now contains descriptions of the accounting fields.
Please see
<https://slurm.schedmd.com/sacct.html#lbAF>.


UNDERSTANDING SLURM ACCOUNTING FIELDS
-------------------------------------

SLURM accounting can produce a large number of fields.

`JobID`:
The "cooked" job ID. Please see the discussion below.

`JobIDRaw`:
The "raw" job ID.
In the vast majority of cases, the `JobIDRaw` field is identical to `JobID`
except in the case of array jobs.
Please see the discussion below.

`TimelimitRaw`:
The raw value of the time limit, in minutes.

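
As a rough illustration of how these fields might be pulled out for inspection, the sketch below parses pipe-delimited text of the kind that `sacct --parsable2 --format=JobID,JobIDRaw,User,TimelimitRaw` prints. The sample rows and the helper names `parse_sacct` and `time_limit` are invented for this example; real output will differ from site to site.

```python
import csv
import io
from datetime import timedelta

# Hypothetical sample, shaped like the output of
#   sacct --parsable2 --format=JobID,JobIDRaw,User,TimelimitRaw
# The values are invented for illustration only.
SAMPLE = """\
JobID|JobIDRaw|User|TimelimitRaw
8918299|8918299|alice|60
8918299.batch|8918299.batch||
8918300_1|8918301|bob|1440
"""


def parse_sacct(text):
    """Turn pipe-delimited sacct text into a list of dicts keyed by field name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    return list(reader)


def time_limit(record):
    """Convert TimelimitRaw (minutes) to a timedelta.

    Returns None when the field is blank or not a plain number.
    """
    raw = record["TimelimitRaw"]
    return timedelta(minutes=int(raw)) if raw.isdigit() else None


records = parse_sacct(SAMPLE)
for rec in records:
    print(rec["JobID"], rec["JobIDRaw"], rec["User"] or "-", time_limit(rec))
```

Keeping every field as a string until it is needed avoids guessing at values that are not plain numbers.
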
### About SLURM Job IDs

SLURM produces one or more records in the accounting database for every job.
When a user submits a job to SLURM, SLURM assigns that job a unique job number,
like this:

    $ sbatch calculation.job
    Submitted batch job 8918299

However, internally within SLURM, there can be one or more "job steps" created
and executed while this job is being launched and executed.
(Things get even more complicated with the newer "heterogeneous job" feature,
in which various parts of a job can require very different resources.
See [this documentation](https://slurm.schedmd.com/heterogeneous_jobs.html)
for more information.)
The combination of all the job steps constitutes the entire job.
Each job step generates its own record in the SLURM accounting database.

#### Summary on Job ID

A single SLURM job will generate a "master record" which logs the
overall execution of the job.
In addition, there can be zero or more extra records generated by the
"job steps" triggered during the course of that job.
The master record includes the resource usage (CPU,
memory, etc.) of the child "job steps".
The master job record is characterized by a plain number in the
`JobIDRaw` field.
Further, the `User` field must not be empty.

The rest of this section goes into greater detail about the various
kinds of `JobID`.

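
The master/child relationship described in this summary can be sketched as a grouping step. This is only an illustration under the assumptions stated in the summary (a master record has a plain-integer `JobIDRaw` and a non-empty `User`); the function name `group_by_job` and the sample records are hypothetical.

```python
import re
from collections import defaultdict

# The leading integer of JobIDRaw identifies the job a record belongs to.
_PARENT = re.compile(r"^([0-9]+)")
_PLAIN_INTEGER = re.compile(r"[0-9]+")


def group_by_job(records):
    """Group (JobIDRaw, User) pairs by job number.

    Returns {job number: {"master": pair or None, "steps": [pairs]}}.
    """
    jobs = defaultdict(lambda: {"master": None, "steps": []})
    for job_id_raw, user in records:
        match = _PARENT.match(job_id_raw)
        if match is None:
            continue  # not a recognizable job ID; skip it
        entry = jobs[match.group(1)]
        if _PLAIN_INTEGER.fullmatch(job_id_raw) and user:
            entry["master"] = (job_id_raw, user)
        else:
            # Job steps, and plain-integer records without a user, land here.
            entry["steps"].append((job_id_raw, user))
    return dict(jobs)


# Invented records for illustration.
sample = [
    ("8918299", "alice"),
    ("8918299.batch", ""),
    ("8918299.0", ""),
    ("8918300", "bob"),
]
grouped = group_by_job(sample)
print(grouped["8918299"]["master"], len(grouped["8918299"]["steps"]))
```
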
#### Observed Job ID Patterns

Several regex patterns have been observed in the `JobID` field (from Turing
accounting):

* `[0-9]+`

* `[0-9]+_[0-9]+` (for job arrays)

* `[0-9]+\.[0-9]+`

* `[0-9]+\.batch`

In all cases, the `JobIDRaw` is the same as `JobID` except in the case of
`/[0-9]+_[0-9]+/`, where the `JobIDRaw` is a running number `[0-9]+`.
This is the case where the submitter specifies an array of jobs.

From SLURM's sacct source code (`src/sacct/print.c`) one can find that there
are other patterns too (look for the string `case PRINT_JOBIDRAW:`).
The key function in `print.c` is `print_fields`.
In particular, look at the lengthy `case` statement where it tackles the
`PRINT_JOBID` and `PRINT_JOBIDRAW` cases.

A job record can be of different types:

* `JOB`
* `JOBSTEP`
* `JOBCOMP`

A `JOBSTEP` can have several subtypes:

* `SLURM_BATCH_SCRIPT`, in which case `JobIDRaw` gets the `.batch` suffix.
* `SLURM_EXTERN_CONT`, in which case `JobIDRaw` gets the `.extern` suffix.
  Apparently, this is meant to indicate an "external" type of job step,
  described further below.
* Many others; in these cases, `JobIDRaw` is printed in the `[0-9]+\.[0-9]+`
  pattern (the step usually has an index number like 0, 1, 2, ...).

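
The patterns listed above can be turned into a small classifier. The sketch below is a minimal illustration, not a reimplementation of what `print.c` does: the labels and the function name `classify` are made up here, the heterogeneous `N+M` form is taken from a later section of this document, and combined forms (for example a step of an array task) are deliberately not handled.

```python
import re

# Patterns from this document; the labels are illustrative only.
# The patterns are mutually exclusive, so their order does not matter.
PATTERNS = [
    ("job", re.compile(r"[0-9]+")),
    ("array task", re.compile(r"[0-9]+_[0-9]+")),
    ("heterogeneous component", re.compile(r"[0-9]+\+[0-9]+")),
    ("batch step", re.compile(r"[0-9]+\.batch")),
    ("extern step", re.compile(r"[0-9]+\.extern")),
    ("numbered step", re.compile(r"[0-9]+\.[0-9]+")),
]


def classify(job_id):
    """Return a label for a JobID/JobIDRaw string, or "unknown"."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(job_id):
            return label
    return "unknown"


for example in ["8918299", "8918299_3", "8918299.batch", "8918299.0"]:
    print(example, "->", classify(example))
```
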
#### Vanilla Job

A "vanilla" job entry corresponds to a single job submitted by a user to SLURM.
This will not be a job array.

* Regexp match: `JobID ~ /^[0-9]+$/`.

From my observation, only simple single-core jobs that do not involve
MPI or other fancy stuff (no job array, for example) avoid
generating extra "child records" for job steps in the SLURM accounting
database.

However, several job records with this type of `JobID` have no `User` field set.
These are not vanilla jobs either.

#### Array Job

An "array" job entry corresponds to a single job that is part of a job
array submitted by a user to SLURM.

* Regexp match: `JobID ~ /^[0-9]+_[0-9]+$/`.

The Job ID contains two numbers separated by an underscore.
The number before the underscore refers to the job ID as reported by
sbatch upon the submission of the job.

NOTE: Newer versions of SLURM can print a text string instead of a single
number after the underscore.
Such a text-based suffix (in the source code this appears to be a task ID
expression, such as a range of array tasks, rather than an arbitrary label)
is enclosed in square brackets:

* Regexp match (textual array suffix): `JobID ~ /^[0-9]+_\[.*\]$/`.

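
The two array forms above can be split into their parts with a single regular expression. This is a hedged sketch: the function name `split_array_job_id` is hypothetical, and the bracketed example `8918300_[1-10]` is an assumed illustration of the textual form, not a value taken from the Turing accounting data.

```python
import re

# Matches "N_M" (numeric task) or "N_[text]" (bracketed textual suffix).
_ARRAY = re.compile(r"^([0-9]+)_(?:([0-9]+)|\[(.*)\])$")


def split_array_job_id(job_id):
    """Return (array_job_number, task), or None if job_id is not an array form.

    `task` is an int for the "N_M" form and a str for the bracketed form.
    """
    match = _ARRAY.match(job_id)
    if match is None:
        return None
    base, index, bracketed = match.groups()
    task = int(index) if index is not None else bracketed
    return int(base), task


print(split_array_job_id("8918300_7"))
print(split_array_job_id("8918300_[1-10]"))
print(split_array_job_id("8918300"))
```
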
#### Heterogeneous Job

A heterogeneous job entry corresponds to one part of a heterogeneous job
submitted by a user to SLURM.

* Regexp match: `JobID ~ /^[0-9]+\+[0-9]+$/`.

The Job ID contains two numbers separated by a plus sign.
The number before the plus sign refers to the job ID as reported by
sbatch upon the submission of the job.

This will not be a job array.

#### Job Step: Batch script

This corresponds to the execution of the batch script (submitted to
sbatch) when more than one CPU core was requested by the job.

Characteristics of `SLURM_BATCH_SCRIPT` accounting records:

* Regexp match: `JobIDRaw ~ /^[0-9]+\.batch$/`

* The record does NOT have a user ID (field `User`)

* `JobName` is always `batch`

#### Job Step: External

`SLURM_EXTERN_CONT` apparently is a way to account for "external processes".
It is still not 100% obvious what this means, but from reading the
source code, two kinds of activity fall under this
category:

* Job prologue

* Direct SSH access into an allocated compute node: in this case, the
  `pam_slurm_adopt` module determines which
  SLURM job launched the ssh session (if any) and attributes that portion of
  the computation to the calling job.

There were some other steps observed, whose `JobIDRaw` has the form `NNNNN.N`.
I wonder if these "job steps" are due to calls to `srun` within
the batch script, because the job names are indicative: `pw.x`,
`pmi_proxy`, etc.
(Example job: 5947279, Nov 2018.)

#### Job Step: All the others

These correspond to job steps that were launched by `srun` or another
similar mechanism instead of by the job script itself.
A prime example is the `mpirun` launch, which will record a new job step.

#### Job Completion

`JOBCOMP` appears to mark a job completion.
I am not sure whether this kind of record appears in the Turing accounting;
it may appear only when a specific "job completion" task is specified.

#### Questions & (Possible) Answers

* Why is there a separate "NNNNN.batch" record?
  Perhaps this record is made when the job is multi-node.
  It appears to me that the ".batch" record is for accounting the batch script
  itself (which will run only on node #0 of the allocated resources).

#### The Takeaway

Why all this complicated explanation?
My original goal was to find the accounting records that cover the
whole-job statistics without getting bogged down in the minute details
of each job.
This is what I found after this exploration:

> We only need to include accounting records where the `JobIDRaw` field
> contains only a whole integer (i.e. matching the regex `^[0-9]+$`).
> Further, the `User` field must not be empty.

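
The rule above, together with the non-empty `User` condition from the summary section, can be expressed as a simple filter. This is a minimal sketch with invented sample records; the name `is_whole_job_record` is hypothetical.

```python
import re

# fullmatch with [0-9]+ has the same meaning as the regex ^[0-9]+$ in the text.
_WHOLE_INTEGER = re.compile(r"[0-9]+")


def is_whole_job_record(record):
    """True if the record looks like a whole-job ("master") record."""
    has_plain_id = _WHOLE_INTEGER.fullmatch(record["JobIDRaw"]) is not None
    return has_plain_id and bool(record["User"])


# Invented records for illustration.
sample = [
    {"JobIDRaw": "8918299", "User": "alice"},
    {"JobIDRaw": "8918299.batch", "User": ""},
    {"JobIDRaw": "8918299.0", "User": ""},
    {"JobIDRaw": "8918301", "User": ""},
    {"JobIDRaw": "8918302", "User": "bob"},
]

whole_jobs = [rec for rec in sample if is_whole_job_record(rec)]
print([rec["JobIDRaw"] for rec in whole_jobs])
```
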
## References

- `sacct` manual page:
  <https://slurm.schedmd.com/sacct.html>

SLURM administrator's documentation contains helpful bits and pieces
for deciphering the accounting records; unfortunately, in themselves they
are not sufficient.

- Accounting:
  <https://slurm.schedmd.com/accounting.html>

- Job Launch design guide:
  <https://slurm.schedmd.com/job_launch.html>

  > This guide describes at a high level the processes which occur in
  > order to initiate a job including the daemons and plugins involved
  > in the process. It describes the process of job allocation, step
  > allocation, task launch and job termination.

  In SLURM, launching a job is a multistep process.
  The various "job steps" described in this guide eventually make their own
  entries in the SLURM accounting database.

### Working Notes

These are my private working notes:

- daily-notes/2019/20190326.slurm-acct.txt
- daily-notes/2019/20190411.slurm-acct.txt
- daily-notes/2019/20190430.slurm-acct-201811.txt
- docs/kb/turing-slurm/20180106.SLURM-accounting.txt