commit
0d8b081ac7
1 changed files with 232 additions and 0 deletions
@ -0,0 +1,232 @@ |
|||||||
|
SLURM ACCOUNTING (sacct) |
||||||
|
======================== |
||||||
|
|
||||||
|
CAVEAT: |
||||||
|
This document was originally developed by referencing SLURM |
||||||
|
18.08.1 used on Turing. |
||||||
|
I also tried to consult the newer version (master branch |
||||||
|
around July 2019). |
||||||
|
Newer versions may introduce additional features, or features
incompatible with this version.
Please take this document with a grain of salt, and always consult the
manual pages, source code, etc. in case of doubt.
||||||
|
|
||||||
|
|
||||||
|
UNDERSTANDING SLURM ACCOUNTING FIELDS |
||||||
|
------------------------------------- |
||||||
|
|
||||||
|
SLURM accounting can produce very many fields. |
||||||
|
|
||||||
|
`JobID`: |
||||||
|
The "cooked" job ID. Please see the discussion below. |
||||||
|
|
||||||
|
`JobIDRaw`: |
||||||
|
The "raw" job ID. Please see the discussion below. |
||||||
|
|
||||||
|
`TimelimitRaw`: |
||||||
|
The raw value of time limit, in minutes. |
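
Because `TimelimitRaw` is expressed in minutes, it is often handy to turn it into a proper duration before doing arithmetic. The following is a minimal sketch, not part of any Slurm tooling: the function name `timelimit_raw_to_timedelta` is hypothetical, and returning `None` for any value that is not a plain integer is an assumption of this example.

```python
from datetime import timedelta
from typing import Optional


def timelimit_raw_to_timedelta(value: str) -> Optional[timedelta]:
    """Convert a TimelimitRaw string (a number of minutes) to a timedelta.

    Assumption of this sketch: anything that is not a plain non-negative
    integer is treated as "no usable limit" and mapped to None.
    """
    value = value.strip()
    if not value.isdigit():
        return None
    return timedelta(minutes=int(value))
```

For example, `timelimit_raw_to_timedelta("90")` yields a duration of one hour and thirty minutes.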
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
### About SLURM Job IDs |
||||||
|
|
||||||
|
SLURM produces one or more records in the accounting database for every job. |
||||||
|
When a user submits a job to SLURM, SLURM assigns that job a unique job number, |
||||||
|
like this: |
||||||
|
|
||||||
|
$ sbatch calculation.job |
||||||
|
Submitted batch job 8918299 |
||||||
|
|
||||||
|
However, internally within SLURM, there can be one or more "job steps" created |
||||||
|
and executed while this job is being launched and executed. |
||||||
|
(Things get even more complicated with the newer "heterogeneous job" feature,
in which various parts of a job can require very different resources.
See [this documentation](https://slurm.schedmd.com/heterogeneous_jobs.html)
for more information.)
||||||
|
|
||||||
|
Several regex patterns have been observed in the JobID field (from Turing
accounting):
||||||
|
|
||||||
|
* `[0-9]+` |
||||||
|
|
||||||
|
* `[0-9]+_[0-9]+` (for job arrays) |
||||||
|
|
||||||
|
* `[0-9]+\.[0-9]+` |
||||||
|
|
||||||
|
* `[0-9]+\.batch` |
||||||
|
|
||||||
|
For all cases, the `JobIDRaw` is the same as `JobID` except in the case of |
||||||
|
`/[0-9]+_[0-9]+/`, where the `JobIDRaw` is a running number `[0-9]+`. |
||||||
|
This is the case where the submitter specifies an array of jobs. |
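
The four patterns observed above can be turned into a small classifier. This is only a sketch: the labels (`job`, `array_task`, `numbered_step`, `batch_step`) and the function name `classify_job_id` are hypothetical names chosen for this example, not Slurm terminology, and any ID that fits none of the four observed patterns is reported as `unknown`.

```python
import re

# The four JobID patterns observed in the Turing accounting data.
# Labels are made up for this sketch; they are not Slurm terms.
_JOBID_PATTERNS = [
    ("job", re.compile(r"[0-9]+")),
    ("array_task", re.compile(r"[0-9]+_[0-9]+")),
    ("numbered_step", re.compile(r"[0-9]+\.[0-9]+")),
    ("batch_step", re.compile(r"[0-9]+\.batch")),
]


def classify_job_id(job_id: str) -> str:
    """Return the label of the first pattern that matches the whole JobID."""
    for label, pattern in _JOBID_PATTERNS:
        if pattern.fullmatch(job_id):
            return label
    return "unknown"
```

Using `fullmatch` keeps the patterns anchored at both ends, so an ID such as `8918299.batch` is never mistaken for a plain job number.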
||||||
|
|
||||||
|
From Slurm's sacct source code (`src/sacct/print.c`) one can find that there
are other patterns too (look for the string `case PRINT_JOBIDRAW:`).
The key function in `print.c` is `print_fields`.
In particular, look at the lengthy `case` statement where it handles
the `PRINT_JOBID` and `PRINT_JOBIDRAW` cases.
||||||
|
|
||||||
|
A job can be of different types: |
||||||
|
|
||||||
|
* `JOB` |
||||||
|
* `JOBSTEP` |
||||||
|
* `JOBCOMP` |
||||||
|
|
||||||
|
A `JOBSTEP` can have several subtypes: |
||||||
|
|
||||||
|
* `SLURM_BATCH_SCRIPT`, in which case JobIDRaw gets the `.batch` suffix.
* `SLURM_EXTERN_CONT`, in which case JobIDRaw gets the `.extern` suffix.
  Apparently, this is meant to indicate "external" job steps
  (see "Job Step: External" below).
* Many others; in these cases, JobIDRaw is printed in the `[0-9]+\.[0-9]+`
  pattern.
* Other types (usually these have index numbers like 0, 1, 2, ...)
||||||
|
|
||||||
|
|
||||||
|
#### Vanilla Job |
||||||
|
|
||||||
|
A "vanilla" job entry corresponds to a single job submitted by a user to SLURM. |
||||||
|
This will not be a job array. |
||||||
|
|
||||||
|
* Characteristics : `JobID ~ /^[0-9]+$/`. |
||||||
|
|
||||||
|
|
||||||
|
#### Array Job |
||||||
|
|
||||||
|
An "array" job entry corresponds to a single job as part of a job |
||||||
|
array submitted by a user to SLURM. |
||||||
|
|
||||||
|
* Characteristics : `JobID ~ /^[0-9]+_[0-9]+$/`. |
||||||
|
|
||||||
|
The Job ID contains two numbers separated by an underscore. |
||||||
|
The number before the underscore refers to the job ID as reported by |
||||||
|
sbatch upon the submission of the job. |
||||||
|
|
||||||
|
NOTE: Newer versions of SLURM allow a textual label instead of a
number to identify a job in an array.
Such text-based job labels (instead of integers) are marked by
square brackets around the job suffix:
||||||
|
|
||||||
|
* Characteristics (textual array label): `JobID ~ /^[0-9]+_\[.*\]+$/`. |
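
To recover the submission-time job number from an array entry, the JobID can be split at the underscore. The sketch below handles both the numeric and the bracketed form; the function name `split_array_job_id` is hypothetical, the bracketed sample used in the test is made up, and the bracket pattern is slightly simplified compared to the regex above (a single closing bracket is assumed).

```python
import re
from typing import Optional, Tuple

# Parent job number, underscore, then either a numeric index or a
# bracketed label (simplified: exactly one closing bracket is assumed).
_ARRAY_ID = re.compile(r"([0-9]+)_([0-9]+|\[.*\])")


def split_array_job_id(job_id: str) -> Optional[Tuple[str, str]]:
    """Split an array JobID into (parent job number, array suffix).

    Returns None when the JobID is not an array entry.
    """
    match = _ARRAY_ID.fullmatch(job_id)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

The first element of the returned pair is the number that `sbatch` reported when the array was submitted.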
||||||
|
|
||||||
|
|
||||||
|
#### Heterogeneous Job
||||||
|
|
||||||
|
A heterogeneous job entry corresponds to a part of a heterogeneous job
submitted by a user to SLURM.
||||||
|
|
||||||
|
* Characteristics : `JobID ~ /^[0-9]+\+[0-9]+$/`. |
||||||
|
|
||||||
|
The Job ID contains two numbers separated by a plus sign.
The number before the plus sign refers to the job ID as reported by
sbatch upon the submission of the job.
||||||
|
|
||||||
|
This will not be a job array. |
||||||
|
|
||||||
|
|
||||||
|
#### Job Step: Batch script |
||||||
|
|
||||||
|
This corresponds to the execution of the batch script (submitted to
sbatch) when more than one CPU core was requested by the job.
||||||
|
|
||||||
|
Characteristics of SLURM_BATCH_SCRIPT accounting records: |
||||||
|
|
||||||
|
* `JobIDRaw =~ /^[0-9]+\.batch$/`
||||||
|
|
||||||
|
* The record does NOT have user ID (field `User`) |
||||||
|
|
||||||
|
* `JobName` is always `batch` |
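
The three characteristics above can be checked together on a single accounting record. This is a sketch under stated assumptions: the record is assumed to be a mapping from sacct field names (`JobIDRaw`, `User`, `JobName`) to strings, a missing user is assumed to show up as an empty string, and the function name `looks_like_batch_step` is hypothetical.

```python
import re
from typing import Mapping

_BATCH_ID = re.compile(r"[0-9]+\.batch")


def looks_like_batch_step(record: Mapping[str, str]) -> bool:
    """Check the three batch-script characteristics on one record.

    Assumes a missing user ID appears as an empty `User` field.
    """
    return (
        _BATCH_ID.fullmatch(record.get("JobIDRaw", "")) is not None
        and record.get("User", "") == ""
        and record.get("JobName", "") == "batch"
    )
```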
||||||
|
|
||||||
|
|
||||||
|
#### Job Step: External |
||||||
|
|
||||||
|
`SLURM_EXTERN_CONT` apparently is a way to account for "external processes".
It is still not 100% obvious what this means, but from reading the
source code, two kinds of activity fall under this category:
||||||
|
|
||||||
|
* Job prologue |
||||||
|
|
||||||
|
* Direct SSH access into an allocated compute node: in this case, the
  `pam_slurm_adopt` module determines which SLURM job (if any) the ssh
  session belongs to and attributes that portion of the computation
  to that job.
||||||
|
|
||||||
|
Some other steps were also observed, whose JobIDRaw has the form `NNNNN.N`.
I suspect these "job steps" come from calls to `srun` within
the batch script, because the job names are indicative: `pw.x`,
`pmi_proxy`, etc.
(Example job: 5947279, Nov 2018.)
||||||
|
|
||||||
|
|
||||||
|
#### Job Step: All the others |
||||||
|
|
||||||
|
These correspond to job steps that were launched by `srun` or another
similar mechanism instead of the job script.
A prime example is the `mpirun` launch, which will record a new job step.
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Job Completion |
||||||
|
|
||||||
|
`JOBCOMP` appears to mark a job completion. |
||||||
|
Not sure if this kind of record appears on Turing accounting; |
||||||
|
that may be only when a specific "job completion" task is specified. |
||||||
|
|
||||||
|
|
||||||
|
#### Questions & (Possible) Answers |
||||||
|
|
||||||
|
* Why is there a separate "NNNNN.batch" record?
  Perhaps it appears when the job is multi-node.
  It appears to me that the ".batch" record accounts for the batch script
  itself (which will run only on node #0 of the allocated resources).
||||||
|
|
||||||
|
|
||||||
|
#### The Takeaway |
||||||
|
|
||||||
|
Why all this complicated explanation?
My original goal was to find the accounting records that cover the
whole-job statistics without getting bogged down by the minute details
of each job.
This is what I found after this exploration:
||||||
|
|
||||||
|
> We only need to include accounting records where the `JobIDRaw` field |
||||||
|
> contains only whole integers (i.e. matching regex `^[0-9]+$`). |
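
The rule above can be applied to pipe-delimited `sacct` output, such as that produced with `sacct --parsable2 --format=JobID,JobIDRaw,JobName,Elapsed`. The following is a minimal sketch, not a definitive implementation: the function name `whole_job_records` is hypothetical, and the `SAMPLE` text is made-up data for illustration, not real accounting output.

```python
import csv
import io
import re
from typing import Dict, List

_WHOLE_JOB = re.compile(r"[0-9]+")

# Made-up sample in `sacct --parsable2` layout (header line, "|" delimiter).
SAMPLE = """JobID|JobIDRaw|JobName|Elapsed
8918299|8918299|calc|00:10:00
8918299.batch|8918299.batch|batch|00:10:00
8918299.0|8918299.0|pw.x|00:09:50
8918300_1|8918301|array|00:05:00
"""


def whole_job_records(parsable_text: str) -> List[Dict[str, str]]:
    """Keep only records whose JobIDRaw is a plain integer."""
    reader = csv.DictReader(
        io.StringIO(parsable_text), delimiter="|", quoting=csv.QUOTE_NONE
    )
    return [
        row for row in reader
        if _WHOLE_JOB.fullmatch(row.get("JobIDRaw") or "")
    ]
```

Applied to `SAMPLE`, this keeps the vanilla job and the array task, and drops the `.batch` and numbered step records.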
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
## References |
||||||
|
|
||||||
|
- `sacct` manual page: |
||||||
|
<https://slurm.schedmd.com/sacct.html> |
||||||
|
|
||||||
|
SLURM administrator's documentation contains helpful bits and pieces |
||||||
|
to decipher the accounting records; unfortunately in themselves they |
||||||
|
are not sufficient. |
||||||
|
|
||||||
|
- Accounting: |
||||||
|
<https://slurm.schedmd.com/accounting.html> . |
||||||
|
|
||||||
|
- Job Launch design guide: |
||||||
|
<https://slurm.schedmd.com/job_launch.html> . |
||||||
|
|
||||||
|
> This guide describes at a high level the processes which occur in |
||||||
|
> order to initiate a job including the daemons and plugins involved |
||||||
|
> in the process. It describes the process of job allocation, step |
||||||
|
> allocation, task launch and job termination. |
||||||
|
|
||||||
|
In SLURM, launching a job is a multistep process.
Various "job steps" described in this guide eventually make their own
entries in the SLURM accounting database.
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
### Working Notes |
||||||
|
|
||||||
|
These are my private working notes: |
||||||
|
|
||||||
|
- daily-notes/2019/20190326.slurm-acct.txt |
||||||
|
- daily-notes/2019/20190411.slurm-acct.txt |
||||||
|
- daily-notes/2019/20190430.slurm-acct-201811.txt |
||||||
|
- docs/kb/turing-slurm/20180106.SLURM-accounting.txt |
||||||
|
|