Quick-and-dirty knowledge base for ODU RCS.

SLURM ACCOUNTING (sacct)

Created: April 2019
Updated: November 2019

CAVEAT: This document was originally developed with reference to SLURM 18.08.1, the version used on Turing. I also tried to consult a newer version (the master branch around July 2019). Newer versions may introduce additional features, or features incompatible with this version. Please take this document with a grain of salt, and always consult the manual pages, source code, etc. in case of doubt.

Update 2019-11-06: The SLURM man page now contains a description of the accounting fields. Please look at https://slurm.schedmd.com/sacct.html#lbAF .

UNDERSTANDING SLURM ACCOUNTING FIELDS

SLURM accounting can report a large number of fields. The ones that matter for this discussion are:

JobID: The "cooked" job ID. Please see the discussion below.

JobIDRaw: The "raw" job ID. In a vast majority of cases, the JobIDRaw field is identical to JobID except in the case of array jobs. Please see the discussion below.

TimelimitRaw: The raw value of time limit, in minutes.
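As a minimal sketch of how these fields might be pulled into a script: sacct can print pipe-separated text with `sacct --parsable2 --format=JobID,JobIDRaw,User,TimelimitRaw`, and that text can be read with Python's csv module. The sample text below is invented for illustration (it is not real Turing output), and the helper name `parse_sacct` is hypothetical.

```python
import csv
import io

# Invented sample of what
#   sacct --parsable2 --format=JobID,JobIDRaw,User,TimelimitRaw
# prints: pipe-separated columns, with the field names on the first line.
SAMPLE = """\
JobID|JobIDRaw|User|TimelimitRaw
1000001|1000001|alice|60
1000001.batch|1000001.batch||
1000002_3|1000005|bob|1440
"""

def parse_sacct(text):
    """Return one dict per accounting record, keyed by the header names."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    return [dict(row) for row in reader]

records = parse_sacct(SAMPLE)
for rec in records:
    # TimelimitRaw is in minutes; it is left empty for the step record in this sample.
    minutes = int(rec["TimelimitRaw"]) if rec["TimelimitRaw"] else None
    print(rec["JobID"], rec["JobIDRaw"], rec["User"] or "-", minutes)
```

To feed it real data, the text would come from running sacct (for example through subprocess) instead of from the SAMPLE string.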

About SLURM Job IDs

SLURM produces one or more records in the accounting database for every job. When a user submits a job to SLURM, SLURM assigns that job a unique job number, like this:

$ sbatch calculation.job
Submitted batch job 8918299

However, internally within SLURM, there can be one or more "job steps" created and executed while this job is being launched and executed. (Things get even more complicated with the newer "heterogeneous job" feature, in which various parts of a job can require very different resources. See the SLURM documentation on heterogeneous jobs for more information.) The combination of all the job steps constitutes the entire job. Each job step generates its own record in the SLURM accounting database.

Summary on Job ID

A single SLURM job will generate a "master record" which logs the overall execution of the job. In addition, there can be zero or more extra records generated by the "job steps" triggered during the course of that job. The master record includes the resource usage (CPU, memory, etc.) of the child "job steps". The master job record is characterized by a plain number in the JobIDRaw field. Further, the User field must not be empty.
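The relation between a master record and its job-step records can be sketched as follows. This is only an illustration under my own assumptions: the JobIDRaw values are invented, the helper name `group_by_job` is hypothetical, and the regex only understands a leading integer optionally followed by a ".suffix".

```python
import re
from collections import defaultdict

# Invented JobIDRaw values: one master record with three job steps, then a lone job.
job_id_raws = ["1000001", "1000001.batch", "1000001.0", "1000001.1", "1000007"]

# Leading integer = master job number; optional ".suffix" = job step.
JOBIDRAW_RE = re.compile(r"^([0-9]+)(?:\.(.+))?$")

def group_by_job(ids):
    """Map each master job number to the list of step suffixes seen for it."""
    groups = defaultdict(list)
    for raw in ids:
        m = JOBIDRAW_RE.match(raw)
        if m is None:
            continue  # a shape this sketch does not understand
        job, step = m.groups()
        steps = groups[job]  # creates the entry even if the job has zero steps
        if step is not None:
            steps.append(step)
    return dict(groups)

grouped = group_by_job(job_id_raws)
print(grouped)
```

A job with an empty step list here corresponds to the "zero extra records" case described above.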

The rest of this section goes into greater detail on the various JobID patterns.

Observed Job ID Patterns

Several regex patterns have been observed in the JobID field (from Turing accounting):

  • [0-9]+

  • [0-9]+_[0-9]+ (for job arrays)

  • [0-9]+\.[0-9]+

  • [0-9]+\.batch

In all cases, JobIDRaw is the same as JobID, except for /[0-9]+_[0-9]+/, where JobIDRaw is a plain running number [0-9]+. This is the case where the submitter specifies an array of jobs.
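The four observed patterns can be turned into a small classifier, sketched below. The regexes are the ones listed above with anchors added; the labels ("job", "array task", and so on) and the function name `classify` are my own, hypothetical names.

```python
import re

# The four JobID shapes observed on Turing, checked in order.
# The labels on the left are made up for this sketch.
OBSERVED = [
    ("job", re.compile(r"^[0-9]+$")),
    ("array task", re.compile(r"^[0-9]+_[0-9]+$")),
    ("numbered step", re.compile(r"^[0-9]+\.[0-9]+$")),
    ("batch step", re.compile(r"^[0-9]+\.batch$")),
]

def classify(job_id):
    """Return the label of the first pattern matching job_id, else 'unknown'."""
    for label, pattern in OBSERVED:
        if pattern.match(job_id):
            return label
    return "unknown"

for sample in ["8918299", "8918299_4", "8918299.0", "8918299.batch"]:
    print(sample, "->", classify(sample))
```

Anything that falls through to "unknown" is a hint that the accounting data contains a pattern not covered by this list.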

From the sacct source code in SLURM (src/sacct/print.c), one can find that there are other patterns too (look for the string "case PRINT_JOBIDRAW:"). The key function in print.c is print_fields. In particular, look at the lengthy switch statement where it handles the PRINT_JOBID and PRINT_JOBIDRAW cases.

A job can be of different types:

  • JOB
  • JOBSTEP
  • JOBCOMP

A JOBSTEP can have several subtypes:

  • SLURM_BATCH_SCRIPT, in which case JobIDRaw gets the .batch suffix.
  • SLURM_EXTERN_CONT, in which case JobIDRaw gets the .extern suffix. Apparently, this is meant to indicate an "external" type of job step, described further below.
  • Many others, in which case JobIDRaw is printed in the [0-9]+\.[0-9]+ pattern (the step usually has an index number like 0, 1, 2, ...).

Vanilla Job

A "vanilla" job entry corresponds to a single job submitted by a user to SLURM. This will not be a job array.

  • Regexp match : JobID ~ /^[0-9]+$/.

From my observation, extra "child records" for job steps are absent from the SLURM accounting database only for simple single-core jobs that do not involve MPI or other fancy stuff (no job array, for example).

However, several job records with this type of JobID have no "User" field set. Those are not vanilla jobs either.

Array Job

An "array" job entry corresponds to a single job as part of a job array submitted by a user to SLURM.

  • Regexp match : JobID ~ /^[0-9]+_[0-9]+$/.

The Job ID contains two numbers separated by an underscore. The number before the underscore refers to the job ID as reported by sbatch upon the submission of the job.

NOTE: Newer versions of SLURM allow a textual label instead of a number to identify one job in an array. Such text-based job labels are marked by square brackets around the job suffix:

  • Characteristics (textual array label): JobID ~ /^[0-9]+_\[.*\]+$/.
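A minimal sketch of taking an array JobID apart, covering both the numeric suffix and the bracketed suffix described above. The function name `split_array_id` is hypothetical, the sample IDs are invented, and the bracketed sample is only my guess at what such a label might look like.

```python
import re

# parent job number, underscore, then either digits or [anything in brackets]
ARRAY_RE = re.compile(r"^([0-9]+)_(?:([0-9]+)|\[(.*)\])$")

def split_array_id(job_id):
    """Return (parent_job, task) for an array JobID, or None if it is not one."""
    m = ARRAY_RE.match(job_id)
    if m is None:
        return None
    parent, index, label = m.groups()
    # Numeric suffix -> int; bracketed suffix -> the text inside the brackets.
    task = int(index) if index is not None else label
    return int(parent), task

print(split_array_id("1000002_3"))
print(split_array_id("1000002_[4-10]"))
print(split_array_id("1000002"))
```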

Heterogeneous Job

A heterogeneous job entry corresponds to a part of a heterogeneous job submitted by a user to SLURM.

  • Regexp match: JobID ~ /^[0-9]+\+[0-9]+$/.

The Job ID contains two numbers separated by a plus sign. The number before the plus sign refers to the job ID as reported by sbatch upon the submission of the job.

This will not be a job array.

Job Step: Batch script

This corresponds to the execution of the batch script (submitted to sbatch) when more than one CPU core was requested by the job.

Characteristics of SLURM_BATCH_SCRIPT accounting records:

  • Regexp match: JobIDRaw ~ /^[0-9]+\.batch$/

  • The record does NOT have a user ID (the User field is empty)

  • JobName is always batch
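The three characteristics above can be combined into one test, sketched here. The record is assumed to be a dict keyed by sacct field names (JobIDRaw, User, JobName); that layout and the function name `is_batch_step` are my own choices for illustration.

```python
import re

BATCH_RE = re.compile(r"^[0-9]+\.batch$")

def is_batch_step(record):
    """True if the record looks like a SLURM_BATCH_SCRIPT job step."""
    return (
        BATCH_RE.match(record["JobIDRaw"]) is not None
        and record["User"] == ""          # no user ID on the step record
        and record["JobName"] == "batch"  # JobName is always "batch"
    )

# Invented records for illustration.
step = {"JobIDRaw": "1000001.batch", "User": "", "JobName": "batch"}
master = {"JobIDRaw": "1000001", "User": "alice", "JobName": "calculation"}
print(is_batch_step(step), is_batch_step(master))
```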

Job Step: External

SLURM_EXTERN_CONT apparently is a way to account for "external processes". It is still not 100% obvious what this means, but from reading the source code, two kinds of activity fall under this category:

  • Job prologue

  • Direct SSH access into an allocated compute node: in this case, the pam_slurm_adopt module determines which SLURM job (if any) the ssh session belongs to and attributes that portion of the computation to that job.

There were some other steps observed, whose JobIDRaw takes the form NNNNN.N. I wonder if these "job steps" are due to calls to "srun" within the batch script, because the job names are indicative: pw.x, pmi_proxy, etc. (Example job: 5947279, Nov 2018.)

Job Step: All the others

These correspond to job steps that were launched by srun or another similar mechanism instead of by the job script itself. A prime example is the mpirun launch, which will record a new job step.

Job Completion

JOBCOMP appears to mark a job completion. I am not sure whether this kind of record appears in Turing accounting; it may appear only when a specific "job completion" task is specified.

Questions & (Possible) Answers

  • Why is there a separate "NNNNN.batch" record? Perhaps this record is made when the job is multi-node. It appears to me that the ".batch" record accounts for the batch script itself (which will run only on node #0 of the allocated resources).

The Takeaway

Why all this complicated explanation? My original goal was to find the accounting records that cover the whole-job statistics without getting bogged down in the minute details of each job. This is what I found after this exploration:

We only need to include accounting records where the JobIDRaw field contains only whole integers (i.e. matching regex ^[0-9]+$). Further, the User field must not be empty.
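This selection rule can be sketched as a filter. It combines the integer-only JobIDRaw test with the requirement from the "Summary on Job ID" section that the User field must not be empty. The records are invented, and the dict layout and the name `is_master_record` are hypothetical.

```python
import re

WHOLE_JOB_RE = re.compile(r"^[0-9]+$")

def is_master_record(record):
    """True for a whole-job record: integer-only JobIDRaw and a non-empty User."""
    return WHOLE_JOB_RE.match(record["JobIDRaw"]) is not None and record["User"] != ""

# Invented accounting records for illustration.
records = [
    {"JobID": "1000001", "JobIDRaw": "1000001", "User": "alice"},
    {"JobID": "1000001.batch", "JobIDRaw": "1000001.batch", "User": ""},
    {"JobID": "1000002_3", "JobIDRaw": "1000005", "User": "bob"},
    {"JobID": "1000009", "JobIDRaw": "1000009", "User": ""},
]

masters = [r["JobID"] for r in records if is_master_record(r)]
print(masters)
```

Note that testing JobIDRaw rather than JobID keeps array tasks in the result, since their JobIDRaw is a plain number.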

References

The SLURM administrator's documentation contains helpful bits and pieces for deciphering the accounting records; unfortunately, they are not sufficient by themselves.

  • Accounting: https://slurm.schedmd.com/accounting.html .

  • Job Launch design guide: https://slurm.schedmd.com/job_launch.html .

    This guide describes at a high level the processes which occur in order to initiate a job including the daemons and plugins involved in the process. It describes the process of job allocation, step allocation, task launch and job termination.

    In SLURM, launching a job is a multistep process. The various "job steps" described in this guide eventually make their own entries in the SLURM accounting database.

Working Notes

These are my private working notes:

  • daily-notes/2019/20190326.slurm-acct.txt
  • daily-notes/2019/20190411.slurm-acct.txt
  • daily-notes/2019/20190430.slurm-acct-201811.txt
  • docs/kb/turing-slurm/20180106.SLURM-accounting.txt