Arrays

A job array is a collection of jobs that differ from each other by only a single index parameter. Creating a job array provides an easy way to group related jobs together. For example, if you have a parameter study that requires you to run your application five times, each with a different input parameter, you can use a job array instead of creating five separate MSUB scripts and submitting them separately.

Creating a Job Array

The syntax for submitting job arrays is: msub -t [<jobname>]<indexlist>[%<limit>] arrayscript.sh. The <jobname> and <limit> are optional.

To create a job array, use a single MSUB script and use the -t flag to specify a range for the index parameter, either on the msub command line or within your MSUB script. For example, if you submit the following script, a job array with five sub-jobs will be created:

#MSUB -l nodes=4:ppn=20:ib
#MSUB -l walltime=8:00:00
#MSUB -t 1-5

# The array index range above can start at any value, including 0

# For each sub-job, a value provided by the -t
# option is used for MOAB_JOBARRAYINDEX

mkdir dir.$MOAB_JOBARRAYINDEX
cd    dir.$MOAB_JOBARRAYINDEX

mpiexec ../a.out < ../input.$MOAB_JOBARRAYINDEX

Submitting the script to MOAB will return the parent MOAB_JOBID.

% msub job_array_script.sh

516540

Each sub-job in this job array will have a MOAB_JOBID that includes both the parent MOAB_JOBID and a unique MOAB_JOBARRAYINDEX value within the brackets.

516540[1]
516540[2]
516540[3]
516540[4]
516540[5]
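
If a wrapper or post-processing script needs the parent ID and the index separately, the bracketed form shown above can be split with plain bash parameter expansion. This is a minimal sketch; split_subjob_id is a hypothetical helper name, not a MOAB command.

```shell
#!/bin/bash
# split_subjob_id is a hypothetical helper (not part of MOAB).
# It splits an ID of the form PARENT[INDEX] into "PARENT INDEX".
split_subjob_id() {
    local id="$1"
    local parent="${id%%\[*}"   # everything before the first '['
    local index="${id#*\[}"     # everything after the first '['
    index="${index%\]}"         # drop the trailing ']'
    echo "$parent $index"
}

split_subjob_id "516540[3]"   # prints: 516540 3
```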

To specify that only a certain number of sub-jobs in the array can run at a time, use the percent sign (%) delimiter. In this example, only five sub-jobs in the array can run at a time.

% msub -t myarray[1-1000]%5 job.sh

To submit a specific set of array sub-jobs, use the comma delimiter in the array index list.

% msub -t myarray[1,2,3,4] job.sh
% msub -t myarray[1-5,7,10] job.sh

To submit a job array with a step size, append a colon (:) and the step to the array range. The example below requests a step size of 2, so the sub-job indices advance by 2 from the start of the range up to and including its upper limit: 2, 4, 6, 8, and 10.

% msub -t myarray[2-10:2] job.sh
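
Before submitting a large array, it can help to preview which indices an index list would produce. The sketch below expands the comma, range, and step forms described above on the local machine. expand_indices is a hypothetical helper that only approximates the syntax as this page describes it; it is not MOAB's own parser, and it ignores the optional job name and %limit parts.

```shell
#!/bin/bash
# expand_indices is a hypothetical helper (not part of MOAB).
# It expands an index list such as "1-5,7,10" or "2-10:2" into
# a space-separated list of indices, for local preview only.
expand_indices() {
    local spec="$1" part range step lo hi i
    local -a parts out=()
    IFS=',' read -r -a parts <<< "$spec"
    for part in "${parts[@]}"; do
        step=1
        range="$part"
        if [[ "$part" == *:* ]]; then
            step="${part##*:}"      # text after the colon
            range="${part%%:*}"     # text before the colon
        fi
        (( step >= 1 )) || return 1 # refuse a zero or invalid step
        if [[ "$range" == *-* ]]; then
            lo="${range%%-*}"
            hi="${range##*-}"
        else
            lo="$range"
            hi="$range"
        fi
        for (( i = lo; i <= hi; i += step )); do
            out+=("$i")
        done
    done
    echo "${out[*]}"
}

expand_indices "1-5,7,10"   # prints: 1 2 3 4 5 7 10
expand_indices "2-10:2"     # prints: 2 4 6 8 10
```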

Checking Status

Since the KU Community Cluster utilizes MOAB job arrays, it is best to use the command checkjob to view the status of the job instead of qstat.

The status of the sub-jobs is not displayed by default. For example, the following checkjob command shows the job array summary:

% checkjob 516540

job 516540

AName: job_array_script
Job Array Info:
  Name: 516540

  Sub-jobs:           5
    Active:           5 ( 100.0% )
    Eligible:         0 ( 0.0% )
    Blocked:          0 ( 0.0% )
    Completed:        0 ( 0.0% )

To check the status of the sub-jobs, use the parameter -v along with checkjob.

% checkjob -v 516540

job 516540

AName: job_array_script
Job Array Info:
  Name: 516540
  1 : 516540[1] : Running
  2 : 516540[2] : Running
  3 : 516540[3] : Running
  4 : 516540[4] : Running
  5 : 516540[5] : Running

  Sub-jobs:           5
    Active:           5 ( 100.0% )
    Eligible:         0 ( 0.0% )
    Blocked:          0 ( 0.0% )
    Completed:        0 ( 0.0% )

You can also check the sub-jobs individually by using the MOAB_JOBID of the array sub-job.

% checkjob 516540[1]
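
For large arrays, the per-sub-job listing from checkjob -v can be long. The sketch below tallies sub-jobs by state. It assumes the sub-job lines have the same "index : ID : state" layout as the sample output above; count_subjob_states is a hypothetical helper name, and the layout may differ between MOAB versions.

```shell
#!/bin/bash
# count_subjob_states is a hypothetical helper (not part of MOAB).
# It reads checkjob -v output on stdin and assumes sub-job lines look like
#   "  1 : 516540[1] : Running"
# It prints one "STATE COUNT" line per state, sorted by state name.
count_subjob_states() {
    awk -F' : ' '
        /^ *[0-9]+ : .*\[[0-9]+\] : / {
            state = $3
            sub(/[ \t\r]+$/, "", state)   # trim trailing whitespace
            count[state]++
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# On the cluster:
#   checkjob -v 516540 | count_subjob_states
```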

Deleting a Job Array or Sub-Job

To delete a job array or a sub-job, use the canceljob command and specify the array or sub-job.

% canceljob 516540

% canceljob 516540[1]

Example Job Array

You have a file containing a list of paths to files, and you wish to perform the same action on each one. You could submit a single job that loops through the file on 1 node and performs the action, or you could submit a job array whose index range matches the number of lines in the file. For this example, the file contains 1000 lines.

#MSUB -l nodes=1:ppn=20
#MSUB -l walltime=6:00:00
#MSUB -q sixhour
#MSUB -t [1-1000]

# Read only the line whose number matches this sub-job's index
LINE=$(sed -n "${MOAB_JOBARRAYINDEX}p" File.txt)
echo "$LINE"

call-program-name-here "$LINE"
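
If the index range and the file length ever disagree, sed prints nothing and the program is called with an empty argument. A minimal sketch of a guard against that follows. line_for_index is a hypothetical helper name, and the check assumes the file ends with a newline, since wc -l counts newline characters.

```shell
#!/bin/bash
# line_for_index is a hypothetical helper (not part of MOAB).
# It prints line INDEX of FILE, or returns non-zero if INDEX is not a
# positive integer within the file's line count.
# Assumes FILE ends with a newline, because wc -l counts newlines.
line_for_index() {
    local file="$1" index="$2" total
    [[ "$index" =~ ^[1-9][0-9]*$ ]] || return 1
    total=$(wc -l < "$file") || return 1
    (( index <= total )) || return 1
    sed -n "${index}p" "$file"
}

# In a job script, with File.txt as in the example above:
#   LINE=$(line_for_index File.txt "$MOAB_JOBARRAYINDEX") || exit 1
```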

Now suppose the file lists paths that must be processed in pairs: lines 1 and 2 together, then lines 3 and 4, then lines 5 and 6, and so on.

#MSUB -l nodes=1:ppn=20
#MSUB -l walltime=6:00:00
#MSUB -q sixhour
#MSUB -t [1-500]

# Job array size will be half of the number of lines in the file

SECOND=$((MOAB_JOBARRAYINDEX * 2))
FIRST=$((SECOND - 1))

# -n suppresses sed's default output so only the requested line is printed
LINE1=$(sed -n "${FIRST}p" File.txt)
LINE2=$(sed -n "${SECOND}p" File.txt)

call-program-name-here "$LINE1" "$LINE2"
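
The index-to-line mapping used above can be checked locally before submitting, without MOAB. The sketch below wraps the same arithmetic in a function; pair_lines is a hypothetical helper name.

```shell
#!/bin/bash
# pair_lines is a hypothetical helper (not part of MOAB).
# It maps array index i to the line pair (2i - 1, 2i).
pair_lines() {
    local index="$1"
    local second=$(( index * 2 ))
    local first=$(( second - 1 ))
    echo "$first $second"
}

pair_lines 1     # prints: 1 2
pair_lines 500   # prints: 999 1000
```

With 1000 lines and an index range of 1-500, the last sub-job handles lines 999 and 1000, so every line is covered exactly once.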

Now suppose the file has 30,000 lines and you want to process each line. The program you are calling takes only about 30 seconds per line, so rather than starting a separate sub-job for every line, have each sub-job loop through 100 lines of the file. Instead of 30,000 sub-jobs that each process 1 line, you have 300 sub-jobs that each process 100 lines.

#MSUB -l nodes=1:ppn=20
#MSUB -l walltime=6:00:00
#MSUB -q sixhour
#MSUB -t [1-300]

# Job array size will be number of lines in file divided by 
# number of lines chosen below

NUMLINES=100
# Last and first line numbers handled by this sub-job
STOP=$((MOAB_JOBARRAYINDEX * NUMLINES))
START=$((STOP - NUMLINES + 1))

echo "START=$START"
echo "STOP=$STOP"

for (( N = START; N <= STOP; N++ ))
do
    # -n prints only line N instead of the whole file
    LINE=$(sed -n "${N}p" File.txt)
    call-program-name-here "$LINE"
done
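
The example above works cleanly because 30,000 divides evenly by 100. When the line count is not a multiple of the chunk size, the array needs one extra sub-job and the last chunk must stop at the end of the file. A minimal sketch of that arithmetic follows; array_size and chunk_bounds are hypothetical helper names, and the 30,050-line file is an illustrative assumption.

```shell
#!/bin/bash
# array_size and chunk_bounds are hypothetical helpers (not part of MOAB).

# Number of sub-jobs needed: TOTAL lines in chunks of NUMLINES, rounded up.
array_size() {
    local total="$1" numlines="$2"
    echo $(( (total + numlines - 1) / numlines ))
}

# First and last line for sub-job INDEX, clamped to the end of the file.
chunk_bounds() {
    local index="$1" numlines="$2" total="$3"
    local start=$(( (index - 1) * numlines + 1 ))
    local stop=$(( index * numlines ))
    if (( stop > total )); then
        stop="$total"
    fi
    echo "$start $stop"
}

array_size 30000 100          # prints: 300
chunk_bounds 300 100 30000    # prints: 29901 30000
array_size 30050 100          # prints: 301
chunk_bounds 301 100 30050    # prints: 30001 30050
```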


Vendor Documentation on Job Arrays


CRC Help

If you need any help with the cluster or have general questions related to the cluster, please contact crchelp@ku.edu.

In your email, please include your submission script, any relevant log files, and the steps you took to produce the problem.
