guides:slurm:array-job

Differences

This shows you the differences between two versions of the page.

guides:slurm:array-job [05.03.2020 16:07]
Juha Kekäläinen
guides:slurm:array-job [18.11.2021 11:19] (current)
Administrator
Line 1: Line 1:
 # SLURM Job Array
-[SLURM job array](https://slurm.schedmd.com/job_array.html) provides a way to submit multiple similar but independent computational jobs to large number of dataset in concurrent manner. Array jobs and required resources are defined in the same master batch-script and the SLURM will control the workflow by automatically submitting array tasks that are based on single submitted master job.
+[SLURM job array](https://slurm.schedmd.com/job_array.html) provides a way to submit multiple similar but independent computational jobs with large number of dataset in concurrent manner. Array jobs and resources are defined in the same master batch-script and the SLURM will control the workflow by automatically submitting array jobs that are based on single submitted master job.
  
-Below we have an example of fastqc quality analysis for multiple fastq nucleotide files. It scans for the "data" directory for any fastq files and executes fastqc run on each file. Results are stored to common directory. SLURM creates array of maximum of 100 subjobs, distributes the subjobs to computing nodes and executes 3 computing job concurrently.
+Below we have an example of [fastqc](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) quality analysis for multiple fastq nucleotide files. It scans for the "data" directory for any fastq files and executes fastqc run on each file. Results are stored to common directory. SLURM creates array of maximum of 100 subjobs, distributes the subjobs to computing nodes and executes 3 computing job concurrently.
  
 Each job has 5 minutes of runtime and the job has 500 MB of RAM reserved.
Line 15: Line 15:
 #SBATCH --output fastq_%A_%a.out.out            # Standard output goes to here
 #SBATCH --error fastqc_%A_%a.err                # Standard error goes to here
-##SBATCH --mail-user username@uef.fi            # this is the email you wish to be notified at
-##SBATCH --mail-type ALL                        # ALL will alert you of job beginning, completion, failure etc
+#SBATCH --mail-user username@uef.fi            # this is the email you wish to be notified at
+#SBATCH --mail-type ALL                        # ALL will alert you of job beginning, completion, failure etc
 #SBATCH --array=1-100%3                         # Array range and number of simultanous jobs
  
-# Make sure that the results exists
+# Make sure that the results directory exists
 mkdir -p ./results/
  
Line 28: Line 28:
 file=$(ls ./data/*.fastq | sed -n ${SLURM_ARRAY_TASK_ID}p)
  
-# Run quality analysis
+# Run quality analysis on each fastq file
 fastqc -o ./results/ $file
 ```
  
 Submit the job to computing queue with **sbatch** command.
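
The file-selection line in the script (`ls ./data/*.fastq | sed -n ${SLURM_ARRAY_TASK_ID}p`) can be tried locally without a cluster. The following is a minimal sketch, not part of the original page: the helper `pick_file`, the temporary directory and the three empty `.fastq` files are hypothetical, and `SLURM_ARRAY_TASK_ID` is set by hand here, whereas SLURM sets it for each array task. The sketch also shows what happens when the array range is larger than the number of files, as it can be with `--array=1-100`.

```shell
#!/bin/bash
# Local sketch: emulate how one array task picks its input file.
# Assumption: pick_file is a hypothetical helper, not part of the original script.

pick_file() {
    # $1 = directory holding the fastq files, $2 = 1-based task ID.
    # Prints the matching file, or nothing when the ID is past the last file.
    ls "$1"/*.fastq | sed -n "${2}p"
}

demo_dir=$(mktemp -d)                       # throwaway directory for the demo
touch "$demo_dir/a.fastq" "$demo_dir/b.fastq" "$demo_dir/c.fastq"

# Under SLURM each task sees exactly one ID; the loop only imitates four tasks.
for SLURM_ARRAY_TASK_ID in 1 2 3 4; do
    file=$(pick_file "$demo_dir" "$SLURM_ARRAY_TASK_ID")
    if [ -z "$file" ]; then
        echo "task $SLURM_ARRAY_TASK_ID: no input file, skipping"
    else
        echo "task $SLURM_ARRAY_TASK_ID: $(basename "$file")"
    fi
done

rm -rf "$demo_dir"
```

With three files, tasks 1 to 3 report `a.fastq`, `b.fastq` and `c.fastq`, and task 4 reports that it has no input. A similar empty-variable check before the `fastqc` call in the batch script would let surplus array tasks exit without running fastqc on nothing.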
guides/slurm/array-job.1583417236.txt.gz · Last modified: 05.03.2020 16:07 by Juha Kekäläinen