guides:slurm

Previous revision: guides:slurm [17.04.2020 12:13] by Juha Kekäläinen
Current revision: guides:slurm [15.11.2021 16:19] by Administrator
  
[Slurm Workload Manager](https://slurm.schedmd.com/overview.html) is an open source [job scheduler](https://en.wikipedia.org/wiki/Job_scheduler) that controls programs executed in the background. These programs are called **jobs**. The user specifies a job with parameters that include the run time, number of tasks, number of required CPU cores, amount of required memory (RAM) and the program(s) to execute. Such jobs are called batch jobs. Batch jobs are submitted to a common job queue (partition) that is shared with other users, and Slurm executes the submitted jobs automatically in turn. After a job completes (or an error occurs), Slurm can optionally notify the user by email.
In addition to batch jobs, the user can reserve a compute node for interactive jobs: you wait for your turn in the queue, and when it comes you are placed on your reserved node, where you can execute commands. When the reserved time is over, your session is terminated.

{{:guides:slurm:slurm.png}}
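The job parameters described above map onto `#SBATCH` directives at the top of a batch script. The following is a minimal sketch of that mapping, not the guide's own tooling: the helper name `render_batch_script`, the default values and the example command `./my_program` are assumptions made for illustration, while the options themselves (`--partition`, `--time`, `--ntasks`, `--cpus-per-task`, `--mem`, `--mail-type`, `--mail-user`) are standard `sbatch` options.

```python
def render_batch_script(command, partition="serial", time="00:05:00",
                        ntasks=1, cpus_per_task=1, mem="1G",
                        mail_user=None):
    """Return the text of a Slurm batch script that runs one command.

    The function name and the defaults are hypothetical; adjust them to
    the limits of the cluster you are using.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",          # job queue (partition)
        f"#SBATCH --time={time}",                    # run time, e.g. 1-00:00:00
        f"#SBATCH --ntasks={ntasks}",                # number of tasks
        f"#SBATCH --cpus-per-task={cpus_per_task}",  # CPU cores per task
        f"#SBATCH --mem={mem}",                      # memory (RAM) per node
    ]
    if mail_user:
        # Optional email notification when the job ends or fails.
        lines.append("#SBATCH --mail-type=END,FAIL")
        lines.append(f"#SBATCH --mail-user={mail_user}")
    lines.append("")
    lines.append(command)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "./my_program" is a placeholder for the program you want to run.
    print(render_batch_script("srun ./my_program", time="1-00:00:00"))
```

Saving the printed text to a file such as `job.sh` and running `sbatch job.sh` would submit it to the queue.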
  
## Slurm Partitions on sampo.uef.fi
  
- **serial**. 4 out of nodes. Maximum run time 3 days.
- **longrun**. 2 out of nodes. Maximum run time 14 days.
- **parallel**. 2 of nodes. Maximum run time 3 days.
- **gpu**. nodes. Maximum run time 3 days.
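Before submitting, it can help to check which partitions accept the run time you plan to request. The sketch below only encodes the maximum run times from the list above; the names `MAX_RUN_TIME`, `fits_partition` and `partitions_for` are hypothetical, and the limits configured on the cluster itself (shown by `sinfo`) remain authoritative.

```python
from datetime import timedelta

# Maximum run times taken from the partition list above.
MAX_RUN_TIME = {
    "serial": timedelta(days=3),
    "longrun": timedelta(days=14),
    "parallel": timedelta(days=3),
    "gpu": timedelta(days=3),
}


def fits_partition(partition, requested):
    """Return True if the requested run time is within the partition's maximum."""
    return requested <= MAX_RUN_TIME[partition]


def partitions_for(requested):
    """Return the names of all partitions whose maximum covers the request."""
    return [name for name, limit in MAX_RUN_TIME.items() if requested <= limit]


if __name__ == "__main__":
    five_days = timedelta(days=5)
    print(fits_partition("serial", five_days))
    print(partitions_for(five_days))
```

This check looks only at run time; node counts and hardware type still have to match the job.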
  
## Explanation of the partitions
  
**Parallel** partition is for parallel jobs that can span multiple nodes (MPI jobs, for example). The user can reserve 2 nodes (minimum and maximum). The default run time is 5 minutes and the maximum is 3 days.

**GPU** partition is for GPU jobs (CUDA jobs). The user can reserve 2 nodes with 8x NVIDIA A100/40 GB. The default run time is 5 minutes and the maximum is 3 days.
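The two partitions described above differ mainly in the resource request a job has to make. The following sketch shows one plausible set of `#SBATCH` lines for each. It rests on assumptions the guide does not confirm: that the GPUs are exposed as a generic resource named `gpu` (so that `--gres=gpu:N` applies), and that an explicit `--time` is wanted because the default run time is only 5 minutes. The function name `partition_directives` is hypothetical.

```python
def partition_directives(partition, time="01:00:00", gpus=1):
    """Return example #SBATCH lines for the parallel or gpu partition."""
    lines = [
        f"#SBATCH --partition={partition}",
        # Set the run time explicitly; the default is only 5 minutes.
        f"#SBATCH --time={time}",
    ]
    if partition == "parallel":
        # The parallel partition reserves exactly 2 nodes (minimum and maximum).
        lines.append("#SBATCH --nodes=2")
    elif partition == "gpu":
        # Assumption: GPUs are configured as a generic resource called "gpu".
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    else:
        raise ValueError(f"no example directives for partition {partition!r}")
    return lines


if __name__ == "__main__":
    print("\n".join(partition_directives("parallel")))
    print("\n".join(partition_directives("gpu", gpus=2)))
```

These lines would go at the top of a batch script, before the commands that start the MPI or CUDA program.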
  
guides/slurm.1587114801.txt.gz · Last modified: 17.04.2020 12:13 by Juha Kekäläinen