sampo.uef.fi

Sampo.uef.fi is a High Performance Computing (HPC) environment that uses the Slurm workload manager. It was launched in autumn 2019 and is targeted at a wide range of workloads.

Note: The login node (sampo.uef.fi) can be used for light pre- and postprocessing, compiling applications and moving data. All other tasks are to be done using the batch job system.

Specs

In addition to the login node (sampo.uef.fi), the cluster has a total of 4 compute nodes. Each node is equipped with two Intel Xeon Gold processors (code name Skylake) with 20 cores each, 40 cores per node, running at 2.4 GHz (max turbo frequency 3.7 GHz). The interconnect is based on Intel Omni-Path, and the nodes are connected with a 100 Gbps link. The login node also acts as an NFS file server with 80 TB of HDD storage space.

Login node

  • Model: Dell R740XD
  • CPU: 2 x Intel Xeon Gold 6130 (32 Cores/64 Threads)
  • Memory: 376 GB
  • Address: sampo.uef.fi

Compute nodes

  • 4 x Dell C6420
  • CPU: 2 x Intel Xeon Gold 6148 (40 Cores / 80 Threads)
  • Memory:
    • 3 nodes with 376 GB
    • 1 node with 768 GB

Paths

In addition to the UEF IT Services Research Storage, the cluster has its own local storage. Research Storage is connected only to the login node and cannot be accessed from the compute nodes.

Therefore scripts and data sets must be copied to the cluster local storage before they can be analyzed on the compute nodes. The local storage is not backed up, so keep your important data on the UEF IT Services Research Storage. In addition, all files older than 2 months are automatically removed from group folders.

Cluster local storage

  • /home/users/username - 250 GB User home directory ($HOME)
  • /home/groups/groupname - Minimum 5 TB User research group folder

UEF IT Research Storage

  • /research/users/user - User home directory at \\research.uefad.uef.fi
  • /research/work/user - User work directory at \\research.uefad.uef.fi
  • /research/groups/groupname - User research group directory at \\research.uefad.uef.fi
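The copy step described above can be sketched as two small shell functions. This is only an illustration: the function names stage_data and list_old_files are hypothetical, groupname and mydata are placeholders, and the 60-day check assumes that file age is judged by modification time, which this page does not state.

```shell
# Run on the login node: Research Storage is not visible on the compute nodes.

stage_data() {
    # $1 = source directory, $2 = destination directory
    local src="$1" dest="$2"
    if [ ! -d "$src" ]; then
        echo "Source directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # "src/." copies the contents of src, including hidden files, into dest
    cp -R "$src/." "$dest/"
}

list_old_files() {
    # List files under $1 that were last modified more than 60 days ago.
    # Assumption: "older than 2 months" is approximated as 60 days of mtime.
    find "$1" -type f -mtime +60
}

# Example calls (placeholders, adjust to your own group and data set):
#   stage_data /research/groups/groupname/mydata /home/groups/groupname/mydata
#   list_old_files /home/groups/groupname
```

Listing old files before they are removed gives you a chance to copy results back to Research Storage, which is the only location described here as safe for important data.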

Applications

 bamtools/2.5.1          homer/4.10                   python/3.7.3
 bamutil/1.0.13          htslib/1.9                   r/3.5.3
 bowtie/1.2.3            matlab/R2015b                r/3.6.1
 bowtie2/2.3.4.1         matlab/R2018b                samtools/1.9
 bwa/0.7.17              openjdk/1.8.0_202-b08        sra-toolkit/2.9.2
 clustalw/2.1            openjdk/11.0.2               star/2.6.1b
 cufflinks/2.2.1         openmpi/1.10.7-2             stringtie/1.3.4a
 diamond/0.9.21          openmpi/3.1.4                tabix/2013-12-16
 fastqc/0.11.7           picard/2.18.3                trimmomatic/0.36
 fastx-toolkit/0.0.14    plink/1.07                   vcftools/0.1.14
 freebayes/1.1.0         python/2.7.15
 hisat2/2.1.0            python/3.7.0

Slurm Workload Manager

Slurm Workload Manager is an open source job scheduler that controls programs executed in the background. These background programs are called jobs. The user defines a job with parameters such as run time, number of tasks (CPU cores) and amount of required memory (RAM), and specifies which program(s) to execute. Jobs defined this way are called batch jobs. Batch jobs are submitted to a common job queue (partition) that is shared with other users, and Slurm executes the submitted jobs automatically in turn. After a job has completed (or an error has occurred), Slurm can optionally notify the user by email. In addition to batch jobs, a user can reserve a compute node for an interactive job: you wait for your turn in the queue, and when it comes you are placed on the reserved node, where you can execute commands. When the reserved time is over, your session is terminated.
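As an illustration of the job parameters described above, the following sketch writes a minimal batch script to a file named job.sh. The file name, job name, resource values and email address are placeholders. It also assumes that the applications listed on this page are provided as environment modules and loaded with the module command, which this page does not state explicitly.

```shell
# Write a minimal batch script to job.sh (the file name is arbitrary).
cat > job.sh <<'EOF'
#!/bin/bash
# Name shown in the queue
#SBATCH --job-name=example
# Job queue (partition)
#SBATCH --partition=serial
# Number of tasks (CPU cores)
#SBATCH --ntasks=1
# Required memory (RAM)
#SBATCH --mem=4G
# Run time limit (hh:mm:ss)
#SBATCH --time=01:00:00
# Output file, %j expands to the job ID
#SBATCH --output=example_%j.log
# Optional email notification (placeholder address)
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=your.address@example.org

# Assumption: applications are made available with "module load".
module load samtools/1.9
samtools --version
EOF

# Submit on the login node with:   sbatch job.sh
# Check your queued jobs with:     squeue -u "$USER"
```

An interactive session is typically requested in a similar way, for example with srun --partition=serial --time=01:00:00 --pty bash, although the exact options accepted on this cluster may differ.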

Slurm Partitions

  • serial: 4 out of 4 nodes, maximum run time 3 days
  • longrun: 2 out of 4 nodes, maximum run time 14 days
  • parallel: 2 out of 4 nodes, maximum run time 3 days

Explanation of the partitions

Compute nodes are grouped into multiple partitions, and each partition can be considered a job queue. Partitions can have multiple constraints and restrictions. For example, access to certain partitions can be limited by user or group, or the maximum run time can be restricted.

Serial is the default partition for all jobs that a user submits. A job can reserve at most 1 node. The default run time is 5 minutes and the maximum is 3 days.

Longrun is the partition for long-running jobs, and only one node is available for this use. The default run time is 5 minutes and the maximum is 14 days.

Parallel is the partition for parallel jobs that can span multiple nodes (MPI jobs, for example). A job reserves 2 nodes (both the minimum and the maximum). The default run time is 5 minutes and the maximum is 2 days.
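A two-node MPI job for the parallel partition could be described roughly as follows. This is a sketch under assumptions: my_mpi_program is a hypothetical executable, 40 tasks per node simply matches the core count given in the specs, and the script assumes that openmpi/3.1.4 is loaded with module load. Because the default run time is only 5 minutes, the time limit is set explicitly.

```shell
# Write a minimal two-node MPI batch script to mpi_job.sh.
cat > mpi_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=mpi_example
#SBATCH --partition=parallel
# The parallel partition expects exactly 2 nodes
#SBATCH --nodes=2
# One MPI task per physical core (40 cores per node)
#SBATCH --ntasks-per-node=40
# Run time limit of 1 day (d-hh:mm:ss)
#SBATCH --time=1-00:00:00
#SBATCH --output=mpi_example_%j.log

# Assumption: Open MPI is made available with "module load".
module load openmpi/3.1.4

# my_mpi_program is a placeholder for your own MPI executable.
srun ./my_mpi_program
EOF

# Submit on the login node with:   sbatch mpi_job.sh
```

Depending on how Open MPI was built on the cluster, the program may need to be started with mpirun instead of srun.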

Getting started

See the Slurm usage instructions on the Slurm Workload Manager page.

infrastructure/sampo.txt · Last modified: 14.11.2019 10:48 by Juha Kekäläinen