Parallel Julia on HPCs

Julia model of distributed computation

What does ULHPC look like?


hpc-docs.uni.lu/systems/iris

Basic parallel processing

Using Threads:

  1. start Julia with parameter -t N
  2. parallelize (some) loops with Threads.@threads
a = zeros(100000)
Threads.@threads for i in eachindex(a)
    a[i] = hardfunction(i)
end
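
A minimal, self-contained sketch of the threaded loop above. `hardfunction` is not defined on the slide, so the version here is a hypothetical stand-in for any expensive computation that depends only on `i`. The loop also runs with a single thread, just without speedup.

```julia
using Base.Threads

# Hypothetical stand-in for an expensive, independent computation.
hardfunction(i) = sum(sin, 1:i)

a = zeros(1000)
@threads for i in eachindex(a)
    a[i] = hardfunction(i)   # each iteration writes only to its own slot
end

println("threads available: ", nthreads())
```

Because every iteration touches a different element of `a`, no locking is needed. Start Julia with e.g. `julia -t 4` to see more than one thread reported.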

Using Distributed:

using Distributed
addprocs(N)
newVector = pmap(hardfunction, oldVector)

We will use the Distributed approach.
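
A runnable sketch of the Distributed pattern, assuming two local workers and a placeholder `hardfunction` (both are illustrative choices, not part of the original slides). The detail that is easy to miss is `@everywhere`: the function has to be defined on the workers too, not only on the main process.

```julia
using Distributed

added = addprocs(2)              # 2 local workers; the count is arbitrary

# @everywhere evaluates the definition on all processes, workers included.
@everywhere hardfunction(x) = x^2 + 1   # hypothetical placeholder

oldVector = collect(1:10)
newVector = pmap(hardfunction, oldVector)   # elements are processed on the workers

rmprocs(added)                   # release the workers when done
```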

How to design for parallelization?

  • Divide software into completely independent parts
    • avoid shared writeable state (to allow reentrancy)
    • avoid global variables (to allow separation from the "mother" process)
    • avoid complicated indexing in arrays (to allow slicing)
    • avoid tiny computation steps (to allow high-yield computation)
  • Design for utilization of the high-level looping primitives
    • use map
    • use reduce or mapreduce
    • parallelize programs using pmap and dmapreduce (DistributedData.jl)
  • Decompose more advanced programs into tasks with dependencies
    • Dagger.jl
    • make -jN may be a surprisingly good tool for parallelization!
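
The looping-primitive advice can be sketched as follows, assuming a hypothetical pure function `score` and two local workers. Code already written in terms of `map` needs only the function name changed to run on workers. The `batch_size` keyword of `pmap` is one way to group tiny steps into larger units of work.

```julia
using Distributed

added = addprocs(2)

# Hypothetical pure function: no globals, no shared writable state.
@everywhere score(x) = x * x

inputs = 1:100

serial   = map(score, inputs)                    # plain serial map
total    = mapreduce(score, +, inputs)           # map and reduce in one pass
parallel = pmap(score, inputs)                   # same call shape, run on workers
batched  = pmap(score, inputs; batch_size = 10)  # 10 elements per remote call

rmprocs(added)
```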

Parallel → distributed processing

You need a working ssh connection to the server, ideally with keys:

user@pc1 $ ssh server1
Last login: Wed Jan 13 15:29:34 2021 from 2001:a18:....
user@server1 $ _

Spawning worker processes on remote machines:

julia> using Distributed
julia> addprocs([("server1", 10), ("pc2", 2)])

Benefit: No additional changes to the parallel programs!
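
To check where the workers actually ended up, one option is to ask each of them for its hostname. The sketch below uses two local workers as a stand-in, so all hostnames will match the local machine. With the `addprocs([...])` call shown above, the list would be expected to contain the remote machine names instead.

```julia
using Distributed

added = addprocs(2)   # local stand-in for the remote addprocs([...]) call

# Run gethostname() on every new worker and collect the answers.
hosts = [remotecall_fetch(gethostname, w) for w in added]
println(hosts)

rmprocs(added)
```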

Making an HPC-compatible script

Main problems:

  1. discover the available resources
  2. spawn worker processes at the right place
using Distributed, ClusterManagers

addprocs_slurm(parse(Int, ENV["SLURM_NTASKS"]))

# ... continue as usual
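
For the resource-discovery step, a small helper can make the same script usable outside of Slurm, for example while testing on a laptop. This is only a sketch: `requested_workers` and its `default` argument are hypothetical names, and the fallback to a single worker is an assumption rather than anything prescribed by Slurm.

```julia
# Hypothetical helper: number of workers to start, taken from Slurm when present.
function requested_workers(env = ENV; default = 1)
    # SLURM_NTASKS is set by Slurm inside a job; fall back to `default` elsewhere.
    return parse(Int, get(env, "SLURM_NTASKS", string(default)))
end

n = requested_workers()
println("would start ", n, " worker(s)")
# Inside a Slurm job this value would be passed on: addprocs_slurm(n)
```

Taking the environment as an argument keeps the helper easy to try out with a plain `Dict` instead of the real `ENV`.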

Scheduling the script

Normally, you write a "batch script" and add it to a queue using sbatch.

Script in runAnalysis.sbatch:

#!/bin/bash
#SBATCH -J MyAnalysisInJulia
#SBATCH -n 10
#SBATCH -c 1
#SBATCH -t 30
#SBATCH --mem-per-cpu 4G

julia runAnalysis.jl

You start the script using:

 $ sbatch runAnalysis.sbatch

Questions?

Let's do some hands-on problem solving (expected around 15 minutes)