Miroslav Kratochvil authored
Parallel Julia on HPCs
Julia model of distributed computation
What does ULHPC look like?

hpc-docs.uni.lu/systems/iris
Basic parallel processing
Using Threads:
- start Julia with parameter -t N
- parallelize (some) loops with Threads.@threads
a = zeros(100000)
Threads.@threads for i in eachindex(a)
    a[i] = hardfunction(i)
end
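A minimal runnable sketch of the loop above. `hardfunction` is not defined in the slides, so the version here is a hypothetical stand-in for any expensive, side-effect-free computation; start Julia with `-t N` to see more than one thread in use.

```julia
using Base.Threads

# Hypothetical stand-in for an expensive computation (not from the slides).
hardfunction(i) = sum(sin(i * k) for k in 1:1_000)

function fill_threaded(n)
    a = zeros(n)
    # Each iteration writes only to its own slot a[i], so iterations do not race.
    @threads for i in eachindex(a)
        a[i] = hardfunction(i)
    end
    return a
end

a = fill_threaded(1_000)
println("computed ", length(a), " values on ", nthreads(), " thread(s)")
```

The loop body touches no shared writable state apart from its own array element, which is what makes it safe to split across threads.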
Using Distributed:
using Distributed
addprocs(N)
newVector = pmap(f, oldVector)  # f: any function defined on all workers
We will use the Distributed approach.
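A small end-to-end sketch of the Distributed approach, assuming two local worker processes. The function `slowsquare` and the worker count are illustrative assumptions, not part of the slides.

```julia
using Distributed

addprocs(2)  # assumed worker count for a local test

# Functions called on workers must be defined on every process.
@everywhere slowsquare(x) = (sleep(0.01); x^2)

oldVector = collect(1:20)
# pmap hands the elements out to the available workers and collects results in order.
newVector = pmap(slowsquare, oldVector)

rmprocs(workers())  # shut the workers down again
```

Note the `@everywhere`: without it, the workers would not know `slowsquare` and the `pmap` call would fail.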
How to design for parallelization?
- Divide software into completely independent parts
  - avoid shared writeable state (to allow reentrancy)
  - avoid global variables (to allow separation from the "mother" process)
  - avoid complicated indexing in arrays (to allow slicing)
  - avoid tiny computation steps (to allow high-yield computation)
- Design for utilization of the high-level looping primitives
  - use map
  - use reduce or mapreduce
  - parallelize programs using pmap and dmapreduce (DistributedData.jl)
- Decompose more advanced programs into tasks with dependencies
  - Dagger.jl
  - make -jN may be a surprisingly good tool for parallelization!
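The guidelines above can be sketched as follows: write the computation with `mapreduce` over a pure function first, then distribute it by handing whole chunks to `pmap`, so that each task is not tiny. The function `score`, the data, and the chunk size of 100 are hypothetical choices for illustration.

```julia
using Distributed

# Hypothetical per-item work: a pure function, no globals, no shared writable state.
# @everywhere makes it available on any workers that were added beforehand.
@everywhere score(x) = x^2 + 1

data = 1:1_000

# Serial formulation using the high-level primitives.
serial_total = mapreduce(score, +, data)

# Distributed formulation: each pmap task reduces a whole chunk of 100 items,
# which amortizes the communication cost over many cheap computations.
chunks = collect(Iterators.partition(data, 100))
partials = pmap(chunk -> mapreduce(score, +, chunk), chunks)
parallel_total = reduce(+, partials)
```

Because `score` depends only on its argument, the serial and the distributed versions give the same result regardless of how many workers exist.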
Parallel → distributed processing
You need a working ssh connection to the server, ideally with keys:
user@pc1 $ ssh server1
Last login: Wed Jan 13 15:29:34 2021 from 2001:a18:....
user@server $ _
Spawning remote processes on remote machines:
julia> using Distributed
julia> addprocs([("server1", 10), ("pc2", 2)])
Benefit: No additional changes to the parallel programs!
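One way to check where the workers actually landed is to ask each of them for its hostname. This is a hedged sketch; `worker_hosts` is a hypothetical helper name, and the code runs unchanged whether workers were added locally, over ssh, or not at all.

```julia
using Distributed

# Hypothetical helper: map each worker id to the hostname it runs on.
worker_hosts() = Dict(w => remotecall_fetch(gethostname, w) for w in workers())

hosts = worker_hosts()
for (id, host) in hosts
    println("worker ", id, " runs on ", host)
end
```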
Making an HPC-compatible script
Main problems:
- discover the available resources
- spawn worker processes at the right place
using Distributed, ClusterManagers
addprocs_slurm(parse(Int, ENV["SLURM_NTASKS"]))
# ... continue as usual
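For trying the script outside of a Slurm job, the resource discovery step can be wrapped so that it does not fail when `SLURM_NTASKS` is missing. This is a sketch under assumptions: `slurm_ntasks` and its `fallback` keyword are hypothetical names, and Slurm is assumed to export `SLURM_NTASKS` only when a task count was requested.

```julia
# Hypothetical helper: read the task count exported by Slurm, or use a fallback
# when the variable is absent (for example on a laptop).
function slurm_ntasks(env = ENV; fallback = 1)
    haskey(env, "SLURM_NTASKS") || return fallback
    return parse(Int, env["SLURM_NTASKS"])
end

# The environment can be passed explicitly, which makes the helper easy to test.
inside_job  = slurm_ntasks(Dict("SLURM_NTASKS" => "10"))
outside_job = slurm_ntasks(Dict{String,String}())
println("inside a job: ", inside_job, ", outside a job: ", outside_job)
```

In the batch script the result would be passed to `addprocs_slurm`; outside of Slurm, plain `addprocs` with the fallback count is one possible substitute.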
Scheduling the script
Normally, you write a "batch script" and add it to a queue using sbatch.
Script in runAnalysis.sbatch:
#!/bin/bash
#SBATCH -J MyAnalysisInJulia
#SBATCH -n 10
#SBATCH -c 1
#SBATCH -t 30
#SBATCH --mem-per-cpu 4G
julia runAnalysis.jl
You start the script using:
$ sbatch runAnalysis.sbatch
Questions?
Let's do some hands-on problem solving (expected around 15 minutes)