🚀🚀🚀
Parallel Julia

Julia model of distributed computation

Basic parallel processing

Using Threads:

  1. start Julia with parameter -t N
  2. parallelize (some) loops with Threads.@threads

a = zeros(100000)
Threads.@threads for i = eachindex(a)
    a[i] = hardfunction(i)
end

Using Distributed:

using Distributed
addprocs(N)
newVector = pmap(myFunction, myVector)

We will use the Distributed approach.

Managing your workers

using Distributed
addprocs(4)

myid()
workers()

Running commands on workers:

@spawnat 3 @info "Message from worker"

@spawnat :any myid()

Getting results from workers:

job = @spawnat :any begin sleep(10); return 123+321; end

fetch(job)

Cleaning up:

rmprocs(workers())

Processing lots of data items in parallel

datafiles = ["file$i.csv" for i=1:20]

@everywhere function process_file(name)
	println("Processing file $name")
	# ... do something ...
end

@sync for f in datafiles
	@async @spawnat :any process_file(f)
end

Gathering results from workers

items = collect(1:1000)

@everywhere compute_item(i) = 123 + 321*i

pmap(compute_item, items)

💡💡💡 Doing it manually with @spawnat:

futures = [@spawnat :any compute_item(item) for item in items]

fetch.(futures)

How to design for parallelization?

Recommended way: Utilize the high-level looping primitives!

  • use map, parallelize by just switching to pmap
  • use reduce or mapreduce, parallelize by just switching to dmapreduce (DistributedData.jl)
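As an illustration of the map → pmap switch, here is a minimal sketch (the worker count and the squaring function are arbitrary choices for this example; for dmapreduce, see the DistributedData.jl documentation for the exact API):

```julia
using Distributed
addprocs(4)                       # spawn 4 local worker processes

# the function must be defined on all workers
@everywhere f(x) = x^2

serial   = map(f, 1:100)          # serial version
parallel = pmap(f, 1:100)         # parallel: the only change is map -> pmap

# mapreduce follows the same pattern; DistributedData.jl's dmapreduce
# is its distributed counterpart
total = mapreduce(f, +, 1:100)
```

Because the serial and parallel versions share the same interface, the program logic never changes, only the primitive.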

💡 Parallel → distributed processing

It is very easy to organize multiple computers to work for you!

You need a working ssh connection:

user@pc1 $ ssh server1
Last login: Wed Jan 13 15:29:34 2021 from 2001:a18:....
user@server $ _

Spawning remote processes on remote machines:

julia> using Distributed
julia> addprocs([("server1", 10), ("pc2", 2)])

Benefit: No additional changes to the parallel programs!
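To make this concrete, a sketch of the full remote workflow (the hostnames are the placeholders from the slide; substitute machines you can actually reach over ssh):

```julia
using Distributed

# placeholder hostnames -- each tuple is (host, number of workers)
addprocs([("server1", 10), ("pc2", 2)])

# definitions still need to reach all workers
@everywhere compute_item(i) = 123 + 321*i

# identical to the single-machine version; the work is now
# spread across both remote machines
pmap(compute_item, 1:1000)
```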

💻 🇱🇺 🧮 💿
Utilizing ULHPC 💡

What does ULHPC look like?


hpc-docs.uni.lu/systems/iris

Making an HPC-compatible Julia script

Main challenges:

  1. discover the available resources
  2. spawn worker processes at the right place

using ClusterManagers

addprocs_slurm(parse(Int, ENV["SLURM_NTASKS"]))

# ... continue as usual

Scheduling the script

Normally, you write a "batch script" and add it to a queue using sbatch.

Script in runAnalysis.sbatch:

#!/bin/bash
#SBATCH -J MyAnalysisInJulia
#SBATCH -n 10
#SBATCH -c 1
#SBATCH -t 30
#SBATCH --mem-per-cpu=4G

julia runAnalysis.jl

You start the script using:

 $ sbatch runAnalysis.sbatch
🫐 🍎 🍈 🍇
Questions?

Let's do some hands-on problem solving (expected duration: around 15 minutes)