diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/5a-para.md b/2023/2023-03-21_ProgrammingWithJulia-5/slides/5a-para.md
new file mode 100644
index 0000000000000000000000000000000000000000..4a59cfbab433e127dfdadef8e20bf3fa8b031a30
--- /dev/null
+++ b/2023/2023-03-21_ProgrammingWithJulia-5/slides/5a-para.md
@@ -0,0 +1,222 @@
+
+<div class=leader>
+<i class="twa twa-rocket"></i>
+<i class="twa twa-rocket"></i>
+<i class="twa twa-rocket"></i>
+<i class="twa twa-rocket"></i>
+<br>
+Parallel Julia
+</div>
+
+
+
+# Usual ways to gain performance:
+
+1. Do not work on data that is too far away *(cache)*
+2. Do not waste energy on organizing trivial stuff *(SIMD/SIMT)*
+3. Do not waste time waiting for data *(HT/GPU)*
+4. Organize the computation so that more computers don't trip over each other on the task *(SMP)*
+5. Move the computers closer to data *(distributed computing)*
+
+Let's spend a moment explaining these technologies...
+
+
+
+# Data distance and physical limits
+
+<div class=leader>
+How far does light go in 1 cycle of a 3 GHz CPU?
+</div>
+
+
+
+# CPUs vs GPUs (SIMD vs SIMT)
+
+<center>
+<img src="slides/img/cpu.png" width="40%" />
+<img src="slides/img/gpu.png" width="40%" />
+</center>
+
+
+
+# SIMD problem: Your Data Looks Like This™
+
+<center>
+<img src="slides/img/maze.jpg" width="75%" />
+</center>
+
+
+
+# Parallel programming
+
+<center>
+<img src="slides/img/threads.jpeg" width="50%" /><br>
+(notice the false sharing at threaddog 5)
+</center>
+
+Distributed computing can give you:
+
+- more memory
+- more total memory bandwidth (!)
+- more synchronization problems
+
+
+
+# Julia tools
+
+Let's implement a matrix multiplication manually, and try:
+
+- checking if SIMD instructions are used
+- reordering the loops
+- using smaller floats
+- tiling
+- `@threads`
+
+<center><img src="slides/img/tiling.jpeg" width="33%" /></center>
+
+
+
+# Distributed computing with Julia
+
+```julia
+using Distributed
+addprocs(10)
+
+pmap(myfunction, mydata)  # runs on all available workers by default
+```
+
+(Spoiler: you can add processes on remote machines using SSH.)
+
+
+
+# Distributed computing with Julia (on the HPC)
+
+```julia
+using Distributed, ClusterManagers
+addprocs_slurm(parse(Int, ENV["SLURM_NTASKS"]))
+
+pmap(myfunction, mydata)  # runs on all available workers by default
+```
+
+You typically want to load the data locally at the workers.
+
+For more complex schemes:
+- `Dagger.jl` provides complex synchronization/task dependency schemes
+- `DistributedData.jl` provides primitives for manipulating the data precisely
+
+
+
+# How to use a GPU?
+
+```julia
+using CUDA
+
+A = cu(randn(1000,1000));
+B = cu(randn(1000,1000));
+A = A * B;
+```
+
+...transparently uses CUDA, cuBLAS, cuSPARSE, cuDNN and many other libraries to do stuff quicker.
+
+
+
+# How to actually program a GPU?
+
+CUDA.jl can compile Julia code into CUDA kernel code.
+
+```julia
+function fill_with_indexes!(arr)
+    index = threadIdx().x + blockDim().x * (blockIdx().x - 1)
+    stride = gridDim().x * blockDim().x
+    for i = index:stride:length(arr)
+        arr[i] = i
+    end
+    return
+end
+
+A = cu(zeros(9999999))
+@cuda threads=1024 blocks=16 fill_with_indexes!(A)
+```
+
+
+
+# Homeworks
+
+## Homework 2 update
+
+Feel free to apply whatever we did today for bonus points.
+
+- think about cache efficiency
+- remember that most forces are repulsive forces
+- `@threads` may help for larger graphs
+
+## Homework 3
+
+- We will learn to handle ugly and hairy data.
+- The last lecture will go over the methodology, but you can try it earlier.
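A rough sketch of the loop-reordering and `@threads` experiments listed on the "Julia tools" slide, which may also be a useful starting point for the Homework 2 bonus. The function names (`matmul_ijk!`, `matmul_jki!`, `matmul_threaded!`) are made up for this illustration and are not part of any package; the sketch assumes plain dense `Matrix` inputs with matching sizes.

```julia
using Base.Threads: @threads

# Naive order: the innermost loop runs over k, so A[i, k] is read with a
# large stride (Julia arrays are column-major).
function matmul_ijk!(C, A, B)
    fill!(C, zero(eltype(C)))
    for i in axes(A, 1), j in axes(B, 2), k in axes(A, 2)
        C[i, j] += A[i, k] * B[k, j]
    end
    return C
end

# Reordered: the innermost loop runs over i, so C[i, j] and A[i, k] are
# both read contiguously, which is friendlier to the cache and to SIMD.
function matmul_jki!(C, A, B)
    fill!(C, zero(eltype(C)))
    for j in axes(B, 2), k in axes(A, 2)
        b = B[k, j]
        @simd for i in axes(A, 1)
            C[i, j] += A[i, k] * b
        end
    end
    return C
end

# Threaded: each thread works on whole columns of C, so no two threads
# ever write to the same element.
function matmul_threaded!(C, A, B)
    fill!(C, zero(eltype(C)))
    @threads for j in axes(B, 2)
        for k in axes(A, 2)
            b = B[k, j]
            @simd for i in axes(A, 1)
                C[i, j] += A[i, k] * b
            end
        end
    end
    return C
end

# "Smaller floats": Float32 halves the memory traffic compared to Float64.
A = rand(Float32, 200, 150)
B = rand(Float32, 150, 100)
C = zeros(Float32, 200, 100)
matmul_threaded!(C, A, B)
```

Start Julia with several threads (for example `julia -t 4`) to see any effect from `@threads`. Calling `@code_llvm matmul_jki!(C, A, B)` and looking for vector types such as `<8 x float>` is one way to check whether SIMD instructions are generated.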
+
+
+
+# Homework 3
+
+We will simulate a cookie distribution network:
+- there's one central *cookie factory*
+- cookie *transports* move cookies in loads of N cookies
+  - they require 1 cookie to sustain themselves for each batch of cookies transported
+  - N is different for each transport
+- cookie *distribution points* divide incoming cookies among the next paths in the network
+  - each has an exact distribution ratio among the paths
+  - no cookies consumed
+- cookie *munchers* are at the end of the transport chain
+  - each of them munches N cookies per day (again different for each muncher)
+
+
+
+# Homework 3 (Data)
+
+```json
+{ "type": "distribution point",
+  "serves": [
+    { "type": "muncher", "consumption": 3 },
+    { "type": "transport",
+      "capacity": 5,
+      "serves": {
+        "type": "distribution point",
+        "serves": [
+          { "type": "muncher", "consumption": 7 },
+          { "type": "muncher", "consumption": 2 },
+          { "type": "muncher", "consumption": 1 }
+        ],
+        "ratios": [1,1,1]
+      }
+    }
+  ],
+  "ratios": [1,5]
+}
+```
+
+Pretty-printed (one possibility):
+```
+1 -> munch 3
+5 -> transport 5 -> 1 -> munch 7
+                    1 -> munch 2
+                    1 -> munch 1
+```
+
+
+
+# Homework 3 (Assignment)
+
+Tasks:
+- read the cookie network from a JSON file (we'll provide example data, use `JSON.jl`)
+- make a nice data structure to hold this problem, make sure the input is valid
+- make functions that:
+  - find the length of the *longest chain* (by transport "steps") from the factory to a muncher
+  - find out how many cookies the factory needs to produce daily so that *all munchers are fed*
+  - construct a network where all *transports are split in half*, each half with half cookie consumption
+  - find out *how many cookies are wasted* by being routed to munchers who can't eat them
+  - construct a network where the *distribution points are balanced* so that no cookies get wasted
+  - *BONUS: print the network nicely*
+- for simplicity, data structures and functions may be recursive
+  - performance optimization _is not_ a goal
+  - nice short code _is_ a
goal diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/cpu.png b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/cpu.png new file mode 100644 index 0000000000000000000000000000000000000000..65bf08083363464aa64e872580cb5c34fabb22ab Binary files /dev/null and b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/cpu.png differ diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/favicon.ico b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/favicon.ico new file mode 100644 index 0000000000000000000000000000000000000000..5f340eacbd179e33bf4d529bec139e5fdc1b8b43 Binary files /dev/null and b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/favicon.ico differ diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/gpu.png b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/gpu.png new file mode 100644 index 0000000000000000000000000000000000000000..c11a53d071e4072a2eee6ccbf357c0c3168b5584 Binary files /dev/null and b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/gpu.png differ diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/julia.svg b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/julia.svg new file mode 100644 index 0000000000000000000000000000000000000000..73d8f42f3b312973a9a369c0e4f712e9f92a5537 --- /dev/null +++ b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/julia.svg @@ -0,0 +1 @@ +<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 153.14 98.64"><defs><style>.cls-1{fill:#1a1a1a}.cls-2{fill:#4d64ae}.cls-3{fill:#ca3c32}.cls-4{fill:#9259a3}.cls-5{fill:#399746}</style></defs><title>Asset 2</title><g id="Layer_2" data-name="Layer 2"><g id="Layer_1-2" data-name="Layer 1"><g id="layer1"><g id="g3855"><g id="g945"><g id="g984"><g id="g920"><path id="path3804" d="M93.14,80.94h-13V21.13l13-3.58Z" class="cls-1"/><g id="g898"><g id="g893"><path id="path19" d="M22.17,36.33a8.9,8.9,0,1,1,8.9-8.9A8.91,8.91,0,0,1,22.17,36.33Z" class="cls-2"/></g><path id="path3819" 
d="M29.14,80.83A26.48,26.48,0,0,1,27.83,90a12.12,12.12,0,0,1-3.62,5.4A12.33,12.33,0,0,1,18.57,98a36.64,36.64,0,0,1-7.32.67,22.47,22.47,0,0,1-4.81-.47A13,13,0,0,1,2.9,96.93,6,6,0,0,1,.76,95.07,3.62,3.62,0,0,1,0,92.88,4.26,4.26,0,0,1,1.59,89.5a6.47,6.47,0,0,1,4.33-1.35,5,5,0,0,1,1.87.32,6,6,0,0,1,1.43.79,12,12,0,0,1,1.16,1.07c.31.4.59.77.83,1.12A7.58,7.58,0,0,0,12.72,93a2.3,2.3,0,0,0,1.15.4,1.85,1.85,0,0,0,1-.28,2,2,0,0,0,.71-1,7.18,7.18,0,0,0,.4-1.91,23.12,23.12,0,0,0,.16-3.06V40.48l13-3.58Z" class="cls-1"/></g><path id="path3802" d="M48.14,37.94V68a6.14,6.14,0,0,0,.47,2.39A6.45,6.45,0,0,0,50,72.24a7,7,0,0,0,2,1.27,6.12,6.12,0,0,0,2.4.48,4.2,4.2,0,0,0,1.61-.4,8.42,8.42,0,0,0,1.8-1.12,13.27,13.27,0,0,0,1.81-1.66,12.92,12.92,0,0,0,1.61-2.11V37.94h13v43h-13v-4a22.47,22.47,0,0,1-5.43,3.53,13.62,13.62,0,0,1-5.59,1.28,16.52,16.52,0,0,1-5.9-1,15.59,15.59,0,0,1-4.76-2.89,13.56,13.56,0,0,1-3.17-4.28,12.41,12.41,0,0,1-1.15-5.29V37.94Z" class="cls-1"/><g id="g905"><g id="g890"><path id="path13" d="M105.79,36.33a8.9,8.9,0,1,1,8.91-8.9A8.91,8.91,0,0,1,105.79,36.33Z" class="cls-3"/><path id="path25" d="M127.18,36.33a8.9,8.9,0,1,1,8.91-8.9A8.91,8.91,0,0,1,127.18,36.33Z" class="cls-4"/><path id="path31" d="M116.49,17.8a8.9,8.9,0,1,1,8.9-8.9,8.89,8.89,0,0,1-8.9,8.9Z" class="cls-5"/></g><path id="path3823" d="M100.14,40.6l13-3.58V80.94h-13Z" class="cls-1"/></g><path id="path3808" 
d="M140.14,58.77a37.64,37.64,0,0,0-3.77,1.87,21.89,21.89,0,0,0-3.46,2.3,12.77,12.77,0,0,0-2.55,2.67,5.12,5.12,0,0,0-1,2.94,8.53,8.53,0,0,0,.32,2.34,7,7,0,0,0,.87,1.91,5.15,5.15,0,0,0,1.23,1.27,2.67,2.67,0,0,0,1.51.48,6.3,6.3,0,0,0,3.18-1,41.31,41.31,0,0,0,3.62-2.47Zm13,22.17h-13V77.52c-.71.61-1.42,1.17-2.11,1.67a14.2,14.2,0,0,1-2.3,1.35,13.56,13.56,0,0,1-2.82.88,19.75,19.75,0,0,1-3.78.31,16,16,0,0,1-5.33-.83,12.23,12.23,0,0,1-4-2.31,10.23,10.23,0,0,1-2.51-3.53,11,11,0,0,1-.87-4.37,10.27,10.27,0,0,1,.91-4.42,13.11,13.11,0,0,1,2.55-3.57,19.36,19.36,0,0,1,3.77-2.86,40.26,40.26,0,0,1,4.65-2.31c1.67-.69,3.4-1.32,5.17-1.91l5.25-1.71,1.43-.31V49.34a11.91,11.91,0,0,0-.44-3.45,5.82,5.82,0,0,0-1.15-2.31,4,4,0,0,0-1.79-1.31,6.6,6.6,0,0,0-2.34-.4,7.38,7.38,0,0,0-2.59.4,4.37,4.37,0,0,0-1.67,1.11,3.94,3.94,0,0,0-.91,1.59,6.52,6.52,0,0,0-.28,2,9.51,9.51,0,0,1-.28,2.35,4.85,4.85,0,0,1-.91,2A4.47,4.47,0,0,1,126,52.6a6.84,6.84,0,0,1-2.9.52,7.51,7.51,0,0,1-2.51-.4,6.16,6.16,0,0,1-1.91-1.15,6,6,0,0,1-1.27-1.75,5.59,5.59,0,0,1-.44-2.18,6.42,6.42,0,0,1,1.51-4.1,13.16,13.16,0,0,1,4.06-3.3,23.45,23.45,0,0,1,5.92-2.14,31.07,31.07,0,0,1,7.12-.8,32.21,32.21,0,0,1,7.87.84,16.37,16.37,0,0,1,5.49,2.34,9.55,9.55,0,0,1,3.18,3.66,10.91,10.91,0,0,1,1,4.81Z" class="cls-1"/></g></g></g></g></g></g></g></svg> \ No newline at end of file diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/maze.jpg b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/maze.jpg new file mode 100644 index 0000000000000000000000000000000000000000..5458fac9856d8e1b3b2ef539cc333ee198ca06a7 Binary files /dev/null and b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/maze.jpg differ diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/threads.jpeg b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/threads.jpeg new file mode 100644 index 0000000000000000000000000000000000000000..63f6095b1dfbd40611ffaa59bf7e2fbc9e5d64eb Binary files /dev/null and 
b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/threads.jpeg differ
diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/tiling.jpeg b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/tiling.jpeg
new file mode 100644
index 0000000000000000000000000000000000000000..30972f56bf4519fe2aa0503260d9dcf417aaa7e0
Binary files /dev/null and b/2023/2023-03-21_ProgrammingWithJulia-5/slides/img/tiling.jpeg differ
diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/index.md b/2023/2023-03-21_ProgrammingWithJulia-5/slides/index.md
new file mode 100644
index 0000000000000000000000000000000000000000..ff9a3e71565b8cb9716e45969889a7a70919ad4e
--- /dev/null
+++ b/2023/2023-03-21_ProgrammingWithJulia-5/slides/index.md
@@ -0,0 +1,28 @@
+
+# Programming with Julia
+
+## March 2023
+
+<div style="top: 6em; left: 0%; position: absolute;">
+  <img src="theme/img/lcsb_bg.png">
+</div>
+
+<div style="top: 1em; left: 60%; position: absolute;">
+  <img src="slides/img/julia.svg" height="200px">
+  <h1 style="margin-top:3ex; margin-bottom:3ex;">5: Performance and parallelism</h1>
+  <h4>
+    Miroslav Kratochvíl<br>
+    Laurent Heirendt<br>
+    LCSB, DSSE<br>
+  </h4>
+</div>
+
+<link rel="stylesheet" href="https://lcsb-biocore.github.io/icons-mirror/twemoji-amazing.css">
+<style>
+  code {border: 2pt dotted #f80; padding: .4ex; border-radius: .7ex; color:#444; }
+  .reveal pre code {border: 0; font-size: 18pt; line-height:27pt;}
+  em {color: #e02;}
+  li {margin-bottom: 1ex;}
+  div.leader {font-size:400%; line-height:120%; font-weight:bold; margin: 1em;}
+  section {padding-bottom: 10em;}
+</style>
diff --git a/2023/2023-03-21_ProgrammingWithJulia-5/slides/list.json b/2023/2023-03-21_ProgrammingWithJulia-5/slides/list.json
new file mode 100644
index 0000000000000000000000000000000000000000..a410370cee33a492962c23d951bac478410a08c5
--- /dev/null
+++ b/2023/2023-03-21_ProgrammingWithJulia-5/slides/list.json
@@ -0,0 +1,4 @@
+[
+  { "filename": "index.md" },
+  { "filename": "5a-para.md" }
+]