Commit 9c2278b0 authored by Miroslav Kratochvil

remove the stuff we didn't make the first time

parent 4ff38b4e
2 merge requests: !160 "[julia] 2nd lecture", !159 "[julia 1] remove the stuff we didn't make the first time"
<div class=leader>
<i class="twa twa-bar-chart"></i>
<i class="twa twa-blue-book"></i>
<i class="twa twa-computer-disk"></i>
<i class="twa twa-chart-increasing"></i><br>
Working with data
</div>
# What do we usually need?
- we need a comfortable abstraction over "long" tabular data → *data frames*
- we need to get the data in and out → *IO functions*
- we need to make pictures → *plotting packages*
# Writing files
All at once:
```julia
write("file.txt", "What a string!\n")
write("data.bin", UInt32[1,2,3])
```
By parts (with streaming, etc.):
```julia
f = open("file.txt", "w")
println(f, "This is my string!")
# ...
close(f)
```
Better:
```julia
open("file.txt", "a") do f
    println(f, "The string again!")
    # ...
end
```
# Reading files
All at once:
```julia
read("file.txt", String)
open(x -> collect(readeach(x, UInt32)), "data.bin", "r")
```
Process all lines:
```julia
for line in eachline("file.txt")
    println("got a line: " * line)
end
```
Manually:
```julia
open("inputs.txt", "r") do io
    a = parse(Int, readline(io))
    b = parse(Int, readline(io))
    println("$a * $b = $(a*b)")
end
```
Memory-consuming alternative: `readlines`
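A minimal sketch of the difference, assuming a small temporary file made up for the demonstration (the helper names `count_chars_streaming` and `count_chars_at_once` are hypothetical). `eachline` hands out one line at a time, while `readlines` builds the whole `Vector{String}` first:

```julia
# Temporary demo file; its contents are invented for this example.
path, io = mktemp()
write(io, "1\n22\n333\n")
close(io)

# Streaming: only the current line needs to be held in memory.
count_chars_streaming(path) = sum(length, eachline(path); init = 0)

# All at once: readlines materializes every line before we start counting.
count_chars_at_once(path) = sum(length, readlines(path); init = 0)

println(count_chars_streaming(path))
println(count_chars_at_once(path))
```

Both give the same answer; the difference only starts to matter with files that do not fit into memory comfortably.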
# Extremely useful: string interpolation
Pasting strings manually is _boring_.
```julia
"String contains $val and $otherval."
do_something("input$i.txt", "output$i.txt")
"$str <<-->> $(reverse(str))"
```
The conversion to actual strings is done using `print()`, which falls back to `show()` unless the type has its own `print` method. (Customize by overloading `Base.show`!)
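A small sketch of such a customization. The type `Celsius` is hypothetical and exists only for illustration; for a type without its own `print` method, interpolation ends up in the `Base.show` method defined here:

```julia
# Hypothetical type, for illustration only.
struct Celsius
    degrees::Float64
end

# Two-argument show: used by print, string, and therefore by "$..." interpolation.
Base.show(io::IO, t::Celsius) = print(io, t.degrees, " deg C")

temp = Celsius(21.5)
msg = "It is $temp outside."
println(msg)
```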
# Reading and writing structured data
```julia
using DelimitedFiles
mtx = readdlm("matrix.tsv", '\t', Int, '\n') # returns a Matrix{Int}
writedlm("matrix.csv", mtx, ',') # the data goes between the file name and the delimiter
```
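The calls above can be tried as a round trip. This is a sketch using a temporary file and a made-up matrix, not the lecture's data:

```julia
using DelimitedFiles

mtx = [1 2 3; 4 5 6]              # made-up data for the demonstration
path = tempname() * ".tsv"        # temporary file name

writedlm(path, mtx, '\t')         # one row per line, tab-separated
back = readdlm(path, '\t', Int, '\n')

println(back == mtx)
println(typeof(back))
```

Passing the element type (`Int`) makes `readdlm` return a `Matrix{Int}` instead of guessing the type from the file contents.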
# Data frames
Package `DataFrames.jl` provides a work-alike of the data frames from
other environments (pandas, `data.frame`, tibbles, ...)
```julia
using DataFrames
mydata = DataFrame(id = [32,10,5], text = ["foo", "bar", "baz"])
mydata.text
mydata.text[mydata.id .>= 10]
```
Main differences from `Matrix`: *columns are labeled and each column has its own type*; entries may also be missing
# DataFrames
Popular way of importing data:
```julia
using CSV
df = CSV.read("database.csv", DataFrame) # can also do a Matrix
CSV.write("backup.csv", df)
```
Popular among computer users:
```julia
using XLSX
x = XLSX.readxlsx("important_results.xlsx") # XLSX.jl reads .xlsx only, not legacy .xls
XLSX.sheetnames(x)
DataFrame(XLSX.gettable(x["Results sheet"])...) # on XLSX.jl 0.8 and later, drop the `...`
```
<small>(Please do not export data to XLSX.)</small>
# Plotting
<center>
<img src="slides/img/unicodeplot.png" width="40%" />
</center>
# Usual plotting packages
- `UnicodePlots.jl` (useful in terminal, https://github.com/JuliaPlots/UnicodePlots.jl)
- `Plots.jl` (matplotlib workalike, works with Plotly)
- `GLMakie.jl` (interactive plots)
- `CairoMakie.jl` (PDF export of Makie plots)
Native `ggplot` and `cowplot` ports are in development.
Gallery available: https://makie.juliaplots.org
# Exercise: write a median function
Approaches (let's do at least 2)
- sort and pick
- quick-median (a.k.a. quickselect: a lightweight sort-and-pick, partitioning as in quicksort)
- approximation-improving-median
  - this parallelizes well
  - at which point is this faster than quick-median?
  - did we lose any property?
- what about median strings?
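The first two approaches could look roughly like this. It is a sketch, not the reference solution: the names `median_sort`, `kth_smallest` and `median_quick` are hypothetical, averaging the two middle elements for even lengths is an assumed convention, and the input is assumed to contain no `NaN`s.

```julia
# Sort and pick: O(n log n); sort() works on a copy of the input.
function median_sort(xs::Vector{<:Real})
    isempty(xs) && throw(ArgumentError("median of an empty vector"))
    s = sort(xs)
    n = length(s)
    mid = (n + 1) ÷ 2
    return isodd(n) ? float(s[mid]) : (s[mid] + s[mid + 1]) / 2
end

# Quickselect: k-th smallest element (1-based), expected O(n).
# Partition around a random pivot and keep only the part that contains position k.
function kth_smallest(xs::Vector{<:Real}, k::Int)
    1 <= k <= length(xs) || throw(ArgumentError("k out of range"))
    v = xs
    while true
        pivot = v[rand(1:length(v))]
        smaller = filter(<(pivot), v)
        n_equal = count(==(pivot), v)
        if k <= length(smaller)
            v = smaller
        elseif k <= length(smaller) + n_equal
            return pivot
        else
            k -= length(smaller) + n_equal
            v = filter(>(pivot), v)
        end
    end
end

function median_quick(xs::Vector{<:Real})
    isempty(xs) && throw(ArgumentError("median of an empty vector"))
    n = length(xs)
    lo = kth_smallest(xs, (n + 1) ÷ 2)
    return isodd(n) ? float(lo) : (lo + kth_smallest(xs, n ÷ 2 + 1)) / 2
end

println(median_sort([1, 2, 3, 4]))
println(median_quick([2, 6, 8]))
```

This quickselect variant allocates new vectors in every round to stay short; an in-place partition would avoid that at the cost of more index bookkeeping.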
# Homework 0
Make a package from the "trivial" median algorithm.
- Use Julia's `]generate`.
- Have a look at how unit tests are done.
- Make sure this unit test works:
```julia
using Test # provides @testset and @test
using MagicMedian
@testset "median functionality test" begin
    @test simplemedian([1,2,3,4]) >= 2
    @test simplemedian([1,2,3,4]) <= 3
    @test simplemedian([2,6,8]) == 6
end
```
- Zip the package (or `.tar.gz` it or whatever) and upload it to Moodle
How to run unit tests?
- place the unit test code into `test/runtests.jl` in the package directory
- run `]test MagicMedian`
# Exercise: let's play a maze game
- Read a maze from a file (let's have `0`s for a corridor and `1`s for a wall)
- Draw it to console using `.` and `#`, add a border.
- Make a function that annotates the whole maze based on how many (axis-aligned) steps a person (maze inhabitant) needs to take from some point
  - let's use the slow <i class="twa twa-water-wave"></i>wave<i class="twa twa-water-wave"></i> algorithm
- Use `UnicodePlots` to plot interesting stuff
  - shortest path length distribution
  - shortest-path-length heatmap of the maze
  - "most blocking walls" (how much time to reach the farther side of the wall could be gained by removing the wall?)
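The first three steps might be sketched as follows. Everything here rests on assumptions: the input format (one maze row per line, written as `0`/`1` characters without separators), the border characters, `-1` as the marker for walls and unreachable tiles, and the reading of the "wave" algorithm as repeated full sweeps over the array. The names `parse_maze`, `draw_maze` and `wave_distances` are hypothetical.

```julia
# Parse rows of '0'/'1' characters into a Matrix{Int} (assumed format: one row per line).
function parse_maze(lines)
    rows = [[parse(Int, c) for c in line] for line in lines]
    return permutedims(reduce(hcat, rows))
end

# Draw corridors as '.' and walls as '#', with a simple border.
function draw_maze(io::IO, maze::Matrix{Int})
    border = "+" * "-"^size(maze, 2) * "+"
    println(io, border)
    for r in axes(maze, 1)
        println(io, "|", join(maze[r, c] == 0 ? '.' : '#' for c in axes(maze, 2)), "|")
    end
    println(io, border)
end

# Slow "wave": sweep the whole array again and again, pushing distances from
# every already-reached corridor tile to its 4 neighbours, until nothing changes.
function wave_distances(maze::Matrix{Int}, start::Tuple{Int,Int})
    dist = fill(-1, size(maze))   # -1 means wall or not reached
    dist[start...] = 0
    changed = true
    while changed
        changed = false
        for r in axes(maze, 1), c in axes(maze, 2)
            (maze[r, c] == 0 && dist[r, c] >= 0) || continue
            for (dr, dc) in ((1, 0), (-1, 0), (0, 1), (0, -1))
                nr, nc = r + dr, c + dc
                checkbounds(Bool, maze, nr, nc) || continue
                maze[nr, nc] == 0 || continue
                if dist[nr, nc] < 0 || dist[nr, nc] > dist[r, c] + 1
                    dist[nr, nc] = dist[r, c] + 1
                    changed = true
                end
            end
        end
    end
    return dist
end

# From a file this would be parse_maze(eachline("maze.txt")); the file name is hypothetical.
maze = parse_maze(["0001", "1101", "0000"])
draw_maze(stdout, maze)
dist = wave_distances(maze, (1, 1))
display(dist)
```

The resulting matrix can be passed to `UnicodePlots` for the histogram and heatmap items above.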
# Exercise: Tricky questions about the maze
What is the slowest part of the "wave" solution?
How can we make it faster?
# Homework 1
Make a *faster* version of the maze distance computation.
- any definition of "faster solution" works; keep it terse
- recommended way:
  - Use `DataStructures.jl` or any other Julia package to get a priority queue working
  - Implement some simple version of Dijkstra's algorithm
- alternatively, try to optimize our naive array algorithm
- measure your speedup with `@time` on mazes of at least `128*128` tiles
- data for testing will be available (probably on Moodle)
Submission:
- wrap your code into a package `MagicMaze`
- include a function `maze_distance_map(x::Int, y::Int)::Matrix{Int}`
- write a unit test that demonstrates the result
- pack the package and upload it to Moodle
@@ -3,7 +3,5 @@
   { "filename": "0a-overview.md" },
   { "filename": "1a-intro.md" },
   { "filename": "1b-bootstrap.md" },
-  { "filename": "1c-language.md" },
-  { "filename": "1d-io.md" },
-  { "filename": "1e-exercises.md" }
+  { "filename": "1c-language.md" }
 ]