using LinearAlgebra # in-built
using Random # in-built
using Statistics # in-built
using Plots
using DataFrames
using CSV
Unit 3 - Numerics and Monte Carlo
In this unit we explore multiple topics from numerical mathematics and simulation.
Some of the basic packages we use are:
We also use other packages along the way.
1 Differential Equations
Julia’s differential equations package, DifferentialEquations.jl, is powerful and sits at the forefront of the the much larger, SciML ecosystem. See the DifferentialEquations.jl docs.
As an introductory example, here is a simple (linear) physical system.
\[ \frac{d}{dt}x(t) = A x(t). \]
where the matrix \(A\) is,
= 1.2, 0.3, 2.0 # some constants
k, b, M = [0 1;
A -k/M -b/M]
2×2 Matrix{Float64}:
0.0 1.0
-0.6 -0.15
Lets first solve this “manually” with a matrix exponential:
= [8., 0.0] # initial condition
init_x = 50.0
t_end = 0:0.1:t_end
t_range
= [exp(A*t)*init_x for t in t_range] manual_sol
501-element Vector{Vector{Float64}}:
[8.0, 0.0]
[7.976131477231489, -0.4759416594600485]
[7.9051424045917065, -0.9419645483609713]
[7.788156860178615, -1.3954424545440751]
[7.626555969115729, -1.8338629358935703]
[7.421965855822404, -2.2548405811977332]
[7.1762443173550725, -2.6561293174622302]
[6.891466316183785, -3.035633704615905]
[6.569908396330972, -3.391419165455919]
[6.214032131660425, -3.721721105793306]
⋮
[0.19889267077495587, -0.03574512200121027]
[0.19475497135258296, -0.04693949404546087]
[0.1895196445252878, -0.05768780813280151]
[0.18323419379537167, -0.06793272857910464]
[0.17595166602713536, -0.07762076964073986]
[0.16773025316892345, -0.08670255655678863]
[0.15863286927994397, -0.0951330582352686]
[0.14872670574716249, -0.1028717906205154]
[0.13808276766445346, -0.10988298996773838]
Now using DifferentialEquations.jl
:
using DifferentialEquations
linear_RHS(x, Amat, t) = Amat*x
= ODEProblem(linear_RHS, init_x, (0,t_end), A)
prob = solve(prob) sol
retcode: Success
Interpolation: 3rd order Hermite
t: 58-element Vector{Float64}:
0.0
0.00020830729492146818
0.00229138024413615
0.023122109736282967
0.1335966096654432
0.36577241373307684
0.6995799181070914
1.127561697405902
1.6594365833770348
2.2581447011496483
⋮
42.19249397072834
43.31781573881729
44.3258472053866
45.440901129815444
46.41599112001244
47.56207282437716
48.524470046796786
49.655495306375656
50.0
u: 58-element Vector{Vector{Float64}}:
[8.0, 0.0]
[7.999999895860455, -0.00099985939035268]
[7.999987400430654, -0.010996729462204742]
[7.998718399683977, -0.11078795825017768]
[7.957487282908296, -0.6337486075041383]
[7.686793838246466, -1.6856482354542235]
[6.892741024708577, -3.034087776569335]
[5.29050490801111, -4.370269514081534]
[2.6878641968695924, -5.265604384991069]
[-0.49599425280065423, -5.18014217637776]
⋮
[0.17831355229755946, -0.23598212217525907]
[-0.09646290644067074, -0.22143258346715539]
[-0.25661202733650557, -0.08354166294122825]
[-0.24694669923436294, 0.09468909960471991]
[-0.10495115521536133, 0.18043758900386436]
[0.09769849595358904, 0.15051526341051585]
[0.19571707984067568, 0.04621306354585156]
[0.17120870719801237, -0.08299036332452805]
[0.13771692291702747, -0.1100552504192182]
Plotting the solution:
= plot(first.(manual_sol), last.(manual_sol),
p1 =:blue, label="Manual trajectory")
c= scatter!(first.(sol.u), last.(sol.u),
p1 =:red, ms = 5, msw=0, label="DiffEq package")
c= scatter!([init_x[1]], [init_x[2]],
p1 =:black, ms=10, label="Initial state", xlims=(-7,9), ylims=(-9,7),
c=:equal, xlabel="Displacement", ylabel="Velocity")
ratio= plot(t_range, first.(manual_sol),
p2 =:blue, label="Manual trajectory")
c= scatter!(sol.t, first.(sol.u),
p2 =:red, ms = 5, msw=0, label="DiffEq package")
c= scatter!([0], [init_x[1]],
p2 =:black, ms=10, label="Initial state", xlabel="Time",
c="Displacement")
ylabelplot(p1, p2, size=(800,400), legend=:topright)
Here is a non-linear compartmental epidemics (SEIR) model:
= 0.25, 0.2, 0.1
β, δu, γ = 0.025
initialInfect println("R0 = ", β/γ)
= [1-initialInfect, 0.0, initialInfect, 0.0]
init_x = 100.0
tEnd
RHS(x,parms,t) = [ -β*x[1]*x[3],
*x[1]*x[3] - δu*x[2],
β*x[2] - γ*x[3],
δu*x[3] ]
γ
= ODEProblem(RHS, init_x, (0,tEnd), 0)
prob = solve(prob)
sol println("Final infected proportion= ", sol.u[end][4])
plot(sol.t,((x)->x[1]).(sol.u),label = "Susceptible", c=:green)
plot!(sol.t,((x)->x[2]).(sol.u),label = "Exposed", c=:blue)
plot!(sol.t,((x)->x[3]).(sol.u),label = "Infected", c=:red)
plot!(sol.t,((x)->x[4]).(sol.u),label = "Removed", c=:yellow,
= "Time", ylabel = "Proportion",legend = :top) xlabel
R0 = 2.5
Final infected proportion= 0.8622039418944475
2 Simulation of Continuous Time Markov Chains
A similar epidemics model, now stochastic:
using Distributions
function simulateSIRDoobGillespie(β ,δu, γ, I0, M, T)
= 0.0, M-I0, 0, I0, 0
t, S, E, I, R = [0.0], [S], [E], [I], [R]
tValues, sValues, eValues, iValues, rValues while t<T
= β*I*S
infectionRate = δu*E
symptomRate = γ*I
removalRate = infectionRate + symptomRate + removalRate
totalRate = [infectionRate, symptomRate, removalRate]/totalRate
probs += rand(Exponential(1/totalRate))
t = rand()
u if u < probs[1]
-= 1; E += 1
S elseif u < probs[1] + probs[2]
-=1; I += 1
E else
-= 1; R += 1
I end
push!(tValues, t)
push!(sValues, S); push!(eValues, E); push!(iValues, I); push!(rValues, R)
== 0 && break
I end
return [tValues, sValues, eValues, iValues, rValues]
end
simulateSIRDoobGillespie (generic function with 1 method)
Now le’s simulate it:
Random.seed!(0)
= 0.25, 0.4, 0.1
β, δu, γ = 0.025
initialInfect = 1000
M = floor(Int, initialInfect*M)
I0 = 30
N
= simulateSIRDoobGillespie(β/M,δu,γ,I0,M,Inf)
tV,sV,eV,iV,rV = tV[end]
lastT
= [simulateSIRDoobGillespie(β/M,δu,γ,I0,M,Inf)[5][end] for _ in 1:N]/M; finals
And plot it:
= plot(tV,sV/M,label = "Susceptible", c=:green)
p1 plot!(tV,eV/M,label = "Exposed", c=:blue)
plot!(tV,iV/M,label = "Infected",c=:red)
plot!(tV,rV/M,label = "Removed", c=:yellow,
= "Time", ylabel = "Proportion",
xlabel = :topleft, xlim = (0,lastT*1.05))
legend scatter!(lastT*1.025*ones(N),finals, c = :yellow,label= "Final Infected")
A more serious package for simulation of such “jump processes” is JumpProcesses.jl. See also the docs. Related is the Catalyst.jl package, focusing on chemical reactions.
3 Linear Algebra
Julia provides strong in-built support for Linear Algebra, with syntax somewhat similar to MATLAB (see Julia syntax comparison with MATLAB). See the Linear Algebra Documentation.
The in-built linear algebra capabilities are primarily powered by the BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra Package) libraries, which are industry-standard, highly optimized libraries for performing core matrix and vector operations as well as advanced factorizations and eigenvalue problems.
Julia ships with its own OpenBLAS build by default, but users can swap in other BLAS/LAPACK backends such as MKL.jl for Intel’s Math Kernel Library, enabling optimized performance on compatible hardware.
See also the JuliaLinearAlgebra organization.
Let’s get going with basic linear algebra.
= [1 2 3;
A 4 1 6;
7 8 1]
@show det(A)
@show tr(A)
@show eigvals(A)
factorize(A)
det(A) = 104.0
tr(A) = 3
eigvals(A) = [-6.214612641961068, -1.5540265964847833, 10.768639238445843]
LU{Float64, Matrix{Float64}, Vector{Int64}}
L factor:
3×3 Matrix{Float64}:
1.0 0.0 0.0
0.571429 1.0 0.0
0.142857 -0.24 1.0
U factor:
3×3 Matrix{Float64}:
7.0 8.0 1.0
0.0 -3.57143 5.42857
0.0 0.0 4.16
The I
matrix can be used freely in most situations:
+ 100I A
3×3 Matrix{Int64}:
101 2 3
4 101 6
7 8 101
Note that Julia matrices are column oriented.
= collect(1:4) a
4-element Vector{Int64}:
1
2
3
4
= reshape(a, 2, 2) A
2×2 Matrix{Int64}:
1 3
2 4
In some circumstances thinking about column orientation vs. row orientation may affect performance:
using BenchmarkTools
function sum_rows(data, n, m)
= zero(eltype(data))
s for i in 1:n
for j in 1:m
+= data[i,j]
s end
end
return s
end
function sum_columns(data, n, m)
= zero(eltype(data))
s for j in 1:m
for i in 1:n
+= data[i,j]
s end
end
return s
end
Random.seed!(0)
= 500, 10^6
n, m = rand(n,m)
data
#dry run for precompilation
sum_columns(data, n, m)
sum_rows(data, n, m)
println("Summing with the column iteration moving fastest")
@btime sum_columns(data, n, m)
println("Summing with the row iteration moving fastest")
@btime sum_rows(data, n, m);
Summing with the column iteration moving fastest
389.419 ms (1 allocation: 16 bytes)
Summing with the row iteration moving fastest
2.315 s (1 allocation: 16 bytes)
3.1 A Variety of Ways for doing least-squares
Here is a very simple (X,Y) dataset.
= "../data/L1L2data.csv" |> CSV.File |> DataFrame data
Row | X | Y |
---|---|---|
Float64 | Float64 | |
1 | 2.9 | 3.0 |
2 | 4.3 | 2.9 |
3 | 5.2 | 5.3 |
4 | 6.9 | 7.8 |
5 | 8.3 | 5.5 |
Let’s fit a regression model with the GLM.jl package:
using GLM
= glm(@formula(Y ~ X), data, Normal())
modelK = coef(modelK) b0K, b1K
2-element Vector{Float64}:
0.9449358690844767
0.7164971251658556
So we see the constant (intercept) term and the slope term.
Now let’s retrieve the same least squares estimates in many other ways:
= data[:,1], data[:,2]
xVals, yVals = length(xVals)
n = [ones(n) xVals]
A
# Approach A
= mean(xVals),mean(yVals)
xBar, yBar = ones(n)'*(xVals.-xBar).^2 , dot(xVals.-xBar,yVals.-yBar)
sXX, sXY = sXY/sXX
b1A = yBar - b1A*xBar
b0A
# Approach B
= cor(xVals,yVals)*(std(yVals)/std(xVals))
b1B = yBar - b1B*xBar
b0B
# Approach C
= A'A \ A'yVals
b0C, b1C
# Approach D
= inv(A'*A)*A'
Adag = Adag*yVals
b0D, b1D
# Approach E
= pinv(A)*yVals
b0E, b1E
# Approach F
= A\yVals
b0F, b1F
# Approach G
= qr(A)
F = F.Q, F.R
Q, R = (inv(R)*Q')*yVals
b0G, b1G
# Approach H
= svd(A)
F = F.V, Diagonal(1 ./ F.S), F.U'
V, Sp, Us = (V*Sp*Us)*yVals
b0H, b1H
# Approach I
= 0.002, 10^-6.
η, eps = [0,0], [1,1]
b, bPrev while norm(bPrev-b) >= eps
global bPrev = b
global b = b - η*2*A'*(A*b - yVals)
end
= b[1], b[2]
b0I, b1I
# Approach J
= lm(@formula(Y ~ X), data)
modelJ = coef(modelJ); b0J, b1J
We can see that all these ways yield the same solution:
println(round.([b0A,b0B,b0C,b0D,b0E,b0F,b0G,b0H,b0I,b0J,b0K], digits=3))
println(round.([b1A,b1B,b1C,b1D,b1E,b1F,b1G,b1H,b1I,b1J,b1K], digits=3))
[0.945, 0.945, 0.945, 0.945, 0.945, 0.945, 0.945, 0.945, 0.944, 0.945, 0.945]
[0.716, 0.716, 0.716, 0.716, 0.716, 0.716, 0.716, 0.716, 0.717, 0.716, 0.716]
3.2 Supporting linear algebra packages
Julia provides quite a-lot of linear algebra out of the box. Yet here are some additional packages that are sometimes useful:
- The
SparseArrays
library is in-built. Here is the documentation. - PDMats.jl for positive definite matrices (often used for covariance matrices).
- LinearSolve.jl provides additional fast implementations of linear solving algorithms. See this discussion.
- GenericLinearAlgebra.jl provides additional functionality.
4 Monte Carlo
Here is an example with common random numbers and using RNG (Random Number Generation) objects.
See the Random Numbers section in the Julia docs. Julia uses the Xorshift random number generator as a default. It was Mersenne Twister up to Julia 1.6. Good also to know also about StableRNGs.jl.
using Measures # just helps for plots
function create_path(rng::AbstractRNG, α::Real, n::Int=5000)
= 0.0
x = 0.0
y = Float64[]
xDat = Float64[]
yDat
for _ in 1:n
# random walk
= rand(rng, 1:4)
flip if flip == 1 # left
+= 1
x elseif flip == 2 # up
+= 1
y elseif flip == 3 # right
-= 1
x elseif flip == 4 # down
-= 1
y end
# bias toward upper-right
+= α
x += α
y
# add the result to the output
push!(xDat, x)
push!(yDat, y)
end
return (xDat, yDat)
end
= [0.0, 0.002, 0.004]
alpha_range = (xlabel = "x", ylabel = "y", xlims=(-150, 150), ylims=(-150, 150))
args
#Plot runs with same random numbers (common random numbers)\
= plot(create_path(Xoshiro(27), alpha_range[1]), c = :blue, label = "α=$(alpha_range[1])")
p1 = plot!(create_path(Xoshiro(27), alpha_range[2]), c = :red, label = "α=$(alpha_range[2])")
p1 = plot!(create_path(Xoshiro(27), alpha_range[3]), c = :green, label = "α=$(alpha_range[3])", title = "Same seed", legend = :topright; args...)
p1
#Plot runs with different random numbers
= Xoshiro(27)
rng = plot(create_path(rng, alpha_range[1]), c = :blue, label = "α=$(alpha_range[1])")
p2 = plot!(create_path(rng, alpha_range[2]), c = :red, label = "α=$(alpha_range[2])")
p2 = plot!(create_path(rng, alpha_range[3]), c = :green, label = "α=$(alpha_range[3])", title = "Different seeds", legend = :topright; args...)
p2
plot(p1, p2, size=(800, 400), margin=5mm)
Sometimes simulating with multiple RNGs can be of use:
using LaTeXStrings
= 10^2, 50, 10^3
N, K, M = 0.01:0.01:0.99
lamRange
prn(lambda,rng) = quantile(Poisson(lambda),rand(rng))
zDist(lam) = Uniform(0,2*(1-lam))
rv(lam,rng) = sum([rand(rng,zDist(lam)) for _ in 1:prn(K*lam,rng)])
rv2(lam,rng1,rng2) = sum([rand(rng1,zDist(lam)) for _ in 1:prn(K*lam,rng2)])
mEst(lam,rng) = mean([rv(lam,rng) for _ in 1:N])
mEst2(lam,rng1,rng2) = mean([rv2(lam,rng1,rng2) for _ in 1:N])
function mGraph0(seed)
= MersenneTwister(seed)
singleRng mEst(lam,singleRng) for lam in lamRange]
[end
mGraph1(seed) = [mEst(lam,MersenneTwister(seed)) for lam in lamRange]
mGraph2(seed1,seed2) = [mEst2(lam,MersenneTwister(seed1),
MersenneTwister(seed2)) for lam in lamRange]
argMaxLam(graph) = lamRange[findmax(graph)[2]]
= std([argMaxLam(mGraph0(seed)) for seed in 1:M])
std0 = std([argMaxLam(mGraph1(seed)) for seed in 1:M])
std1 = std([argMaxLam(mGraph2(seed,seed+M)) for seed in 1:M])
std2
println("Standard deviation with no CRN: ", std0)
println("Standard deviation with CRN and single RNG: ", std1)
println("Standard deviation with CRN and two RNGs: ", std2)
plot(lamRange,mGraph0(1987),
=:red, label="No CRN")
cplot!(lamRange,mGraph1(1987),
=:green, label="CRN and one RNG")
cplot!(lamRange,mGraph2(1987,1988),
=:blue, label="CRN and two RNG's", xlims=(0,1),ylims=(0,14),
c=L"\lambda", ylabel = "Mean") xlabel
Standard deviation with no CRN: 0.0355297898384717
Standard deviation with CRN and single RNG: 0.03366721523056788
Standard deviation with CRN and two RNGs: 0.014375822211764607
4.1 An example with Graphs
See the JuliaGraphs organization. We’ll use the Graphs.jl package.
using Graphs, Distributions, StatsBase, Random, Plots, LaTeXStrings
Random.seed!(0)
function createNetwork(edges)
= Graph(maximum(maximum.(edges)))
network for e in edges
add_edge!(network, e[1], e[2])
end
networkend
function uniformRandomEdge(network)
= length.(network.fadjlist)
outDegrees = sample(1:length(outDegrees),Weights(outDegrees))
randI = rand(network.fadjlist[randI])
randJ
randI, randJend
function networkLife(network,source,dest,lambda)
= copy(network)
failureNetwork = 0
t while has_path(failureNetwork, source, dest)
+= rand(Exponential(1/(failureNetwork.ne*lambda)))
t = uniformRandomEdge(failureNetwork)
i, j rem_edge!(failureNetwork, i, j)
end
tend
= 0.5, 1.0
lambda1, lambda2 = [(1,2), (1,3), (2,4), (2,5), (2,3), (3,4), (3,5), (4,5), (4,6), (5,6)]
roads = 1, 6
source, dest = createNetwork(roads)
network = 10^6
N
= [ networkLife(network,source,dest,lambda1) for _ in 1:N ]
failTimes1 = [ networkLife(network,source,dest,lambda2) for _ in 1:N ]
failTimes2
println("Edge Failure Rate = $(lambda1): Mean failure time = ",
mean(failTimes1), " days.")
println("Edge Failure Rate = $(lambda2): Mean failure time = ",
mean(failTimes2), " days.")
stephist(failTimes1, bins=200, c=:blue, normed=true, label=L"\lambda=0.5")
stephist!(failTimes2, bins=200, c=:red, normed=true, label=L"\lambda=1.0",
=(0,5), ylims=(0,1.1), xlabel="Network Life Time", ylabel = "Density") xlims
Edge Failure Rate = 0.5: Mean failure time = 1.4449788306592593 days.
Edge Failure Rate = 1.0: Mean failure time = 0.7233820708726806 days.
4.2 A Structured Discrete Event Simulator
Let’s look at a more complex coding example where we develop discrete event simulation engine.
using DataStructures
import Base: isless
abstract type Event end
abstract type State end
# Captures an event and the time it takes place
struct TimedEvent
::Event
event::Float64
timeend
# Comparison of two timed events - this will allow us to use them in a heap/priority-queue
isless(te1::TimedEvent, te2::TimedEvent) = te1.time < te2.time
"""
new_timed_events = process_event(time, state, event)
Process an `event` at a given `time`, which may read and write to the system `state`. An event
may generate new events, returned as an array of 0 or more new `TimedEvent`s.
"""
function process_event end # This defines a function with zero methods (to be added later)
# Generic events that we can always use
"""
EndSimEvent()
Return an event that ends the simulation.
"""
struct EndSimEvent <: Event end
function process_event(time::Float64, state::State, es_event::EndSimEvent)
println("Ending simulation at time $time.")
return []
end
"""
LogStateEvent()
Return an event that prints a log of the current simulation state.
"""
struct LogStateEvent <: Event end
function process_event(time::Float64, state::State, ls_event::LogStateEvent)
println("Logging state at time $time:")
println(state)
return []
end
process_event (generic function with 2 methods)
"""
The main simulation function gets an initial state and an initial event
that gets things going. Optional arguments are the maximal time for the
simulation, times for logging events, and a call-back function.
"""
function simulate(init_state::State, init_timed_event::TimedEvent
; ::Float64 = 10.0,
max_time::Vector{Float64} = Float64[],
log_times= (time, state) -> nothing)
callback
# The event queue
= BinaryMinHeap{TimedEvent}()
priority_queue
# Put the standard events in the queue
push!(priority_queue, init_timed_event)
push!(priority_queue, TimedEvent(EndSimEvent(), max_time))
for log_time in log_times
push!(priority_queue, TimedEvent(LogStateEvent(), log_time))
end
# initialize the state
= deepcopy(init_state)
state = 0.0
time
# Callback at simulation start
callback(time, state)
# The main discrete event simulation loop - SIMPLE!
while true
# Get the next event
= pop!(priority_queue)
timed_event
# Advance the time
= timed_event.time
time
# Act on the event
= process_event(time, state, timed_event.event)
new_timed_events
# If the event was an end of simulation then stop
if timed_event.event isa EndSimEvent
break
end
# The event may spawn 0 or more events which we put in the priority queue
for nte in new_timed_events
push!(priority_queue,nte)
end
# Callback for each simulation event
callback(time, state)
end
end;
using Distributions, Random
Random.seed!(0)
= 1.8
λ = 2.0
μ
mutable struct QueueState <: State
::Int # If ≥ 1 then server is busy, If = 0 server is idle.
number_in_systemend
struct ArrivalEvent <: Event end
struct EndOfServiceEvent <: Event end
# Process an arrival event
function process_event(time::Float64, state::State, ::ArrivalEvent)
# Increase number in system
+= 1
state.number_in_system = TimedEvent[]
new_timed_events
# Prepare next arrival
push!(new_timed_events,TimedEvent(ArrivalEvent(),time + rand(Exponential(1/λ))))
# If this is the only job on the server
== 1 && push!(new_timed_events,TimedEvent(EndOfServiceEvent(), time + 1/μ))
state.number_in_system return new_timed_events
end
# Process an end of service event
function process_event(time::Float64, state::State, ::EndOfServiceEvent)
# Release a customer from the system
-= 1
state.number_in_system @assert state.number_in_system ≥ 0
return state.number_in_system ≥ 1 ? [TimedEvent(EndOfServiceEvent(), time + 1/μ)] : TimedEvent[]
end
simulate(QueueState(0), TimedEvent(ArrivalEvent(),0.0), log_times = [5.3,7.5])
Logging state at time 5.3:
QueueState(3)
Logging state at time 7.5:
QueueState(3)
Ending simulation at time 10.0.
"""
This function is designed to stitch_steps of a discrete event curve.
"""
function stitch_steps(epochs, values)
= length(epochs)
n = [epochs[1]]
new_epochs = [values[1]]
new_values for i in 2:n
push!(new_epochs, epochs[i])
push!(new_values, values[i-1])
push!(new_epochs, epochs[i])
push!(new_values, values[i])
end
return (new_epochs, new_values)
end;
using Plots
Random.seed!(0)
= Float64[], Int[]
time_traj, queue_traj
function record_trajectory(time::Float64, state::QueueState)
push!(time_traj, time)
push!(queue_traj, state.number_in_system)
return nothing
end
simulate(QueueState(0), TimedEvent(ArrivalEvent(),0.0), max_time = 100.0, callback = record_trajectory)
plot(stitch_steps(time_traj, queue_traj)... ,
= false, xlabel = "Time", ylabel = "Queue size (number in system)" ) label
Ending simulation at time 100.0.
Random.seed!(0)
= 1.8
λ = 2.0
μ
= 0.0
prev_time = 0
prev_state = 0.0
integral
function add_to_integral(time::Float64, state::QueueState)
# Make sure to use the variables above
global prev_time, prev_state, integral
= time - prev_time
diff += prev_state * diff
integral = time
prev_time = state.number_in_system
prev_state
return nothing
end
simulate(QueueState(0), TimedEvent(ArrivalEvent(),0.0), max_time = 10.0^6, callback = add_to_integral)
println("Simulated mean queue length: ", integral / 10^6 )
= λ / μ
ρ = ρ/(1-ρ)*(2-ρ)/2
md1_theory println("Theoretical mean queue length: ", md1_theory)
Ending simulation at time 1.0e6.
Simulated mean queue length: 5.051148493439235
Theoretical mean queue length: 4.950000000000001
5 Some graphical illustrations
Here are a few more graphics/plots using Plots.jl. There is also StatsPlots.jl to keep in mind. Importantly, the other graphics side of Julia is Makie.jl with the Makie Organization. We’ll use this graphics engine in Unit 5. In particular if you like R’s ggplot, or ggplot2, then AlgebraOfGraphics.jl from the Makie world is probably the tool for you. See also this tutorial by PumasAI.
Here is a visualization of several bi-variate distributions that have the same mean vector and covariance matrix, but are different:
Random.seed!(1)
= 10^3
N
= [ 6 4 ;
SigY 4 9]
= [15 ;
muY 20]
= cholesky(SigY).L
A
= [()->rand(Normal()),
rngGens ->rand(Uniform(-sqrt(3),sqrt(3))),
()->rand(Exponential())-1]
()
rv(rg) = A*[rg(),rg()] + muY
= [[rv(r) for _ in 1:N] for r in rngGens]
data
stats(data) = begin
= first.(data),last.(data)
data1, data2 println(round(mean(data1),digits=2), "\t",round(mean(data2),digits=2),"\t",
round(var(data1),digits=2), "\t", round(var(data2),digits=2), "\t",
round(cov(data1,data2),digits=2))
end
println("Mean1\tMean2\tVar1\tVar2\tCov")
for d in data
stats(d)
end
scatter(first.(data[1]), last.(data[1]), c=:blue, ms=1, msw=0, label="Normal")
scatter!(first.(data[2]), last.(data[2]), c=:red, ms=1, msw=0, label="Uniform")
scatter!(first.(data[3]),last.(data[3]),c=:green, ms=1,msw=0,label="Exponential",
=(0,40), ylims=(0,40), legend=:bottomright, ratio=:equal,
xlims=L"X_1", ylabel=L"X_2") xlabel
Mean1 Mean2 Var1 Var2 Cov
15.02 20.02 6.19 9.71 4.22
15.06 19.86 6.21 9.28 4.18
14.9 20.05 5.75 9.74 3.86
Here is a simple way to create animations. See the docs.
using Plots
function graphCreator(n::Int)
= 1:n
vertices = [exp(2*pi*im*k/n) for k in vertices]
complexPts = [(real(p),imag(p)) for p in complexPts]
coords = first.(coords)
xPts = last.(coords)
yPts = []
edges for v in vertices, u in (v+1):n
push!(edges,(v,u))
end
= Animation()
anim scatter(xPts, yPts, c=:blue, msw=0, ratio=1,
=(-1.5,1.5), ylims=(-1.5,1.5), legend=:none)
xlims
for i in 1:length(edges)
= edges[i][1], edges[i][2]
u, v = [xPts[u], xPts[v]]
xpoints = [yPts[u], yPts[v]]
ypoints plot!(xpoints, ypoints, line=(:red))
frame(anim)
end
gif(anim, "work/graph.gif", fps = 60)
end
graphCreator(16)
[ Info: Saved animation to /work/julia-ml/Julia_ML_training/Julia_ML_training/unit3/work/graph.gif
Here are some more “touched up” graphics. Note the use of the in-built Dates library.
using HTTP, DataFrames, CSV, Statistics, Dates, Plots, Measures
= HTTP.request("GET","https://raw.githubusercontent.com/h-Klok/StatsWithJuliaBook/master/data/temperatures.csv")
resp = CSV.read(IOBuffer(String(resp.body)), DataFrame)
data
= data.Brisbane
brisbane = data.GoldCoast
goldcoast
= brisbane - goldcoast
diff = [Date(
dates Year(data.Year[i]),
Month(data.Month[i]),
Day(data.Day[i])
in 1:nrow(data)]
) for i
= 250:263
fortnightRange = brisbane[fortnightRange]
brisFortnight = goldcoast[fortnightRange]
goldFortnight
= plot(dates, [brisbane goldcoast], label=["Brisbane" "Gold Coast"],
p1 =[:blue :red], xlabel="Time", ylabel="Temperature")
c= plot(dates[fortnightRange], [brisFortnight goldFortnight], label=["Brisbane" "Gold Coast"],
p2 =[:blue :red], m=(:dot, 5, Plots.stroke(1)), xlabel="Time", ylabel="Temperature")
c= plot(dates, diff, c=:black, ylabel="Temperature Difference",legend=false)
p3 = histogram(diff, bins=-4:0.5:6,
p4 =(0,140), legend = false,
ylims="Temperature Difference", ylabel="Frequency")
xlabelplot(p1,p2,p3,p4, size = (800,500), margin = 5mm)
using LaTeXStrings, Measures
f(x,y) = x^2 + y^2
f0(x) = f(x,0)
f2(x) = f(x,2)
= -5:0.1:5 , -5:0.1:5
xVals, yVals plot(xVals, [f0.(xVals), f2.(xVals)],
=[:blue :red], xlims=(-5,5), legend=:top,
c=(-5,25), ylabel=L"f(x,\cdot)", label=[L"f(x,0)" L"f(x,2)"])
ylims= annotate!(0, -0.2, text("(0,0) The minimum\n of f(x,0)", :left, :top, 10))
p1
= [ f(x,y) for y in yVals, x in xVals ]
z = surface(xVals, yVals, z, c=cgrad([:blue, :red]),legend=:none,
p2 ="y", zlabel=L"f(x,y)")
ylabel
= z[1:10,1:10]
M = heatmap(M, c=cgrad([:blue, :red]), yflip=true, ylabel="y",
p3 =([1:10;], xVals), yticks=([1:10;], yVals))
xticks
plot(p1, p2, p3, layout=(1,3), size=(1200,400), xlabel="x", margin=5mm)
6 Probability Distributions
The Distributions package is very rich and orderly. It can be extended quite easily, see for example Copulas.jl, also discussed in Laverny and Jimenez (2024).
Here are some common continuous distributions:
= [
dists Uniform(10,20),
Exponential(3.5),
Gamma(0.5,7),
Beta(10,0.5),
Weibull(10,0.5),
Normal(20,3.5),
Rayleigh(2.4),
Cauchy(20,3.5)]
println("Distribution \t\t\t Parameters \t Support")
reshape([dists ; Distributions.params.(dists) ;
->(minimum(d),maximum(d))).(dists) ],
((d)length(dists),3)
Distribution Parameters Support
8×3 Matrix{Any}:
Distributions.Uniform{Float64}(a=10.0, b=20.0) (10.0, 20.0) (10.0, 20.0)
Distributions.Exponential{Float64}(θ=3.5) (3.5,) (0.0, Inf)
Distributions.Gamma{Float64}(α=0.5, θ=7.0) (0.5, 7.0) (0.0, Inf)
Distributions.Beta{Float64}(α=10.0, β=0.5) (10.0, 0.5) (0.0, 1.0)
Distributions.Weibull{Float64}(α=10.0, θ=0.5) (10.0, 0.5) (0.0, Inf)
Distributions.Normal{Float64}(μ=20.0, σ=3.5) (20.0, 3.5) (-Inf, Inf)
Distributions.Rayleigh{Float64}(σ=2.4) (2.4,) (0.0, Inf)
Distributions.Cauchy{Float64}(μ=20.0, σ=3.5) (20.0, 3.5) (-Inf, Inf)
Here are some common discrete distributions:
using Distributions
= [
dists DiscreteUniform(10,20),
Binomial(10,0.5),
Geometric(0.5),
NegativeBinomial(10,0.5),
Hypergeometric(30, 40, 10),
Poisson(5.5)]
println("Distribution \t\t\t\t\t\t Parameters \t Support")
reshape([dists ; Distributions.params.(dists) ;
->(minimum(d),maximum(d))).(dists) ],
((d)length(dists),3)
Distribution Parameters Support
6×3 Matrix{Any}:
Distributions.DiscreteUniform(a=10, b=20) … (10, 20) (10, 20)
Distributions.Binomial{Float64}(n=10, p=0.5) (10, 0.5) (0, 10)
Distributions.Geometric{Float64}(p=0.5) (0.5,) (0, Inf)
Distributions.NegativeBinomial{Float64}(r=10.0, p=0.5) (10.0, 0.5) (0, Inf)
Distributions.Hypergeometric(ns=30, nf=40, n=10) (30, 40, 10) (0, 10)
Distributions.Poisson{Float64}(λ=5.5) … (5.5,) (0, Inf)
Here we use the pdf
function for the hypergeometric distribution:
= 500, [450, 400, 250, 100, 50], 30
L, K, n = [Hypergeometric(k,L-k,n) for k in K]
hyperDists = 0:1:n
xGrid = [ pdf.(dist, xGrid) for dist in hyperDists ]
pmfs = "Successes = " .* string.(K)
labels
bar( xGrid, pmfs,
=0.8, c=[:orange :purple :green :red :blue ],
alpha=hcat(labels...), ylims=(0,0.25),
label="x", ylabel="Probability", legend=:top) xlabel
Here are some relationships between the beta distribution and some special functions (gamma and beta):
using SpecialFunctions
= 0.2, 0.7
a,b = 0.75
x
= beta(a,b)
betaAB1 = (gamma(a)gamma(b))/gamma(a+b)
betaAB2 = pdf(Beta(a,b),x)
betaPDFAB1 = (1/beta(a,b))*x^(a-1) * (1-x)^(b-1)
betaPDFAB2
println("beta($a,$b) = $betaAB1,\t$betaAB2")
println("betaPDF($a,$b) at $x = $betaPDFAB1,\t$betaPDFAB2")
beta(0.2,0.7) = 5.576463695849875, 5.576463695849875
betaPDF(0.2,0.7) at 0.75 = 0.34214492891381176, 0.34214492891381176
Here is a relationship between the exponential distribution and the geometric distribution:
using StatsBase
= 1, 10^6
lambda, N = 0:6
xGrid
= Exponential(1/lambda)
expDist = counts(convert.(Int,floor.(rand(expDist,N))), xGrid)/N
floorData = Geometric(1-MathConstants.e^-lambda)
geomDist
plot( xGrid, floorData,
=:stem, marker=:circle,
line=:blue, ms=10, msw=0, lw=4,
c="Floor of Exponential")
labelplot!( xGrid, pdf.(geomDist,xGrid),
=:stem, marker=:xcross,
line=:red, ms=6, msw=0, lw=2,
c="Geometric", ylims=(0,1),
label="x", ylabel="Probability") xlabel
using Distributions, Plots, LaTeXStrings
= [0.5, 1.5, 1]
alphas = 2
lam
λ_param(dist::Weibull) = shape(dist)*scale(dist)^(-shape(dist))
θ_param(lam,alpha) = (alpha/lam)^(1/alpha)
= [Weibull.(a,θ_param(lam,a)) for a in alphas]
dists
hA(dist,x) = pdf(dist,x)/ccdf(dist,x)
hB(dist,x) = λ_param(dist)*x^(shape(dist)-1)
= 0.01:0.01:10
xGrid = [hA.(d,xGrid) for d in dists]
hazardsA = [hB.(d,xGrid) for d in dists]
hazardsB
println("Maximum difference between two implementations of hazard: ",
maximum(maximum.(hazardsA-hazardsB)))
= [:blue :red :green]
Cl = [L"\lambda=" * string(λ_param(d)) * ", " * L"\alpha =" * string(shape(d))
Lb in dists]
for d
plot(xGrid, hazardsA, c=Cl, label=reshape(Lb, 1,:), xlabel="x",
="Instantaneous failure rate", xlims=(0,10), ylims=(0,10)) ylabel
Maximum difference between two implementations of hazard: 1.7763568394002505e-15
7 Automatic Differentiation
There are multiple options for automatic differenation. See also the basic example in Unit 1. Here are key packages:
- ForwardDiff.jl: Implements forward-mode automatic differentiation, well-suited for functions with few input variables and many outputs.
- ReverseDiff.jl: Provides reverse-mode (backward-mode) automatic differentiation, ideal for functions with many inputs and few outputs, such as those commonly encountered in machine learning.
- Zygote.jl: Source-to-source automatic differentiation supporting reverse-mode; widely used in the Flux ecosystem and beyond for differentiable programming.
- Enzyme.jl: High-performance AD using LLVM, supporting both forward and reverse modes, and capable of differentiating lower-level code and some external libraries.
These packages are related:
- Symbolics.jl: Symbolic computation and differentiation; useful for obtaining analytic derivatives or generating optimized derivative code.
- Calculus.jl: The Calculus package provides tools for working with the basic calculus operations of differentiation and integration. You can use the Calculus package to produce approximate derivatives by several forms of finite differencing or to produce exact derivative using symbolic differentiation.
8 Additional online resources
- The paper Roesch et al. (2023) introduces Julia within the context of the quantitative biology community.
- We did not touch the world of p robabilistic programming in Julia. See Turing tutorials.
9 Exercises
- Use the
DifferentialEquations.jl
package to solve the nonlinear ordinary differential equation
\[ \frac{dx}{dt} = x \sin(t) - t * \cos(y), \]
with the initial condition \(y(0) = 1\), over the time interval \(t = 0\) to \(t = 5\). Plot the solution \(y(t)\). - Try to reproduce the SEIR continuous time markov chain simulation using JumpProcesses.jl.
- Explore some of the
LinearAlgebra
functions (lu
,qr
svd
, andcholesky
). In each case create a suitable matrix, factor it using the function, and reconstruct the matrix from the factorization. - Use the discrete event simulation engine code provided above to simulation a system of two queues, one after another (a tandem queueing system).