[1] "/home/runner/work/gcrmn_model_dev/gcrmn_model_dev/docs"
Comparing model types - site replacement
1 Purpose
The purpose of this page is to contrast a few different approaches to modelling broad scale spatio-temporal that are suitable for both very large and small sample data sets in the context of the upcoming GCRMN report. More specifically, we have identified the following modelling challenges for which analyses of simulated data will hopefully provide model choice guidance:
Site replacements: when monitored sites are removed from a design and replaced with alternative sites. Of particular concern is the situation in which a poor site is discontinued (perhaps it becomes so degraded it no longer functions as a coral reef) and is replaced by a relatively good new site (with the argument that there is no value in establishing a new poor site that might similarly disappear in the near future). Unfortunately, this shifts the bias and has the potential to confound the timeseries.
Unsampled years: when the timeseries are punctuated with missing years. Ideally, it is important that the analyses are able to estimate trends across the full timeseries. Hence estimates are necessary, even when data are unavailable.
Use of covariates: whilst other covariates (such as sea surface temperature and cyclones could be used to inform trends at locations and times for which observed data are lacking, they also have the potential to induce new biases. For example, if models are “trained” with wave energy (or wind) data in which no major storms occurred and coral cover grew, it is possible that the model could “learn” that coral cover is positively correlated with wave energy. When predicting to a different area that did experience substantial stores (that might be expected to have a negative impact on coral cover), such a model might erroneously predict substantial increases in coral cover.
There are also additional complications of using covariates that are likely to represent acute impacts. Whilst, coral cover can decline rapidly in response to a disturbance, it typically takes time to recover (if at all) to its pre-disturbance state. As a result, the state of coral at any specific time is the result of an accumulation of conditions over the past and not just the most recent conditions. To account for this in a modelling perspective, is likely to require either:
- Inclusion of the previous years cover estimate in a model (which effectively means the model is parameterised on changes in cover). This is obviously infeasible for the first year of a monitoring program or a program without fixed sites and also becomes a proportionally bigger issue with shorter the time series.
- Inclusion of multiple lagged versions of each of the covariates that are likely to represent causal disturbances.
2 Preparations
Load the necessary R libraries
library(tidyverse) # for data manipulation and visualisation
library(sf) # for spatial data handling and visualisation
library(glmmTMB) # for frequentist GLMM
library(emmeans) # for post-hoc analysis
library(DHARMa) # for model diagnostics
library(patchwork)
library(brms) # for Bayesian GLMM
library(rstan) # for Bayesian modelling
library(bayesplot) # for Bayesian plotting
library(tidybayes) # for tidy bayesian analysis
library(posterior) # for posterior analysis
library(gbm) # for gradient boosted regression trees
library(dbarts) # for Bayesian additive regression trees
library(HDInterval) # for Bayesian credible intervals
Establish the global environment and paths
assign(x = "data_path", value = "../data/", envir = .GlobalEnv)
assign(x = "output_path", value = "../output/", envir = .GlobalEnv)
paths <- list(
data_path = data_path,
synthetic_path = paste0(data_path, "synthetic/"),
output_path = output_path,
fig_path = paste0(output_path, "figures/")
)
lapply(paths, function(x) {
if (!dir.exists(x)) {
dir.create(x)
}
})
Load any helper functions
# Assign a variable to the global namespace
data_path = "../data/"
output_path = "../output/"
# Create a dictionary of paths
paths = {
"data_path": data_path,
"synthetic_path": f"{data_path}synthetic/",
"fig_path": f"{output_path}figures/"
}
# Ensure the directories exist
for path in paths.values():
if not os.path.exists(path):
os.makedirs(path)
3 Import data
These data represent the synthesized “truth” and are thereby useful to compare to the various modelled outcomes.
Simple feature collection with 267132 features and 8 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 0.1086254 ymin: -20.96124 xmax: 9.508625 ymax: -9.981244
Geodetic CRS: WGS 84
# A tibble: 267,132 × 9
Year Reef geometry HCC SC MA CYC DHW OTHER
<int> <chr> <POINT [°]> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 Reef1 (9.398625 -19.95124) 0.310 0.0750 0.565 0 0.399 0.512
2 2 Reef1 (9.398625 -19.95124) 0.270 0.0627 0.592 0.570 0.379 0.745
3 3 Reef1 (9.398625 -19.95124) 0.279 0.0652 0.586 0 0.417 0.486
4 4 Reef1 (9.398625 -19.95124) 0.291 0.0691 0.578 0.0180 0.340 0.615
5 5 Reef1 (9.398625 -19.95124) 0.301 0.0722 0.571 0 0.419 0.429
6 6 Reef1 (9.398625 -19.95124) 0.316 0.0769 0.561 0 0.318 0.728
7 7 Reef1 (9.398625 -19.95124) 0.328 0.0811 0.553 0 0.355 0.648
8 8 Reef1 (9.398625 -19.95124) 0.309 0.0748 0.565 0 0.637 0.691
9 9 Reef1 (9.398625 -19.95124) 0.261 0.0599 0.599 0 0.961 0.575
10 10 Reef1 (9.398625 -19.95124) 0.246 0.0557 0.609 0 0.650 0.519
# ℹ 267,122 more rows
Show the code
benthos_reefs_temporal_summary <- benthos_reefs_sf |>
st_drop_geometry() |>
group_by(Year) |>
summarise(Mean = mean(HCC),
Median = median(HCC),
SD = sd(HCC),
Lower = quantile(HCC, 0.025),
Upper = quantile(HCC, 0.975))
benthos_reefs_temporal_summary |>
saveRDS(
file = paste0(
data_path,
"synthetic/benthos_reefs_temporal_summary.rds"
)
)
Show the code
benthos_reefs_temporal_summary <-
readRDS(
file = paste0(
data_path,
"synthetic/benthos_reefs_temporal_summary.rds"
)
)
g <- benthos_reefs_temporal_summary |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = Lower, ymax = Upper), alpha = 0.2) +
geom_line(aes(x = Year, y = Mean, colour = "mean")) +
geom_line(aes(x = Year, y = Median, colour = "median")) +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_all_temporal_summary_plot.png"
),
g,
width = 8, height = 6, dpi = 72
)
These data represent a simulated sampling design in which 25 fixed (though randomly selected from among all the defined reefs in the synthetic spatial domain) reefs are monitored over a continuous 12 year period.
benthos_fixed_locs_obs <- readRDS(file = paste0(
data_path,
"synthetic/benthos_fixed_locs_obs.rds"
))
benthos_fixed_locs_obs <- benthos_fixed_locs_obs |>
left_join(
benthos_reefs_sf |>
st_drop_geometry() |>
dplyr::select(Year, Reef, CYC, DHW, OTHER) |>
group_by(Year, Reef) |>
summarise(across(c(CYC, DHW, OTHER), mean)),
by = c("Year", "Reef")
)
`summarise()` has grouped output by 'Year'. You can override using the
`.groups` argument.
# A tibble: 3,000 × 13
Reef Longitude Latitude Site Transect Year HCC SC MA
<chr> <dbl> <dbl> <chr> <chr> <int> <dbl> <dbl> <dbl>
1 Reef103 3.66 -19.5 S1 T1 1 18.6 3.99 54.9
2 Reef103 3.66 -19.5 S1 T1 2 19.0 4.32 50.5
3 Reef103 3.66 -19.5 S1 T1 3 19.0 4.18 48.7
4 Reef103 3.66 -19.5 S1 T1 4 21.6 4.81 50.2
5 Reef103 3.66 -19.5 S1 T1 5 25.4 5.20 49.0
6 Reef103 3.66 -19.5 S1 T1 6 21.2 5.47 46.3
7 Reef103 3.66 -19.5 S1 T1 7 22.2 5.26 45.6
8 Reef103 3.66 -19.5 S1 T1 8 23.2 5.18 46.1
9 Reef103 3.66 -19.5 S1 T1 9 19.7 4.43 47.2
10 Reef103 3.66 -19.5 S1 T1 10 17.4 4.09 54.4
# ℹ 2,990 more rows
# ℹ 4 more variables: Date <dttm>, CYC <dbl>, DHW <dbl>, OTHER <dbl>
Show the code
benthos_fixed_locs_obs <- benthos_fixed_locs_obs |>
mutate(
fYear = as.factor(Year),
Reef = as.factor(Reef),
Site = interaction(Reef, Site),
Transect = interaction(Site, Transect),
cover = HCC/100
)
print(benthos_fixed_locs_obs)
## Raw summary means
all_sampled_sum <-
benthos_fixed_locs_obs |>
group_by(fYear) |>
summarise(
mean_mean_response = mean(cover),
median_median_response = median(cover),
mean_sd = sd(cover),
mean_lower_lower = mean_mean_response - 1.96 * mean_sd,
mean_upper_upper = mean_mean_response + 1.96 * mean_sd,
median_lower_lower = quantile(cover, 0.025),
median_upper_upper = quantile(cover, 0.975)
) |>
dplyr::select(-mean_sd) |>
pivot_longer(
cols = c(
mean_mean_response, median_median_response, mean_lower_lower,
mean_upper_upper, median_lower_lower, median_upper_upper
),
names_to = c("type", "variable", "stat"),
names_sep = "_",
values_to = "values"
) |>
mutate(type = paste("all_sampled_", type)) |>
dplyr::select(-variable) |>
pivot_wider(
names_from = stat,
values_from = values
) |>
mutate(Year = as.numeric(as.character(fYear)))
print(all_sampled_sum)
Show the code
g <- all_sampled_sum |>
ggplot(aes(x = Year, y = response, color = type, fill = type)) +
geom_line() +
geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2, color = NA) +
scale_y_continuous(
name = "Hard Coral Cover (%)",
limits = c(0, 1),
labels = scales::percent_format(accuracy = 1)
) +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_full_simple_raw_means_plot.png"
),
g,
width = 8, height = 6, dpi = 72
)
These data represent the fixed 25 sites, yet the site with the lowest coral cover is discontinued after 9 years and the highest performing reef is only introduced after 9 years. In this way, these data are intended to mimic the scenario in which a poor reef is replaced by a relatively good reef.
benthos_fixed_locs_obs_1 <- readRDS(
file = paste0(
data_path,
"synthetic/benthos_fixed_locs_obs_1.rds"
)
)
benthos_fixed_locs_obs_1 <- benthos_fixed_locs_obs_1 |>
left_join(
benthos_reefs_sf |>
st_drop_geometry() |>
dplyr::select(Year, Reef, CYC, DHW, OTHER) |>
group_by(Year, Reef) |>
summarise(across(c(CYC, DHW, OTHER), mean)),
by = c("Year", "Reef")
)
`summarise()` has grouped output by 'Year'. You can override using the
`.groups` argument.
# A tibble: 2,880 × 13
Reef Longitude Latitude Site Transect Year HCC SC MA
<chr> <dbl> <dbl> <chr> <chr> <int> <dbl> <dbl> <dbl>
1 Reef103 3.66 -19.5 S1 T1 1 18.6 3.99 54.9
2 Reef103 3.66 -19.5 S1 T1 2 19.0 4.32 50.5
3 Reef103 3.66 -19.5 S1 T1 3 19.0 4.18 48.7
4 Reef103 3.66 -19.5 S1 T1 4 21.6 4.81 50.2
5 Reef103 3.66 -19.5 S1 T1 5 25.4 5.20 49.0
6 Reef103 3.66 -19.5 S1 T1 6 21.2 5.47 46.3
7 Reef103 3.66 -19.5 S1 T1 7 22.2 5.26 45.6
8 Reef103 3.66 -19.5 S1 T1 8 23.2 5.18 46.1
9 Reef103 3.66 -19.5 S1 T1 9 19.7 4.43 47.2
10 Reef103 3.66 -19.5 S1 T1 10 17.4 4.09 54.4
# ℹ 2,870 more rows
# ℹ 4 more variables: Date <dttm>, CYC <dbl>, DHW <dbl>, OTHER <dbl>
These data represent a more extreme version of the above reef replacement situation in that there are only a total of five reefs and thus the replacement represents a higher fraction of reefs.
benthos_fixed_locs_obs_2 <- readRDS(
file = paste0(
data_path,
"synthetic/benthos_fixed_locs_obs_2.rds"
)
)
benthos_fixed_locs_obs_2 <- benthos_fixed_locs_obs_2|>
left_join(
benthos_reefs_sf |>
st_drop_geometry() |>
dplyr::select(Year, Reef, CYC, DHW, OTHER) |>
group_by(Year, Reef) |>
summarise(across(c(CYC, DHW, OTHER), mean)),
by = c("Year", "Reef")
)
`summarise()` has grouped output by 'Year'. You can override using the
`.groups` argument.
# A tibble: 480 × 13
Reef Longitude Latitude Site Transect Year HCC SC MA
<chr> <dbl> <dbl> <chr> <chr> <int> <dbl> <dbl> <dbl>
1 Reef170 1.34 -20.9 S1 T1 1 13.0 3.41 55.0
2 Reef170 1.34 -20.9 S1 T1 2 13.8 3.53 55.0
3 Reef170 1.34 -20.9 S1 T1 3 15.6 3.66 55.6
4 Reef170 1.34 -20.9 S1 T1 4 16.8 4.02 54.3
5 Reef170 1.34 -20.9 S1 T1 5 16.6 4.24 52.6
6 Reef170 1.34 -20.9 S1 T1 6 18.3 4.69 58.3
7 Reef170 1.34 -20.9 S1 T1 7 17.6 4.66 53.6
8 Reef170 1.34 -20.9 S1 T1 8 15.3 4.32 54.6
9 Reef170 1.34 -20.9 S1 T1 9 14.1 3.74 51.3
10 Reef170 1.34 -20.9 S1 T2 1 14.5 3.33 54.8
# ℹ 470 more rows
# ℹ 4 more variables: Date <dttm>, CYC <dbl>, DHW <dbl>, OTHER <dbl>
These data represent the synthesized “truth” and are thereby useful to compare to the various modelled outcomes.
Year Reef geometry HCC \
0 1 Reef1 c(9.398625412978, -19.9512440470767) 0.309773
1 2 Reef1 c(9.398625412978, -19.9512440470767) 0.270281
2 3 Reef1 c(9.398625412978, -19.9512440470767) 0.278780
3 4 Reef1 c(9.398625412978, -19.9512440470767) 0.291293
4 5 Reef1 c(9.398625412978, -19.9512440470767) 0.301213
... ... ... ... ...
267127 8 Reef187 c(1.478625412978, -14.1412440470767) 0.194059
267128 9 Reef187 c(1.478625412978, -14.1412440470767) 0.172821
267129 10 Reef187 c(1.478625412978, -14.1412440470767) 0.157513
267130 11 Reef187 c(1.478625412978, -14.1412440470767) 0.167410
267131 12 Reef187 c(1.478625412978, -14.1412440470767) 0.178224
SC MA CYC DHW OTHER
0 0.074957 0.565184 0.000000 0.398843 0.511869
1 0.062682 0.592401 0.570183 0.378864 0.745148
2 0.065237 0.586457 0.000000 0.417448 0.485992
3 0.069083 0.577790 0.018019 0.339762 0.615030
4 0.072207 0.570994 0.000000 0.419087 0.428713
... ... ... ... ... ...
267127 0.022101 0.628042 0.000000 0.453441 0.828005
267128 0.019233 0.646412 0.000000 0.766604 0.586145
267129 0.017246 0.659733 0.000000 0.701376 0.603980
267130 0.018523 0.651113 0.000000 0.321212 0.666380
267131 0.019950 0.641727 0.000000 0.351262 0.487009
[267132 rows x 9 columns]
Show the code
# Create the plot
plt.figure(figsize=(10, 6))
# Add ribbon (shaded area)
plt.fill_between(
benthos_reefs_temporal_summary["Year"],
benthos_reefs_temporal_summary["Lower"],
benthos_reefs_temporal_summary["Upper"],
alpha=0.2,
label="Confidence Interval"
)
# Add mean line
plt.plot(
benthos_reefs_temporal_summary["Year"],
benthos_reefs_temporal_summary["Mean"],
label="Mean",
color="blue"
)
# Add median line
plt.plot(
benthos_reefs_temporal_summary["Year"],
benthos_reefs_temporal_summary["Median"],
label="Median",
color="orange"
)
# Add labels and theme
plt.xlabel("Year")
plt.ylabel("Value")
plt.title("Temporal Summary")
plt.legend()
plt.grid(True)
# plt.show()
# Save the plot as a PNG file
plt.savefig(f"{paths['fig_path']}Python_all_temporal_summary_plot.png", dpi=300, bbox_inches="tight")
These data represent a simulated sampling design in which 25 fixed (though randomly selected from among all the defined reefs in the synthetic spatial domain) reefs are monitored over a continuous 12 year period.
benthos_fixed_locs_obs = pd.read_csv(f"{paths['data_path']}synthetic/benthos_fixed_locs_obs.csv")
benthos_reefs_sf = pd.read_csv(f"{paths['data_path']}synthetic/benthos_reefs_sf.csv")
# Group and summarize the benthos_reefs_sf DataFrame
benthos_reefs_summary = (
benthos_reefs_sf
.drop(columns=["geometry"]) # Equivalent to st_drop_geometry()
.groupby(["Year", "Reef"], as_index=False)
.agg({
"CYC": "mean",
"DHW": "mean",
"OTHER": "mean"
})
)
# Perform the left join
benthos_fixed_locs_obs = benthos_fixed_locs_obs.merge(
benthos_reefs_summary,
on=["Year", "Reef"],
how="left"
)
Reef Longitude Latitude Site Transect Year HCC SC \
0 Reef103 3.658625 -19.531244 S1 T1 1 18.625436 3.985485
1 Reef103 3.658625 -19.531244 S1 T1 2 18.974974 4.319049
2 Reef103 3.658625 -19.531244 S1 T1 3 19.045657 4.176435
3 Reef103 3.658625 -19.531244 S1 T1 4 21.643148 4.812177
4 Reef103 3.658625 -19.531244 S1 T1 5 25.442402 5.198853
... ... ... ... ... ... ... ... ...
2995 Reef99 3.758625 -13.031244 S2 T5 8 22.853000 2.448697
2996 Reef99 3.758625 -13.031244 S2 T5 9 21.445286 2.063128
2997 Reef99 3.758625 -13.031244 S2 T5 10 21.156792 1.837525
2998 Reef99 3.758625 -13.031244 S2 T5 11 18.795713 1.968396
2999 Reef99 3.758625 -13.031244 S2 T5 12 21.955909 2.082333
MA Date CYC DHW OTHER
0 54.913906 1-01-01T14:00:00Z 0.0 0.411396 0.345324
1 50.489073 2-01-01T14:00:00Z 0.0 0.383427 0.500119
2 48.694751 3-01-01T14:00:00Z 0.0 0.527733 0.561641
3 50.156755 4-01-01T14:00:00Z 0.0 0.192189 0.416467
4 48.970873 5-01-01T14:00:00Z 0.0 0.349549 0.587900
... ... ... ... ... ...
2995 55.811516 8-01-01T14:00:00Z 0.0 0.482534 0.349079
2996 56.844522 9-01-01T14:00:00Z 0.0 0.864747 0.393180
2997 59.206188 10-01-01T14:00:00Z 0.0 0.773226 0.368701
2998 58.821929 11-01-01T14:00:00Z 0.0 0.371080 0.558885
2999 57.269170 12-01-01T14:00:00Z 0.0 0.385996 0.491746
[3000 rows x 13 columns]
Show the code
benthos_fixed_locs_obs = benthos_fixed_locs_obs.assign(
fYear = benthos_fixed_locs_obs['Year'].astype('category'),
Reef = benthos_fixed_locs_obs['Reef'].astype('category'),
Site = benthos_fixed_locs_obs['Reef'].astype(str) + "_" +
benthos_fixed_locs_obs['Site'].astype(str),
Transect = benthos_fixed_locs_obs['Site'] + "_" +
benthos_fixed_locs_obs['Transect'].astype(str),
cover = benthos_fixed_locs_obs['HCC'] / 100
)
all_sampled_sum = (
benthos_fixed_locs_obs.groupby("fYear").agg(
mean_mean_response = ("cover", "mean"),
median_median_response = ("cover", "median"),
mean_sd = ("cover", "std"),
median_lower_lower = ("cover", lambda x: x.quantile(0.025)),
median_upper_upper = ("cover", lambda x: x.quantile(0.975))
).reset_index()
)
# Calculate additional columns
all_sampled_sum['mean_lower_lower'] = all_sampled_sum['mean_mean_response'] - 1.96 * all_sampled_sum['mean_sd']
all_sampled_sum['mean_upper_upper'] = all_sampled_sum['mean_mean_response'] + 1.96 * all_sampled_sum['mean_sd']
# Drop the 'mean_sd' column
all_sampled_sum = all_sampled_sum.drop(columns=['mean_sd'])
# Pivot longer
all_sampled_sum = all_sampled_sum.melt(
id_vars=['fYear'],
value_vars=[
'mean_mean_response', 'median_median_response', 'mean_lower_lower',
'mean_upper_upper', 'median_lower_lower', 'median_upper_upper'
],
var_name='names',
value_name='values'
)
# Split the 'names' column into 'type', 'variable', and 'stat'
all_sampled_sum[['type', 'variable', 'stat']] = all_sampled_sum['names'].str.split('_', expand=True)
# Modify the 'type' column
all_sampled_sum['type'] = 'all_sampled_' + all_sampled_sum['type']
# Drop the 'variable' column
all_sampled_sum = all_sampled_sum.drop(columns=['variable', 'names'])
# Pivot wider
all_sampled_sum = all_sampled_sum.pivot(
index=['fYear', 'type'],
columns='stat',
values='values'
).reset_index()
# Convert 'fYear' to numeric
all_sampled_sum['Year'] = pd.to_numeric(all_sampled_sum['fYear'])
all_sampled_sum
Show the code
# Create the plot
plt.figure(figsize=(10, 6))
# Add ribbon (shaded area)
for _, group in all_sampled_sum.groupby('type'):
plt.fill_between(
group['Year'],
group['lower'],
group['upper'],
alpha=0.2,
label=_
)
# Add mean line
for _, group in all_sampled_sum.groupby('type'):
plt.plot(
group["Year"],
group["response"],
label=_
)
# Customize the y-axis
plt.ylim(0, 1)
plt.ylabel("Hard Coral Cover (%)")
plt.gca().yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f"{x*100:.0f}%"))
# Add theme and legend
plt.grid(True, which='both', linestyle='--', linewidth=0.5)
plt.legend(title='Type')
plt.title("Hard Coral Cover Over Time")
plt.tight_layout()
# Show the plot
#plt.show()
# Save the plot as a PNG file
plt.savefig(f"{paths['fig_path']}Python_full_simple_raw_means_plot.png",
dpi=72, bbox_inches="tight")
These data represent the fixed 25 sites, yet the site with the lowest coral cover is discontinued after 9 years and the highest performing reef is only introduced after 9 years. In this way, these data are intended to mimic the scenario in which a poor reef is replaced by a relatively good reef.
benthos_fixed_locs_obs_1 = pd.read_csv(f"{paths['data_path']}synthetic/benthos_fixed_locs_obs_1.csv")
benthos_reefs_sf = pd.read_csv(f"{paths['data_path']}synthetic/benthos_reefs_sf.csv")
# Group and summarize the benthos_reefs_sf DataFrame
benthos_reefs_summary = (
benthos_reefs_sf
.drop(columns=["geometry"]) # Equivalent to st_drop_geometry()
.groupby(["Year", "Reef"], as_index=False)
.agg({
"CYC": "mean",
"DHW": "mean",
"OTHER": "mean"
})
)
# Perform the left join
benthos_fixed_locs_obs_1 = benthos_fixed_locs_obs_1.merge(
benthos_reefs_summary,
on=["Year", "Reef"],
how="left"
)
Reef Longitude Latitude Site Transect Year HCC SC \
0 Reef103 3.658625 -19.531244 S1 T1 1 18.625436 3.985485
1 Reef103 3.658625 -19.531244 S1 T1 2 18.974974 4.319049
2 Reef103 3.658625 -19.531244 S1 T1 3 19.045657 4.176435
3 Reef103 3.658625 -19.531244 S1 T1 4 21.643148 4.812177
4 Reef103 3.658625 -19.531244 S1 T1 5 25.442402 5.198853
... ... ... ... ... ... ... ... ...
2875 Reef99 3.758625 -13.031244 S2 T5 8 22.853000 2.448697
2876 Reef99 3.758625 -13.031244 S2 T5 9 21.445286 2.063128
2877 Reef99 3.758625 -13.031244 S2 T5 10 21.156792 1.837525
2878 Reef99 3.758625 -13.031244 S2 T5 11 18.795713 1.968396
2879 Reef99 3.758625 -13.031244 S2 T5 12 21.955909 2.082333
MA Date CYC DHW OTHER
0 54.913906 1-01-01T14:00:00Z 0.0 0.411396 0.345324
1 50.489073 2-01-01T14:00:00Z 0.0 0.383427 0.500119
2 48.694751 3-01-01T14:00:00Z 0.0 0.527733 0.561641
3 50.156755 4-01-01T14:00:00Z 0.0 0.192189 0.416467
4 48.970873 5-01-01T14:00:00Z 0.0 0.349549 0.587900
... ... ... ... ... ...
2875 55.811516 8-01-01T14:00:00Z 0.0 0.482534 0.349079
2876 56.844522 9-01-01T14:00:00Z 0.0 0.864747 0.393180
2877 59.206188 10-01-01T14:00:00Z 0.0 0.773226 0.368701
2878 58.821929 11-01-01T14:00:00Z 0.0 0.371080 0.558885
2879 57.269170 12-01-01T14:00:00Z 0.0 0.385996 0.491746
[2880 rows x 13 columns]
Show the code
# Create the plot
g = sns.FacetGrid(
data=benthos_fixed_locs_obs_1,
col="Reef",
col_wrap=5, # Adjust the number of columns in the facet grid
height=2,
aspect=1.6,
sharey=True,
sharex=True
)
# Add line plots to each facet
g.map_dataframe(
sns.lineplot,
x="Year",
y="HCC",
hue="Site",
style="Transect",
estimator=None,
units="Transect",
legend=False
)
# Customize the plot
g.set_axis_labels("Year", "Hard Coral Cover (HCC)")
g.set_titles(col_template="{col_name}")
g.add_legend(title="Site")
g.fig.tight_layout()
# Show the plot
#plt.show()
# Save the plot as a PNG file
plt.savefig(f"{paths['fig_path']}Python_sampled_reefs_1_plot.png", dpi=72, bbox_inches="tight")
These data represent a more extreme version of the above reef replacement situation in that there are only a total of five reefs and thus the replacement represents a higher fraction of reefs.
benthos_fixed_locs_obs_2 = pd.read_csv(f"{paths['data_path']}synthetic/benthos_fixed_locs_obs_2.csv")
benthos_reefs_sf = pd.read_csv(f"{paths['data_path']}synthetic/benthos_reefs_sf.csv")
# Group and summarize the benthos_reefs_sf DataFrame
benthos_reefs_summary = (
benthos_reefs_sf
.drop(columns=["geometry"]) # Equivalent to st_drop_geometry()
.groupby(["Year", "Reef"], as_index=False)
.agg({
"CYC": "mean",
"DHW": "mean",
"OTHER": "mean"
})
)
# Perform the left join
benthos_fixed_locs_obs_2 = benthos_fixed_locs_obs_2.merge(
benthos_reefs_summary,
on=["Year", "Reef"],
how="left"
)
Reef Longitude Latitude Site Transect Year HCC SC \
0 Reef170 1.338625 -20.901244 S1 T1 1 13.027202 3.413521
1 Reef170 1.338625 -20.901244 S1 T1 2 13.841143 3.526345
2 Reef170 1.338625 -20.901244 S1 T1 3 15.559364 3.658558
3 Reef170 1.338625 -20.901244 S1 T1 4 16.838991 4.017966
4 Reef170 1.338625 -20.901244 S1 T1 5 16.595779 4.244544
.. ... ... ... ... ... ... ... ...
475 Reef90 4.038625 -14.171244 S2 T5 8 24.022636 3.109348
476 Reef90 4.038625 -14.171244 S2 T5 9 21.299862 2.709066
477 Reef90 4.038625 -14.171244 S2 T5 10 22.066287 2.528670
478 Reef90 4.038625 -14.171244 S2 T5 11 21.524609 2.581125
479 Reef90 4.038625 -14.171244 S2 T5 12 21.649099 2.688958
MA Date CYC DHW OTHER
0 55.028757 1-01-01T14:00:00Z 0.0 0.474864 0.349540
1 55.015181 2-01-01T14:00:00Z 0.0 0.452288 0.386229
2 55.572046 3-01-01T14:00:00Z 0.0 0.444938 0.490269
3 54.276855 4-01-01T14:00:00Z 0.0 0.304818 0.595595
4 52.587258 5-01-01T14:00:00Z 0.0 0.409084 0.492792
.. ... ... ... ... ...
475 54.103665 8-01-01T14:00:00Z 0.0 0.525576 0.329536
476 56.139564 9-01-01T14:00:00Z 0.0 0.785465 0.509434
477 59.131197 10-01-01T14:00:00Z 0.0 0.650841 0.488150
478 56.729758 11-01-01T14:00:00Z 0.0 0.400474 0.677549
479 58.822425 12-01-01T14:00:00Z 0.0 0.407944 0.519343
[480 rows x 13 columns]
Show the code
# Create the plot
g = sns.FacetGrid(
data=benthos_fixed_locs_obs_2,
col="Reef",
col_wrap=3, # Adjust the number of columns in the facet grid
height=3,
aspect=1.6,
sharey=True,
sharex=True
)
# Add line plots to each facet
g.map_dataframe(
sns.lineplot,
x="Year",
y="HCC",
hue="Site",
style="Transect",
estimator=None,
units="Transect",
legend=False
)
# Customize the plot
g.set_axis_labels("Year", "Hard Coral Cover (HCC)")
g.set_titles(col_template="{col_name}")
g.add_legend(title="Site")
g.fig.tight_layout()
# Show the plot
#plt.show()
# Save the plot as a PNG file
plt.savefig(f"{paths['fig_path']}Python_sampled_reefs_2_plot.png", dpi=72, bbox_inches="tight")
4 Data preparations
Perform the following data preparation steps:
- specifically declare any categorical variables (as factors in R, categories in python)
- ensure that all levels of hierarchical (varying) effects have unique IDs
- express cover as a proportion out of 1 as is necessary for modelling against a beta distribution
- create a prediction grid containing all years in the monitoring range
benthos_fixed_locs_obs_0 = (
benthos_fixed_locs_obs.assign(
fYear=lambda df: df["Year"].astype("category"),
Reef=lambda df: df["Reef"].astype("category"),
Site=lambda df: (df["Reef"].astype("str") + "_" +
df["Site"]).astype("category"), # Interaction of Reef and Site
Transect=lambda df: (df["Site"].astype("str") + "_" +
df["Transect"]).astype("category"), # Interaction of Site and Transect
cover=lambda df: df["HCC"] / 100 # Calculate cover
)
)
benthos_fixed_locs_obs_1 = (
benthos_fixed_locs_obs_1.assign(
fYear=lambda df: df["Year"].astype("category"),
Reef=lambda df: df["Reef"].astype("category"),
Site=lambda df: (df["Reef"].astype("str") + "_" +
df["Site"]).astype("category"), # Interaction of Reef and Site
Transect=lambda df: (df["Site"].astype("str") + "_" +
df["Transect"]).astype("category"), # Interaction of Site and Transect
cover=lambda df: df["HCC"] / 100 # Calculate cover
)
)
benthos_fixed_locs_obs_2 = (
benthos_fixed_locs_obs_2.assign(
fYear=lambda df: df["Year"].astype("category"),
Reef=lambda df: df["Reef"].astype("category"),
Site=lambda df: (df["Reef"].astype("str") + "_" +
df["Site"]).astype("category"), # Interaction of Reef and Site
Transect=lambda df: (df["Site"].astype("str") + "_" +
df["Transect"]).astype("category"), # Interaction of Site and Transect
cover=lambda df: df["HCC"] / 100 # Calculate cover
)
)
5 Modelled trends
The following table lists datasets that are used in the comparison figures along with their respective legend entries.
Dataset | Description | Legend label(s) |
---|---|---|
benthos_reefs_sf |
All reef-level data | |
benthos_reefs_temporal_summary |
All global-level data | all mean / all median |
all_sampled_sum |
Simple stats of all sampled data (25 sites) | all_sampled_mean / all_sampled_median |
benthos_fixed_locs_obs_0 |
All sampled data (25 sites) | |
mod_glmmTMB_0 |
glmmTMB model of all sampled data (25 sites) | |
glmmTMB_0_sum |
Estimated marginal means from all sampled data (25 sites) | glmmTMB |
mod_brms_0 |
brms model of all sampled data (25 sites) | |
brms_0_sum |
Estimated marginal means from all sampled data (25 sites) | brms |
mod_stan_0 |
stan model of all sampled data (25 sites) | |
stan_0_sum |
Estimated marginal means from all sampled data (25 sites) | stan |
mod_gbm_0 |
gbm model of all sampled data (25 sites) | |
gbm_0_sum |
Estimated predictions from all sampled data (25 sites) | gbm |
mod_gbm_0b |
gbm model with covariates of all sampled data (25 sites) | |
gbm_0b_sum |
Estimated predictions with covariates from all sampled data (25 sites) | gbm |
gbm_0c_sum |
Estimated predictions with covariates from all sampled data (all sites) | gbm |
mod_dbarts_0 |
bart model with covariates of all sampled ata (25 sites) | |
dbarts_0_sum |
Estimated predictions from all sampled data (25 sites) | dbarts |
mod_dbarts_0b |
bart model with covariates of all sampled ata (25 sites) | |
dbarts_0b_sum |
Estimated predictions with covariates from all sampled data (25 sites) | dbarts |
dbarts_0c_sum |
Estimated predictions with covariates from all data (all sites) | dbarts |
benthos_fixed_locs_obs_1 |
Replaced sampled data (25 sites) | |
mod_glmmTMB_1 |
glmmTMB model of replaced sampled data (25 sites) | |
glmmTMB_1_sum |
Estimated marginal means from replaced sampled data (25 sites) | glmmTMB |
mod_brms_1 |
brms model of replaced sampled data (25 sites) | |
brms_1_sum |
Estimated marginal means from replaced sampled data (25 sites) | brms |
mod_stan_1 |
stan model of replaced sampled data (25 sites) | |
stan_1_sum |
Estimated marginal means from replaced sampled data (25 sites) | stan |
mod_gbm_1 |
gbm model of replaced sampled data (25 sites) | |
gbm_1_sum |
Estimated predictions from replaced sampled data (25 sites) | gbm |
mod_gbm_1b |
gbm model with covariates of replaced sampled data (25 sites) | |
gbm_1b_sum |
Estimated predictions with covariates from replaced sampled data (25 sites) | gbm |
gbm_1c_sum |
Estimated predictions with covariates from replaced sampled data (replaced sites) | gbm |
mod_dbarts_1 |
bart model with covariates of replaced sampled ata (25 sites) | |
dbarts_1_sum |
Estimated predictions from replaced sampled data (25 sites) | dbarts |
mod_dbarts_1b |
bart model with covariates of replaced sampled ata (25 sites) | |
dbarts_1b_sum |
Estimated predictions with covariates from replaced sampled data (25 sites) | dbarts |
dbarts_1c_sum |
Estimated predictions with covariates from replaced data (all sites) | dbarts |
benthos_fixed_locs_obs_2 |
Replaced sampled data (5 sites) | |
mod_glmmTMB_2 |
glmmTMB model of replaced sampled data (5 sites) | |
glmmTMB_2_sum |
Estimated marginal means from replaced sampled data (5 sites) | glmmTMB |
mod_brms_2 |
brms model of replaced sampled data (5 sites) | |
brms_2_sum |
Estimated marginal means from replaced sampled data (5 sites) | brms |
mod_stan_2 |
stan model of replaced sampled data (5 sites) | |
stan_2_sum |
Estimated marginal means from replaced sampled data (5 sites) | stan |
mod_gbm_2 |
gbm model of replaced sampled data (5 sites) | |
gbm_2_sum |
Estimated predictions from replaced sampled data (5 sites) | gbm |
mod_gbm_2b |
gbm model with covariates of replaced sampled data (5 sites) | |
gbm_2b_sum |
Estimated predictions with covariates from replaced sampled data (5 sites) | gbm |
gbm_2c_sum |
Estimated predictions with covariates from replaced sampled data (replaced sites) | gbm |
mod_dbarts_2 |
bart model with covariates of replaced sampled ata (5 sites) | |
dbarts_2_sum |
Estimated predictions from replaced sampled data (5 sites) | dbarts |
mod_dbarts_2b |
bart model with covariates of replaced sampled ata (5 sites) | |
dbarts_2b_sum |
Estimated predictions with covariates from replaced sampled data (5 sites) | dbarts |
dbarts_2c_sum |
Estimated predictions with covariates from replaced data (all sites) | dbarts |
Model | Description |
---|---|
Simple | Simple hierarchical means/medians |
Show the code
glmmTMB_0_sum <-
mod_glmmTMB_0 |> emmeans(~fYear, at = newdata_0, type = "response") |>
as.data.frame() |>
rename(median = response, lower = asymp.LCL, upper = asymp.UCL) |>
mutate(Year = as.numeric(as.character(fYear))) |>
mutate(type = "glmmTMB")
saveRDS(glmmTMB_0_sum,
file = paste0(data_path, "synthetic/glmmTMB_0_sum.rds")
)
Show the code
g <- glmmTMB_0_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "glmmTMB")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_glmmTMB_0.png"
),
g,
width = 8, height = 6, dpi = 72
)
- sampled reefs (blue and green dashed lines) are lower than the true global average (red and khaki)
- glmmTMB modelled estimates are consistent with the 25 site same data, yet biased lower than the “truth”
Show the code
glmmTMB_0_dharma <- readRDS(
file = paste0(data_path, "synthetic/glmmTMB_0_dharma.rds")
)
g <- wrap_elements(~testUniformity(glmmTMB_0_dharma)) +
wrap_elements(~plotResiduals(glmmTMB_0_dharma)) +
wrap_elements(~testDispersion(glmmTMB_0_dharma))
ggsave(
filename = paste0(
fig_path, "R_dharma_mod_glmmTMB_0.png"
),
g,
width = 10, height = 4, dpi = 72
)
Show the code
Family: beta ( logit )
Formula: cover ~ fYear + (1 | Site) + (1 | Transect)
Data: benthos_fixed_locs_obs_0
AIC BIC logLik -2*log(L) df.resid
-13130.3 -13040.2 6580.2 -13160.3 2985
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
Site (Intercept) 0.08491 0.2914
Transect (Intercept) 0.02920 0.1709
Number of obs: 3000, groups: Site, 50; Transect, 250
Dispersion parameter for beta family (): 320
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.26554 0.04346 -29.122 < 2e-16 ***
fYear2 0.01799 0.01205 1.493 0.1353
fYear3 0.04983 0.01199 4.154 3.26e-05 ***
fYear4 0.10552 0.01192 8.853 < 2e-16 ***
fYear5 0.14555 0.01186 12.275 < 2e-16 ***
fYear6 0.18332 0.01180 15.539 < 2e-16 ***
fYear7 0.21699 0.01176 18.457 < 2e-16 ***
fYear8 0.18269 0.01180 15.486 < 2e-16 ***
fYear9 0.03129 0.01201 2.605 0.0092 **
fYear10 -0.06817 0.01218 -5.596 2.19e-08 ***
fYear11 -0.02343 0.01211 -1.935 0.0530 .
fYear12 -0.02145 0.01212 -1.770 0.0767 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show the code
benthos_fixed_locs_obs_0 |>
group_by(fYear) |>
summarise(mean_abundance = qlogis(mean(cover)),
median_abundance = qlogis(median(cover)),
sd_abundance = sd(qlogis(cover)) ,
mad_abundance = mad(qlogis(cover))
)
priors <- c(
prior(normal(0, 0.5), class = "b"),
prior(normal(-1.5, 0.5), class = "Intercept"),
prior(student_t(3, 0, 0.5), class = "sd")
)
mod_form <- bf(cover ~ fYear + (1 | Site) + (1 | Transect),
family = "Beta"
)
Show the code
brms_0_sum <-
mod_brms_0 |> emmeans(~fYear, at = newdata_0, type = "response") |>
as.data.frame() |>
rename(median = response, lower = lower.HPD, upper = upper.HPD) |>
mutate(Year = as.numeric(as.character(fYear))) |>
mutate(type = "brms")
saveRDS(brms_0_sum,
file = paste0(data_path, "synthetic/brms_0_sum.rds")
)
Show the code
brms_0_sum <- readRDS(
file = paste0(data_path, "synthetic/brms_0_sum.rds")
)
g <-
brms_0_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "brms")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_brms_0.png"
),
g,
width = 8, height = 6, dpi = 72
)
- sampled reefs (blue and green dashed lines) are lower than the true global average (red and khaki)
- brms modelled estimates are consistent with the 25 site same data, yet biased lower than the “truth”
Show the code
Family: beta
Links: mu = logit; phi = identity
Formula: cover ~ fYear + (1 | Site) + (1 | Transect)
Data: benthos_fixed_locs_obs_0 (Number of observations: 3000)
Draws: 3 chains, each with iter = 5000; warmup = 1000; thin = 5;
total post-warmup draws = 2400
Multilevel Hyperparameters:
~Site (Number of levels: 50)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.30 0.03 0.24 0.37 1.00 2084 2289
~Transect (Number of levels: 250)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.17 0.01 0.15 0.19 1.00 2193 2210
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -1.27 0.04 -1.35 -1.18 1.00 1643 2218
fYear2 0.02 0.01 -0.01 0.04 1.00 2004 2227
fYear3 0.05 0.01 0.03 0.07 1.00 2036 2327
fYear4 0.10 0.01 0.08 0.13 1.00 2206 2315
fYear5 0.15 0.01 0.12 0.17 1.00 2127 2197
fYear6 0.18 0.01 0.16 0.21 1.00 2213 2212
fYear7 0.22 0.01 0.19 0.24 1.00 2159 2181
fYear8 0.18 0.01 0.16 0.21 1.00 2177 2452
fYear9 0.03 0.01 0.01 0.06 1.00 2189 2136
fYear10 -0.07 0.01 -0.09 -0.04 1.00 2052 2191
fYear11 -0.02 0.01 -0.05 0.00 1.00 2272 2281
fYear12 -0.02 0.01 -0.05 0.00 1.00 2336 2369
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi 317.91 8.51 301.48 334.14 1.00 2363 2265
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Show the code
benthos_fixed_locs_obs_0 <-
benthos_fixed_locs_obs_0 |>
mutate(
cYear = fYear,
grid_id = factor(Reef),
cSite = factor(interaction(Reef, Site)),
cReplicate = ifelse(is.na(Transect),
interaction(Site, Year),
interaction(cSite, Transect)),
Cover = cover,
area = 1,
sum = 1
)
saveRDS(benthos_fixed_locs_obs_0,
file = paste0(data_path, "synthetic/saveRDS(benthos_fixed_locs_obs_0_forstan.rds")
)
stan_data <- prepare_data_for_stan(benthos_fixed_locs_obs_0, yrs = NULL)
model_stan <- cmdstanr::cmdstan_model(stan_file = "model1.stan")
## model_stan <- cmdstanr::cmdstan_model(stan_file = "mod1a.stan")
## model_stan <- cmdstanr::cmdstan_model(stan_file = "mod2a.stan")
Show the code
Show the code
stan_0_sum <-
mod_stan_0$draws(variables = "cellmeans") |>
posterior::as_draws_df() |>
posterior::summarise_draws(
median,
HDInterval::hdi,
~ HDInterval::hdi(., credMass = c(0.9)),
rhat,
ess_bulk,
ess_tail
) |>
rename(lower_90 = V4, upper_90 = V5) |>
bind_cols(newdata_0) |>
mutate(Year = as.numeric(as.character(fYear)))
saveRDS(stan_0_sum,
file = paste0(data_path, "synthetic/stan_0_sum.rds")
)
Show the code
stan_0_sum <- readRDS(
file = paste0(data_path, "synthetic/stan_0_sum.rds")
)
g <- stan_0_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "stan")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_stan_0.png"
),
g,
width = 8, height = 6, dpi = 72
)
- sampled reefs (blue and green dashed lines) are lower than the true global average (red and khaki)
- stan modelled estimates are consistent with the 25 site same data, yet biased lower than the “truth”
Show the code
mod_stan_0 <- readRDS(
file = paste0(data_path, "synthetic/mod_stan_0.rds")
)
color_scheme_set("viridis")
g <-
mod_stan_0$draws(variables = c("beta", "phi", "sd_1", "sd_2", "sd_3")) |>
mcmc_trace() +
theme_minimal()
ggsave(
filename = paste0(
fig_path, "R_stan_trace_0.png"
),
g,
width = 10, height = 8, dpi = 72
)
Show the code
mod_stan_0 <- readRDS(
file = paste0(data_path, "synthetic/mod_stan_0.rds")
)
color_scheme_set("viridis")
g <-
mod_stan_0$draws(variables = c("beta", "phi", "sd_1", "sd_2", "sd_3")) |>
mcmc_acf() +
theme_minimal()
ggsave(
filename = paste0(
fig_path, "R_stan_ac_0.png"
),
g,
width = 10, height = 8, dpi = 72
)
Show the code
mod_stan_0 <- readRDS(
file = paste0(data_path, "synthetic/mod_stan_0.rds")
)
g <-
bayesplot::pp_check(
benthos_fixed_locs_obs_0$cover,
mod_stan_0$draws("ypred", format = "matrix")[1:100, ],
ppc_dens_overlay
) +
theme_classic()
ggsave(
filename = paste0(
fig_path, "R_stan_ppc_0.png"
),
g,
width = 10, height = 8, dpi = 72
)
Show the code
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
phi 317.92 317.93 8.69 8.74 303.53 331.90 1.00 2854 2685
beta[1] -1.26 -1.26 0.05 0.04 -1.33 -1.18 1.00 2148 2386
beta[2] -1.24 -1.24 0.05 0.05 -1.32 -1.17 1.00 2210 2406
beta[3] -1.21 -1.21 0.05 0.05 -1.28 -1.13 1.00 2205 2393
beta[4] -1.15 -1.15 0.05 0.05 -1.23 -1.08 1.00 2164 2251
beta[5] -1.11 -1.11 0.05 0.05 -1.19 -1.04 1.00 2186 2374
beta[6] -1.08 -1.08 0.05 0.05 -1.15 -1.00 1.00 2149 2254
beta[7] -1.04 -1.04 0.05 0.05 -1.12 -0.97 1.00 2164 2571
beta[8] -1.08 -1.08 0.05 0.05 -1.15 -1.00 1.00 2168 2299
beta[9] -1.23 -1.23 0.05 0.05 -1.30 -1.15 1.00 2170 2365
beta[10] -1.33 -1.32 0.05 0.05 -1.40 -1.25 1.00 2174 2289
beta[11] -1.28 -1.28 0.05 0.05 -1.36 -1.21 1.00 2168 2391
beta[12] -1.28 -1.28 0.05 0.05 -1.35 -1.20 1.00 2206 2487
sd_1[1] 0.30 0.29 0.05 0.05 0.22 0.39 1.00 2892 2909
sd_2[1] 0.09 0.09 0.03 0.02 0.05 0.14 1.00 1525 1450
sd_3[1] 0.17 0.17 0.01 0.01 0.16 0.19 1.00 2469 2860
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
cellmeans[1] 0.22 0.22 0.01 0.01 0.21 0.23 1.00 2148 2386
cellmeans[2] 0.22 0.22 0.01 0.01 0.21 0.24 1.00 2210 2393
cellmeans[3] 0.23 0.23 0.01 0.01 0.22 0.24 1.00 2205 2393
cellmeans[4] 0.24 0.24 0.01 0.01 0.23 0.25 1.00 2164 2251
cellmeans[5] 0.25 0.25 0.01 0.01 0.23 0.26 1.00 2186 2374
cellmeans[6] 0.25 0.25 0.01 0.01 0.24 0.27 1.00 2149 2254
cellmeans[7] 0.26 0.26 0.01 0.01 0.25 0.28 1.00 2164 2571
cellmeans[8] 0.25 0.25 0.01 0.01 0.24 0.27 1.00 2168 2299
cellmeans[9] 0.23 0.23 0.01 0.01 0.21 0.24 1.00 2170 2365
cellmeans[10] 0.21 0.21 0.01 0.01 0.20 0.22 1.00 2174 2289
cellmeans[11] 0.22 0.22 0.01 0.01 0.20 0.23 1.00 2168 2391
cellmeans[12] 0.22 0.22 0.01 0.01 0.21 0.23 1.00 2207 2487
Show the code
gbm_0_sum <- readRDS(
file = paste0(data_path, "synthetic/gbm_0_sum.rds")
)
g <-
gbm_0_sum |>
ggplot() +
geom_line(aes(x = Year, y = median, color = "gbm")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_gbm_0.png"
),
g,
width = 8, height = 6, dpi = 72
)
- sampled reefs (blue and green dashed lines) are lower than the true global average (red and khaki)
- gbm modelled estimates are consistent with the 25 site same data, yet biased lower than the “truth”
- gbm derived predictions lack uncertainty estimates
Show the code
mod_gbm_0b <- gbm(cover ~ fYear + Latitude + Longitude + CYC + DHW + OTHER,
data = benthos_fixed_locs_obs_0,
distribution = "gaussian",
var.monotone = c(0, 0, 0, -1, -1, -1),
n.trees = 10000,
interaction.depth = 5,
shrinkage = 0.001,
bag.fraction = 0.5,
cv.folds = 5,
verbose = TRUE
)
saveRDS(mod_gbm_0b,
file = paste0(data_path, "synthetic/mod_gbm_0b.rds")
)
Show the code
newdata_0b <- benthos_fixed_locs_obs_0
gbm_0b_sum <- newdata_0b |>
mutate(median = predict(mod_gbm_0b, newdata_0b, n.trees = n.trees, type = "response")) |>
mutate(Year = as.numeric(as.character(fYear))) |>
## group_by(Year, Site, Transect) |>
## summarise(median = median(median)) |>
## group_by(Year, Site, .add = FALSE) |>
## summarise(median = median(median)) |>
group_by(Year, .add = FALSE) |>
summarise(median = median(median)) |>
mutate(type = "gbm")
saveRDS(gbm_0b_sum,
file = paste0(data_path, "synthetic/gbm_0b_sum.rds")
)
- sampled reefs (blue and green dashed lines) are lower than the true global average (red and khaki)
- gbm modelled estimates are consistent with the 25 site same data, yet biased lower than the “truth”
- gbm derived predictions lack uncertainty estimates
- relative influences mainly space/time
Show the code
dbarts_0c_sum <- preds_0c |>
t() |>
as_tibble(.name_repair = ~ paste0("V", seq_along(.))) |>
bind_cols(benthos_reefs_sf) |>
pivot_longer(
cols = matches("^V[0-9]*"),
names_to = ".draw", values_to = "fit"
) |>
group_by(Year, .draw) |>
summarise(fit = mean(fit)) |>
## dplyr::select(Year, fit, .draw) |>
group_by(Year, .add = FALSE) |>
summarise_draws(median, HDInterval::hdi)
saveRDS(dbarts_0c_sum,
file = paste0(data_path, "synthetic/dbarts_0c_sum.rds")
)
- sampled reefs (blue and green dashed lines) are lower than the true global average (red and khaki)
- gbm modelled estimates are consistent with the 25 site same data, yet biased lower than the “truth”
- gbm derived predictions lack uncertainty estimates
- relative influences mainly space/time
Show the code
dbarts_0b_sum <- preds |>
t() |>
as_tibble(.name_repair = ~ paste0("V", seq_along(.))) |>
bind_cols(benthos_fixed_locs_obs_0) |>
pivot_longer(
cols = matches("^V[0-9]*"),
names_to = ".draw", values_to = "fit"
) |>
group_by(Year, .draw) |>
summarise(fit = mean(fit)) |>
## dplyr::select(Year, fit, .draw) |>
group_by(Year, .add = FALSE) |>
summarise_draws(median, HDInterval::hdi)
saveRDS(dbarts_0b_sum,
file = paste0(data_path, "synthetic/dbarts_0b_sum.rds")
)
- sampled reefs (blue and green dashed lines) are lower than the true global average (red and khaki)
- gbm modelled estimates are consistent with the 25 site same data, yet biased lower than the “truth”
- gbm derived predictions lack uncertainty estimates
- sampled reefs (blue and green dashed lines) are lower than the true global average (red and khaki)
- dbarts modelled estimates are consistent with the 25 site same data, yet biased lower than the “truth”
- sampled reefs (blue and green dashed lines) are lower than the true global average (red and khaki)
- dbarts modelled estimates are slightly elevated above the 25 site same data, yet still biased lower than the “truth”
Show the code
glmmTMB_1_sum <-
mod_glmmTMB_1 |> emmeans(~fYear, at = newdata_1, type = "response") |>
as.data.frame() |>
rename(median = response, lower = asymp.LCL, upper = asymp.UCL) |>
mutate(Year = as.numeric(as.character(fYear))) |>
mutate(type = "glmmTMB")
saveRDS(glmmTMB_1_sum,
file = paste0(data_path, "synthetic/glmmTMB_1_sum.rds")
)
Show the code
g1 <- glmmTMB_1_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "glmmTMB")) +
geom_line(data = mod_simple_1,
aes(x = Year, y = Mean, colour = "simple data mean"), linetype = "dashed") +
geom_line(data = mod_simple_1,
aes(x = Year, y = Median, colour = "simple data median"), linetype = "dashed") +
## geom_line(data = all_sampled_sum,
## aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
g2 <- glmmTMB_1_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "glmmTMB")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "true mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "true median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_glmmTMB_1.png"
),
g1 + g2,
width = 12, height = 6, dpi = 72
)
- left figure compares glmmTMB modelled estimates to the sample data
- right figure compares glmmTMB modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and khaki dashed lines) are lower than the true global average (blue and purple) - right side figure
- glmmTMB modelled estimates are consistent with the 25 site, replacement data, yet biased lower than the “truth”
Show the code
glmmTMB_1_dharma <- readRDS(
file = paste0(data_path, "synthetic/glmmTMB_1_dharma.rds")
)
g <- wrap_elements(~testUniformity(glmmTMB_1_dharma)) +
wrap_elements(~plotResiduals(glmmTMB_1_dharma)) +
wrap_elements(~testDispersion(glmmTMB_1_dharma))
ggsave(
filename = paste0(
fig_path, "R_dharma_mod_glmmTMB_1.png"
),
g,
width = 10, height = 4, dpi = 72
)
Show the code
Family: beta ( logit )
Formula: cover ~ fYear + (1 | Site) + (1 | Transect)
Data: benthos_fixed_locs_obs_1
AIC BIC logLik -2*log(L) df.resid
-12668.6 -12579.1 6349.3 -12698.6 2865
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
Site (Intercept) 0.08257 0.2874
Transect (Intercept) 0.02895 0.1701
Number of obs: 2880, groups: Site, 50; Transect, 250
Dispersion parameter for beta family (): 328
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.25922 0.04293 -29.331 < 2e-16 ***
fYear2 0.01431 0.01219 1.174 0.240579
fYear3 0.04112 0.01215 3.384 0.000714 ***
fYear4 0.09460 0.01208 7.834 4.73e-15 ***
fYear5 0.13212 0.01202 10.993 < 2e-16 ***
fYear6 0.16764 0.01196 14.017 < 2e-16 ***
fYear7 0.19975 0.01192 16.758 < 2e-16 ***
fYear8 0.16788 0.01196 14.039 < 2e-16 ***
fYear9 0.01501 0.01219 1.232 0.217997
fYear10 -0.07794 0.01234 -6.318 2.64e-10 ***
fYear11 -0.03260 0.01226 -2.658 0.007852 **
fYear12 -0.03354 0.01228 -2.731 0.006320 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show the code
benthos_fixed_locs_obs_1 |>
group_by(fYear) |>
summarise(mean_abundance = qlogis(mean(cover)),
median_abundance = qlogis(median(cover)),
sd_abundance = sd(qlogis(cover)) ,
mad_abundance = mad(qlogis(cover))
)
priors <- c(
prior(normal(0, 0.5), class = "b"),
prior(normal(-1.5, 0.5), class = "Intercept"),
prior(student_t(3, 0, 0.5), class = "sd")
)
mod_form <- bf(cover ~ fYear + (1 | Site) + (1 | Transect),
family = "Beta"
)
Show the code
brms_1_sum <-
mod_brms_1 |> emmeans(~fYear, at = newdata_1, type = "response") |>
as.data.frame() |>
rename(median = response, lower = lower.HPD, upper = upper.HPD) |>
mutate(Year = as.numeric(as.character(fYear))) |>
mutate(type = "brms")
saveRDS(brms_1_sum,
file = paste0(data_path, "synthetic/brms_1_sum.rds")
)
Show the code
brms_1_sum <- readRDS(
file = paste0(data_path, "synthetic/brms_1_sum.rds")
)
g1 <-
brms_1_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "brms")) +
geom_line(data = mod_simple_1,
aes(x = Year, y = Mean, colour = "simple data mean"), linetype = "dashed") +
geom_line(data = mod_simple_1,
aes(x = Year, y = Median, colour = "simple data median"), linetype = "dashed") +
theme_bw()
g2 <-
brms_1_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "brms")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_brms_1.png"
),
g1 + g2,
width = 12, height = 6, dpi = 72
)
- left figure compares brms modelled estimates to the sample data
- right figure compares brms modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- brms modelled estimates are consistent with the 25 site, replacement data, yet biased lower than the “truth”
Show the code
Family: beta
Links: mu = logit; phi = identity
Formula: cover ~ fYear + (1 | Site) + (1 | Transect)
Data: benthos_fixed_locs_obs_1 (Number of observations: 2880)
Draws: 3 chains, each with iter = 5000; warmup = 1000; thin = 5;
total post-warmup draws = 2400
Multilevel Hyperparameters:
~Site (Number of levels: 50)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.30 0.03 0.24 0.38 1.00 2200 2249
~Transect (Number of levels: 250)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.17 0.01 0.15 0.19 1.00 2226 2052
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -1.26 0.04 -1.35 -1.17 1.00 2547 2162
fYear2 0.01 0.01 -0.01 0.04 1.00 2269 2234
fYear3 0.04 0.01 0.02 0.07 1.00 2351 2286
fYear4 0.09 0.01 0.07 0.12 1.00 2536 2363
fYear5 0.13 0.01 0.11 0.16 1.00 2438 2411
fYear6 0.17 0.01 0.14 0.19 1.00 2490 2057
fYear7 0.20 0.01 0.18 0.22 1.00 2513 2060
fYear8 0.17 0.01 0.14 0.19 1.00 2499 2288
fYear9 0.01 0.01 -0.01 0.04 1.00 2467 2180
fYear10 -0.08 0.01 -0.10 -0.05 1.00 2441 2345
fYear11 -0.03 0.01 -0.06 -0.01 1.00 2368 2409
fYear12 -0.03 0.01 -0.06 -0.01 1.00 2514 2325
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi 325.42 8.69 308.35 342.07 1.00 1910 2056
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Show the code
benthos_fixed_locs_obs_1 <-
benthos_fixed_locs_obs_1 |>
mutate(
cYear = fYear,
grid_id = factor(Reef),
cSite = factor(interaction(Reef, Site)),
cReplicate = ifelse(is.na(Transect),
interaction(Site, Year),
interaction(cSite, Transect)),
Cover = cover,
area = 1,
sum = 1
)
saveRDS(benthos_fixed_locs_obs_1,
file = paste0(data_path, "synthetic/saveRDS(benthos_fixed_locs_obs_1_forstan.rds")
)
stan_data <- prepare_data_for_stan(benthos_fixed_locs_obs_1, yrs = NULL)
model_stan <- cmdstanr::cmdstan_model(stan_file = "model1.stan")
## model_stan <- cmdstanr::cmdstan_model(stan_file = "mod1a.stan")
## model_stan <- cmdstanr::cmdstan_model(stan_file = "mod2a.stan")
Show the code
Show the code
stan_1_sum <-
mod_stan_1$draws(variables = "cellmeans") |>
posterior::as_draws_df() |>
posterior::summarise_draws(
median,
HDInterval::hdi,
~ HDInterval::hdi(., credMass = c(0.9)),
rhat,
ess_bulk,
ess_tail
) |>
rename(lower_90 = V4, upper_90 = V5) |>
bind_cols(newdata_1) |>
mutate(Year = as.numeric(as.character(fYear)))
saveRDS(stan_1_sum,
file = paste0(data_path, "synthetic/stan_1_sum.rds")
)
Show the code
stan_1_sum <- readRDS(
file = paste0(data_path, "synthetic/stan_1_sum.rds")
)
g1 <- stan_1_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "stan")) +
geom_line(data = mod_simple_1,
aes(x = Year, y = Mean, colour = "simple data mean"), linetype = "dashed") +
geom_line(data = mod_simple_1,
aes(x = Year, y = Median, colour = "simple data median"), linetype = "dashed") +
theme_bw()
g2 <- stan_1_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "stan")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_stan_1.png"
),
g1 + g2,
width = 12, height = 6, dpi = 72
)
- left figure compares stan modelled estimates to the sample data
- right figure compares stan modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- stan modelled estimates are consistent with the 25 site, replacement data, yet biased lower than the “truth”
Show the code
mod_stan_1 <- readRDS(
file = paste0(data_path, "synthetic/mod_stan_1.rds")
)
color_scheme_set("viridis")
g <-
mod_stan_1$draws(variables = c("beta", "phi", "sd_1", "sd_2", "sd_3")) |>
mcmc_trace() +
theme_minimal()
ggsave(
filename = paste0(
fig_path, "R_stan_trace_1.png"
),
g,
width = 10, height = 8, dpi = 72
)
Show the code
mod_stan_1 <- readRDS(
file = paste0(data_path, "synthetic/mod_stan_1.rds")
)
color_scheme_set("viridis")
g <-
mod_stan_1$draws(variables = c("beta", "phi", "sd_1", "sd_2", "sd_3")) |>
mcmc_acf() +
theme_minimal()
ggsave(
filename = paste0(
fig_path, "R_stan_ac_1.png"
),
g,
width = 10, height = 8, dpi = 72
)
Show the code
mod_stan_1 <- readRDS(
file = paste0(data_path, "synthetic/mod_stan_1.rds")
)
g <-
bayesplot::pp_check(
benthos_fixed_locs_obs_1$cover,
mod_stan_1$draws("ypred", format = "matrix")[1:100, ],
ppc_dens_overlay
) +
theme_classic()
ggsave(
filename = paste0(
fig_path, "R_stan_ppc_1.png"
),
g,
width = 10, height = 8, dpi = 72
)
Show the code
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
phi 325.64 325.55 8.82 8.87 311.07 340.37 1.00 2928 2945
beta[1] -1.26 -1.26 0.04 0.04 -1.33 -1.19 1.00 1909 2397
beta[2] -1.25 -1.25 0.04 0.04 -1.32 -1.18 1.00 1876 2383
beta[3] -1.22 -1.22 0.04 0.04 -1.29 -1.15 1.00 1905 2370
beta[4] -1.17 -1.17 0.04 0.04 -1.24 -1.09 1.00 1932 2295
beta[5] -1.13 -1.13 0.04 0.04 -1.20 -1.06 1.00 1879 2198
beta[6] -1.10 -1.10 0.04 0.04 -1.17 -1.02 1.00 1899 2318
beta[7] -1.06 -1.06 0.04 0.04 -1.14 -0.99 1.00 1953 2448
beta[8] -1.10 -1.10 0.04 0.04 -1.17 -1.02 1.00 1920 2298
beta[9] -1.25 -1.25 0.04 0.04 -1.32 -1.17 1.00 1912 2453
beta[10] -1.34 -1.34 0.04 0.04 -1.41 -1.26 1.00 1880 2133
beta[11] -1.30 -1.30 0.04 0.04 -1.37 -1.22 1.00 1914 2238
beta[12] -1.30 -1.30 0.04 0.04 -1.37 -1.22 1.00 1835 2141
sd_1[1] 0.29 0.29 0.05 0.05 0.22 0.38 1.00 2775 2808
sd_2[1] 0.09 0.09 0.03 0.02 0.05 0.14 1.00 1503 1960
sd_3[1] 0.17 0.17 0.01 0.01 0.16 0.19 1.00 2510 2905
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
cellmeans[1] 0.22 0.22 0.01 0.01 0.21 0.23 1.00 1909 2397
cellmeans[2] 0.22 0.22 0.01 0.01 0.21 0.24 1.00 1876 2383
cellmeans[3] 0.23 0.23 0.01 0.01 0.22 0.24 1.00 1905 2370
cellmeans[4] 0.24 0.24 0.01 0.01 0.22 0.25 1.00 1932 2295
cellmeans[5] 0.24 0.24 0.01 0.01 0.23 0.26 1.00 1879 2198
cellmeans[6] 0.25 0.25 0.01 0.01 0.24 0.26 1.00 1899 2318
cellmeans[7] 0.26 0.26 0.01 0.01 0.24 0.27 1.00 1953 2448
cellmeans[8] 0.25 0.25 0.01 0.01 0.24 0.26 1.00 1920 2298
cellmeans[9] 0.22 0.22 0.01 0.01 0.21 0.24 1.00 1912 2453
cellmeans[10] 0.21 0.21 0.01 0.01 0.20 0.22 1.00 1880 2133
cellmeans[11] 0.22 0.21 0.01 0.01 0.20 0.23 1.00 1914 2238
cellmeans[12] 0.21 0.21 0.01 0.01 0.20 0.23 1.00 1835 2141
Show the code
gbm_1_sum <- readRDS(
file = paste0(data_path, "synthetic/gbm_1_sum.rds")
)
g1 <-
gbm_1_sum |>
ggplot() +
geom_line(aes(x = Year, y = median, color = "gbm")) +
geom_line(data = mod_simple_1,
aes(x = Year, y = Mean, colour = "simple data mean"), linetype = "dashed") +
geom_line(data = mod_simple_1,
aes(x = Year, y = Median, colour = "simple data median"), linetype = "dashed") +
theme_bw()
g2 <-
gbm_1_sum |>
ggplot() +
geom_line(aes(x = Year, y = median, color = "gbm")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_gbm_1.png"
),
g1 + g2,
width = 12, height = 6, dpi = 72
)
- left figure compares gbm modelled estimates to the sample data
- right figure compares gbm modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- gbm modelled estimates are consistent with the 25 site, replacement data, yet biased lower than the “truth”
- gbm derived predictions lack uncertainty estimates
Show the code
mod_gbm_1b <- gbm(cover ~ fYear + Latitude + Longitude + CYC + DHW + OTHER,
data = benthos_fixed_locs_obs_1,
distribution = "gaussian",
var.monotone = c(0, 0, 0, -1, -1, -1),
n.trees = 10000,
interaction.depth = 5,
shrinkage = 0.001,
bag.fraction = 0.5,
cv.folds = 5,
verbose = TRUE
)
saveRDS(mod_gbm_1b,
file = paste0(data_path, "synthetic/mod_gbm_1b.rds")
)
Show the code
newdata_1b <- benthos_fixed_locs_obs_1
gbm_1b_sum <- newdata_1b |>
mutate(median = predict(mod_gbm_1b, newdata_1b, n.trees = n.trees, type = "response")) |>
mutate(Year = as.numeric(as.character(fYear))) |>
## group_by(Year, Site, Transect) |>
## summarise(median = median(median)) |>
## group_by(Year, Site, .add = FALSE) |>
## summarise(median = median(median)) |>
group_by(Year, .add = FALSE) |>
summarise(median = median(median)) |>
mutate(type = "gbm")
saveRDS(gbm_1b_sum,
file = paste0(data_path, "synthetic/gbm_1b_sum.rds")
)
- left figure compares gbm modelled estimates to the sample data
- right figure compares gbm modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- gbm modelled estimates are consistent with the 25 site, replacement data, yet biased lower than the “truth”
- gbm derived predictions lack uncertainty estimates
- relative influences mainly space/time
Show the code
dbarts_1c_sum <- preds_1c |>
t() |>
as_tibble(.name_repair = ~ paste0("V", seq_along(.))) |>
bind_cols(benthos_reefs_sf) |>
pivot_longer(
cols = matches("^V[0-9]*"),
names_to = ".draw", values_to = "fit"
) |>
group_by(Year, .draw) |>
summarise(fit = mean(fit)) |>
## dplyr::select(Year, fit, .draw) |>
group_by(Year, .add = FALSE) |>
summarise_draws(median, HDInterval::hdi)
saveRDS(dbarts_1c_sum,
file = paste0(data_path, "synthetic/dbarts_1c_sum.rds")
)
- left figure compares gbm modelled estimates to the sample data
- right figure compares gbm modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- gbm modelled estimates are consistent with the 25 site, replacement data, yet biased lower than the “truth”
- gbm derived predictions lack uncertainty estimates
- relative influences mainly space/time (though note that this is because this partial is the result of the same model, just predicted onto the full data grid)
Show the code
dbarts_1b_sum <- preds |>
t() |>
as_tibble(.name_repair = ~ paste0("V", seq_along(.))) |>
bind_cols(benthos_fixed_locs_obs_1) |>
pivot_longer(
cols = matches("^V[0-9]*"),
names_to = ".draw", values_to = "fit"
) |>
group_by(Year, .draw) |>
summarise(fit = mean(fit)) |>
## dplyr::select(Year, fit, .draw) |>
group_by(Year, .add = FALSE) |>
summarise_draws(median, HDInterval::hdi)
saveRDS(dbarts_1b_sum,
file = paste0(data_path, "synthetic/dbarts_1b_sum.rds")
)
- left figure compares dbarts modelled estimates to the sample data
- right figure compares dbarts modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- dbarts modelled estimates are slightly lower than the 25 site, replacement data, and thus biased lower than the “truth”
- left figure compares dbarts modelled estimates to the sample data
- right figure compares dbarts modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- dbarts modelled estimates are consistent with the 25 site, replacement data, yet biased lower than the “truth”
- relative influences mainly space/time
- left figure compares dbarts modelled estimates to the sample data
- right figure compares dbarts modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- dbarts modelled estimates are slightly higher than the 25 site, replacement data, yet biased lower than the “truth”
- relative influences mainly space/time (though note that this is because this partial is the result of the same model, just predicted onto the full data grid)
Show the code
glmmTMB_2_sum <-
mod_glmmTMB_2 |> emmeans(~fYear, at = newdata_2, type = "response") |>
as.data.frame() |>
rename(median = response, lower = asymp.LCL, upper = asymp.UCL) |>
mutate(Year = as.numeric(as.character(fYear))) |>
mutate(type = "glmmTMB")
saveRDS(glmmTMB_2_sum,
file = paste0(data_path, "synthetic/glmmTMB_2_sum.rds")
)
Show the code
g1 <- glmmTMB_2_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "glmmTMB")) +
geom_line(data = mod_simple_2,
aes(x = Year, y = Mean, colour = "simple data mean"), linetype = "dashed") +
geom_line(data = mod_simple_2,
aes(x = Year, y = Median, colour = "simple data median"), linetype = "dashed") +
theme_bw()
g2 <- glmmTMB_2_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "glmmTMB")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_glmmTMB_2.png"
),
g1 + g2,
width = 12, height = 6, dpi = 72
)
- left figure compares glmmTMB modelled estimates to the sample data
- right figure compares glmmTMB modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- glmmTMB modelled estimates are slightly higher than the 5 site, replacement data, yet biased lower than the “truth”
- the uncertainty is relatively high
Show the code
glmmTMB_2_dharma <- readRDS(
file = paste0(data_path, "synthetic/glmmTMB_2_dharma.rds")
)
g <- wrap_elements(~testUniformity(glmmTMB_2_dharma)) +
wrap_elements(~plotResiduals(glmmTMB_2_dharma)) +
wrap_elements(~testDispersion(glmmTMB_2_dharma))
ggsave(
filename = paste0(
fig_path, "R_dharma_mod_glmmTMB_2.png"
),
g,
width = 10, height = 4, dpi = 72
)
Show the code
Family: beta ( logit )
Formula: cover ~ fYear + (1 | Site) + (1 | Transect)
Data: benthos_fixed_locs_obs_2
AIC BIC logLik -2*log(L) df.resid
-2114.5 -2051.9 1072.2 -2144.5 465
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
Site (Intercept) 0.11926 0.3453
Transect (Intercept) 0.02793 0.1671
Number of obs: 480, groups: Site, 10; Transect, 50
Dispersion parameter for beta family (): 378
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.131216 0.113515 -9.965 < 2e-16 ***
fYear2 -0.010299 0.027511 -0.374 0.708140
fYear3 -0.004781 0.027507 -0.174 0.862022
fYear4 0.047843 0.027351 1.749 0.080248 .
fYear5 0.091099 0.027192 3.350 0.000807 ***
fYear6 0.159378 0.026997 5.904 3.56e-09 ***
fYear7 0.131914 0.027094 4.869 1.12e-06 ***
fYear8 0.121663 0.027124 4.485 7.28e-06 ***
fYear9 -0.048782 0.027734 -1.759 0.078584 .
fYear10 -0.118284 0.028027 -4.220 2.44e-05 ***
fYear11 -0.068882 0.027907 -2.468 0.013577 *
fYear12 -0.101854 0.028192 -3.613 0.000303 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show the code
benthos_fixed_locs_obs_2 |>
group_by(fYear) |>
summarise(mean_abundance = qlogis(mean(cover)),
median_abundance = qlogis(median(cover)),
sd_abundance = sd(qlogis(cover)) ,
mad_abundance = mad(qlogis(cover))
)
priors <- c(
prior(normal(0, 0.5), class = "b"),
prior(normal(-1.5, 0.5), class = "Intercept"),
prior(student_t(3, 0, 0.5), class = "sd")
)
mod_form <- bf(cover ~ fYear + (1 | Site) + (1 | Transect),
family = "Beta"
)
Show the code
brms_2_sum <-
mod_brms_2 |> emmeans(~fYear, at = newdata_2, type = "response") |>
as.data.frame() |>
rename(median = response, lower = lower.HPD, upper = upper.HPD) |>
mutate(Year = as.numeric(as.character(fYear))) |>
mutate(type = "brms")
saveRDS(brms_2_sum,
file = paste0(data_path, "synthetic/brms_2_sum.rds")
)
Show the code
brms_2_sum <- readRDS(
file = paste0(data_path, "synthetic/brms_2_sum.rds")
)
g1 <-
brms_2_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "brms")) +
geom_line(data = mod_simple_2,
aes(x = Year, y = Mean, colour = "simple data mean"), linetype = "dashed") +
geom_line(data = mod_simple_2,
aes(x = Year, y = Median, colour = "simple data median"), linetype = "dashed") +
theme_bw()
g2 <-
brms_2_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "brms")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_brms_2.png"
),
g1 + g2,
width = 12, height = 6, dpi = 72
)
- left figure compares brms modelled estimates to the sample data
- right figure compares brms modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- brms modelled estimates are slightly higher than the 5 site, replacement data, yet biased lower than the “truth”
- the uncertainty is relatively high
Show the code
Family: beta
Links: mu = logit; phi = identity
Formula: cover ~ fYear + (1 | Site) + (1 | Transect)
Data: benthos_fixed_locs_obs_2 (Number of observations: 480)
Draws: 3 chains, each with iter = 5000; warmup = 1000; thin = 5;
total post-warmup draws = 2400
Multilevel Hyperparameters:
~Site (Number of levels: 10)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.40 0.11 0.24 0.65 1.00 2192 2106
~Transect (Number of levels: 50)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.17 0.02 0.14 0.22 1.00 1984 2330
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -1.15 0.13 -1.42 -0.90 1.00 2414 2273
fYear2 -0.01 0.03 -0.06 0.04 1.00 2338 2430
fYear3 -0.00 0.03 -0.06 0.05 1.00 2524 2270
fYear4 0.05 0.03 -0.00 0.10 1.00 2165 2134
fYear5 0.09 0.03 0.04 0.15 1.00 2360 2410
fYear6 0.16 0.03 0.11 0.21 1.00 2515 2306
fYear7 0.13 0.03 0.08 0.18 1.00 2454 2311
fYear8 0.12 0.03 0.07 0.18 1.00 2560 2266
fYear9 -0.05 0.03 -0.10 0.01 1.00 2075 2061
fYear10 -0.12 0.03 -0.18 -0.06 1.00 2571 2322
fYear11 -0.07 0.03 -0.12 -0.01 1.00 2375 2369
fYear12 -0.10 0.03 -0.16 -0.04 1.00 2267 2270
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi 361.86 25.09 314.73 413.59 1.00 2414 2080
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Show the code
benthos_fixed_locs_obs_2 <-
benthos_fixed_locs_obs_2 |>
mutate(
cYear = fYear,
grid_id = factor(Reef),
cSite = factor(interaction(Reef, Site)),
cReplicate = ifelse(is.na(Transect),
interaction(Site, Year),
interaction(cSite, Transect)),
Cover = cover,
area = 1,
sum = 1
)
saveRDS(benthos_fixed_locs_obs_2,
file = paste0(data_path, "synthetic/saveRDS(benthos_fixed_locs_obs_2_forstan.rds")
)
stan_data <- prepare_data_for_stan(benthos_fixed_locs_obs_2, yrs = NULL)
model_stan <- cmdstanr::cmdstan_model(stan_file = "model1.stan")
## model_stan <- cmdstanr::cmdstan_model(stan_file = "mod1a.stan")
## model_stan <- cmdstanr::cmdstan_model(stan_file = "mod2a.stan")
Show the code
Show the code
stan_2_sum <-
mod_stan_2$draws(variables = "cellmeans") |>
posterior::as_draws_df() |>
posterior::summarise_draws(
median,
HDInterval::hdi,
~ HDInterval::hdi(., credMass = c(0.9)),
rhat,
ess_bulk,
ess_tail
) |>
rename(lower_90 = V4, upper_90 = V5) |>
bind_cols(newdata_2) |>
mutate(Year = as.numeric(as.character(fYear)))
saveRDS(stan_2_sum,
file = paste0(data_path, "synthetic/stan_2_sum.rds")
)
Show the code
stan_2_sum <- readRDS(
file = paste0(data_path, "synthetic/stan_2_sum.rds")
)
g1 <- stan_2_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "stan")) +
geom_line(data = mod_simple_2,
aes(x = Year, y = Mean, colour = "simple data mean"), linetype = "dashed") +
geom_line(data = mod_simple_2,
aes(x = Year, y = Median, colour = "simple data median"), linetype = "dashed") +
theme_bw()
g2 <- stan_2_sum |>
ggplot() +
geom_ribbon(aes(x = Year, ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(aes(x = Year, y = median, color = "stan")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_stan_2.png"
),
g1 + g2,
width = 12, height = 6, dpi = 72
)
- left figure compares brms modelled estimates to the sample data
- right figure compares brms modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- brms modelled estimates are slightly higher than the 5 site, replacement data, yet biased lower than the “truth”
- the uncertainty is relatively high, but not as high as brms or glmmTMB
Show the code
mod_stan_2 <- readRDS(
file = paste0(data_path, "synthetic/mod_stan_2.rds")
)
color_scheme_set("viridis")
g <-
mod_stan_2$draws(variables = c("beta", "phi", "sd_2", "sd_2", "sd_3")) |>
mcmc_trace() +
theme_minimal()
ggsave(
filename = paste0(
fig_path, "R_stan_trace_2.png"
),
g,
width = 10, height = 8, dpi = 72
)
Show the code
mod_stan_2 <- readRDS(
file = paste0(data_path, "synthetic/mod_stan_2.rds")
)
color_scheme_set("viridis")
g <-
mod_stan_2$draws(variables = c("beta", "phi", "sd_2", "sd_2", "sd_3")) |>
mcmc_acf() +
theme_minimal()
ggsave(
filename = paste0(
fig_path, "R_stan_ac_2.png"
),
g,
width = 10, height = 8, dpi = 72
)
Show the code
mod_stan_2 <- readRDS(
file = paste0(data_path, "synthetic/mod_stan_2.rds")
)
g <-
bayesplot::pp_check(
benthos_fixed_locs_obs_2$cover,
mod_stan_2$draws("ypred", format = "matrix")[1:100, ],
ppc_dens_overlay
) +
theme_classic()
ggsave(
filename = paste0(
fig_path, "R_stan_ppc_2.png"
),
g,
width = 10, height = 8, dpi = 72
)
Show the code
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
phi 361.47 360.40 24.95 24.24 321.72 405.20 1.00 2875 2749
beta[1] -1.21 -1.21 0.06 0.06 -1.31 -1.10 1.00 1887 2322
beta[2] -1.21 -1.21 0.06 0.06 -1.32 -1.11 1.00 1906 2203
beta[3] -1.21 -1.21 0.06 0.06 -1.31 -1.10 1.00 1959 2377
beta[4] -1.16 -1.16 0.07 0.06 -1.26 -1.05 1.00 1839 2060
beta[5] -1.11 -1.12 0.07 0.06 -1.22 -1.00 1.00 1933 2278
beta[6] -1.05 -1.05 0.07 0.06 -1.16 -0.94 1.00 1860 2147
beta[7] -1.08 -1.08 0.07 0.06 -1.18 -0.97 1.00 1916 2408
beta[8] -1.09 -1.09 0.07 0.06 -1.19 -0.98 1.00 1898 2322
beta[9] -1.25 -1.25 0.07 0.06 -1.36 -1.14 1.00 1858 2493
beta[10] -1.32 -1.32 0.07 0.06 -1.42 -1.21 1.00 1862 2209
beta[11] -1.27 -1.28 0.07 0.06 -1.38 -1.16 1.00 1907 2227
beta[12] -1.30 -1.31 0.07 0.06 -1.41 -1.19 1.00 1978 2210
sd_2[1] 0.14 0.12 0.09 0.07 0.03 0.32 1.00 2015 2457
sd_3[1] 0.17 0.17 0.02 0.02 0.14 0.21 1.00 2645 2723
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
cellmeans[1] 0.23 0.23 0.01 0.01 0.21 0.25 1.00 1887 2322
cellmeans[2] 0.23 0.23 0.01 0.01 0.21 0.25 1.00 1906 2203
cellmeans[3] 0.23 0.23 0.01 0.01 0.21 0.25 1.00 1959 2377
cellmeans[4] 0.24 0.24 0.01 0.01 0.22 0.26 1.00 1839 2060
cellmeans[5] 0.25 0.25 0.01 0.01 0.23 0.27 1.00 1933 2278
cellmeans[6] 0.26 0.26 0.01 0.01 0.24 0.28 1.00 1860 2147
cellmeans[7] 0.25 0.25 0.01 0.01 0.23 0.28 1.00 1916 2408
cellmeans[8] 0.25 0.25 0.01 0.01 0.23 0.27 1.00 1898 2322
cellmeans[9] 0.22 0.22 0.01 0.01 0.20 0.24 1.00 1858 2493
cellmeans[10] 0.21 0.21 0.01 0.01 0.19 0.23 1.00 1862 2209
cellmeans[11] 0.22 0.22 0.01 0.01 0.20 0.24 1.00 1907 2227
cellmeans[12] 0.21 0.21 0.01 0.01 0.20 0.23 1.00 1978 2210
Show the code
gbm_2_sum <- readRDS(
file = paste0(data_path, "synthetic/gbm_2_sum.rds")
)
g1 <-
gbm_2_sum |>
ggplot() +
geom_line(aes(x = Year, y = median, color = "gbm")) +
geom_line(data = mod_simple_2,
aes(x = Year, y = Mean, colour = "simple data mean"), linetype = "dashed") +
geom_line(data = mod_simple_2,
aes(x = Year, y = Median, colour = "simple data median"), linetype = "dashed") +
theme_bw()
g2 <-
gbm_2_sum |>
ggplot() +
geom_line(aes(x = Year, y = median, color = "gbm")) +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Mean, colour = "all mean"), linetype = "dashed") +
geom_line(data = benthos_reefs_temporal_summary,
aes(x = Year, y = Median, colour = "all median"), linetype = "dashed") +
geom_line(data = all_sampled_sum,
aes(x = Year, y = response, colour = type), linetype = "dashed") +
theme_bw()
ggsave(
filename = paste0(
fig_path, "R_pdp_mod_gbm_2.png"
),
g1 + g2,
width = 12, height = 6, dpi = 72
)
- left figure compares gbm modelled estimates to the sample data
- right figure compares gbm modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- gbm modelled estimates deviate substantially from the full 25 site dataset, particularly after year 9
- gbm modelled estimates are consistent with the observed 5 site replacement data - e.g. adversely impacted by the replacement of a poor reef with a good reef.
- gbm derived predictions lack uncertainty estimates
Show the code
mod_gbm_2b <- gbm(cover ~ fYear + Latitude + Longitude + CYC + DHW + OTHER,
data = benthos_fixed_locs_obs_2,
distribution = "gaussian",
var.monotone = c(0, 0, 0, -1, -1, -1),
n.trees = 10000,
interaction.depth = 5,
shrinkage = 0.001,
bag.fraction = 0.5,
cv.folds = 5,
verbose = TRUE
)
saveRDS(mod_gbm_2b,
file = paste0(data_path, "synthetic/mod_gbm_2b.rds")
)
Show the code
newdata_2b <- benthos_fixed_locs_obs_2
gbm_2b_sum <- newdata_2b |>
mutate(median = predict(mod_gbm_2b, newdata_2b, n.trees = n.trees, type = "response")) |>
mutate(Year = as.numeric(as.character(fYear))) |>
## group_by(Year, Site, Transect) |>
## summarise(median = median(median)) |>
## group_by(Year, Site, .add = FALSE) |>
## summarise(median = median(median)) |>
group_by(Year, .add = FALSE) |>
summarise(median = median(median)) |>
mutate(type = "gbm")
saveRDS(gbm_2b_sum,
file = paste0(data_path, "synthetic/gbm_2b_sum.rds")
)
- left figure compares gbm modelled estimates to the sample data
- right figure compares gbm modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- gbm modelled estimates deviate substantially from the full 25 site dataset, particularly after year 9
- gbm modelled estimates are consistent with the observed 5 site replacement data - e.g. adversely impacted by the replacement of a poor reef with a good reef.
- gbm derived predictions lack uncertainty estimates
- relative influences mainly space/time
Show the code
dbarts_2c_sum <- preds_2c |>
t() |>
as_tibble(.name_repair = ~ paste0("V", seq_along(.))) |>
bind_cols(benthos_reefs_sf) |>
pivot_longer(
cols = matches("^V[0-9]*"),
names_to = ".draw", values_to = "fit"
) |>
group_by(Year, .draw) |>
summarise(fit = mean(fit)) |>
## dplyr::select(Year, fit, .draw) |>
group_by(Year, .add = FALSE) |>
summarise_draws(median, HDInterval::hdi)
saveRDS(dbarts_2c_sum,
file = paste0(data_path, "synthetic/dbarts_2c_sum.rds")
)
- left figure compares gbm modelled estimates to the sample data
- right figure compares gbm modelled estimates to the simple hierarchical sampled means/medians and the true means/medians
- sampled reefs (blue and green dashed lines) are lower than the true global average (orange and khaki) - right side figure
- gbm modelled estimates are more consistent with the full 25 site dataset, e.g., they are less impacted by the replacement of a poor reef with a good reef.
- gbm modelled estimates deviate from the observed 5 site replacement data after year 9.
- gbm derived predictions lack uncertainty estimates
- relative influences mainly space/time
Show the code
dbarts_2b_sum <- preds |>
t() |>
as_tibble(.name_repair = ~ paste0("V", seq_along(.))) |>
bind_cols(benthos_fixed_locs_obs_2) |>
pivot_longer(
cols = matches("^V[0-9]*"),
names_to = ".draw", values_to = "fit"
) |>
group_by(Year, .draw) |>
summarise(fit = mean(fit)) |>
## dplyr::select(Year, fit, .draw) |>
group_by(Year, .add = FALSE) |>
summarise_draws(median, HDInterval::hdi)
saveRDS(dbarts_2b_sum,
file = paste0(data_path, "synthetic/dbarts_2b_sum.rds")
)