Takes a data frame of points, containing point geometries and additional_variable_cols values, and clusters the points using the skater algorithm. Clustering is performed top-down by pruning edges of a minimum spanning tree according to the cost of removing each edge. Only points connected by edges can be clustered together. For additional information see insert citation/link.
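The clustering step described above can be sketched directly with spdep. This is a minimal illustration, not the internals of reef_skater(): the toy data, the Delaunay neighbour graph built with tri2nb(), and the choice of 4 clusters are all assumptions made for the example.

```r
library(spdep)

set.seed(1)
# Hypothetical standardised points: coordinates plus one additional variable
pts <- data.frame(
  X_standard = runif(60),
  Y_standard = runif(60),
  depth_standard = rnorm(60)
)
coords <- as.matrix(pts[, c("X_standard", "Y_standard")])

# Neighbour graph (assumption: Delaunay triangulation, which is connected)
nb <- tri2nb(coords)

# Edge costs: dissimilarity between connected points over all variables
costs <- nbcosts(nb, pts)
lw <- nb2listw(nb, glist = costs, style = "B")

# Minimum spanning tree, then prune it top-down into n_clust groups
mst <- mstree(lw)
n_clust <- 4
res <- skater(edges = mst[, 1:2], data = pts, ncuts = n_clust - 1)

# Cluster sizes
table(res$groups)
```

Because skater() only cuts edges of the spanning tree, two points can end up in the same cluster only if they are linked through the neighbour graph.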

Usage

reef_skater(
  points,
  n_clust = round(min(10000, nrow(points))/200),
  site_size = 250 * 250,
  x_col = "X_standard",
  y_col = "Y_standard",
  habitat_col = "habitat",
  id_col = "UNIQUE_ID",
  additional_variable_cols = c("depth_standard"),
  parallelisation = "Windows",
  hex_resolution = 12
)

Arguments

points

data.frame. Contains values for X and Y coordinates, as well as additional_variable_cols.

n_clust

integer numeric. Number of clusters in the output (the point at which the hierarchical clustering tree is cut). Default = round(min(10000, nrow(points)) / 200), which divides the habitat into clusters containing an average of 200 points, with the point count capped at 10,000 (at most 50 clusters).
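The default expression shown in Usage can be evaluated for a few input sizes to see its effect. default_n_clust() below is a hypothetical helper written for this illustration; it is not part of the package.

```r
# Hypothetical helper reproducing the default expression from the Usage block
default_n_clust <- function(n_points) round(min(10000, n_points) / 200)

default_n_clust(1000)   # 5
default_n_clust(10000)  # 50
default_n_clust(50000)  # 50 (point count is capped at 10,000)
default_n_clust(50)     # 0
```

For very small inputs the default rounds down to 0, so it may be safer to set n_clust explicitly in that case.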

site_size

numeric. Desired site size (area in m^2). Default = 62,500 (250 m x 250 m).

x_col

character. Name of the column holding X coordinates. Default = "X_standard".

y_col

character. Name of the column holding Y coordinates. Default = "Y_standard".

habitat_col

character. Column holding unique habitat values (attached to id_col value and site_id values on output). Default = "habitat".

id_col

character. Column holding ID value for the target reef (attached to the site_id values on output). Default = "UNIQUE_ID".

additional_variable_cols

character vector. Names of additional columns to contribute to the distance matrix. Default = c("depth_standard").

parallelisation

character. The only option currently supported is "Windows", which sets up a cluster from the parallel package using detectCores() - 2 cores and parallelises the prune cost calculations within spdep::skater(). If parallelisation is set to any other value, no parallelisation occurs. Default = "Windows".
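A cluster of this kind can be created with the parallel package roughly as follows. This is a sketch under assumptions: the fallback to one worker and the toy squared-value task are illustrative, and how reef_skater() hands the cluster to spdep (for example via spdep::set.ClusterOption()) is not confirmed by this page.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to one worker
n_cores <- detectCores()
n_workers <- if (is.na(n_cores)) 1L else max(1L, n_cores - 2L)

cl <- makeCluster(n_workers)
# Toy stand-in for a per-edge cost calculation spread across the workers
edge_costs <- parLapply(cl, 1:8, function(i) i^2)
stopCluster(cl)

unlist(edge_costs)
```

Leaving two cores free keeps the machine responsive while the workers run; always stop the cluster once the work is done.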

hex_resolution

integer numeric. H3 hexagon resolution used in point creation. Default = 12.

Value

data.frame of points with allocated site_ids based on cluster outputs. site_id values are a combination of the id_col value, habitat_col value and the cluster allocation.
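As an illustration of how such site_id values could be assembled, the sketch below pastes the three components together. The "_" separator, the component order, the column name cluster and the example values are assumptions; the actual format produced by reef_skater() may differ.

```r
# Hypothetical clustered output for a single reef
clustered <- data.frame(
  UNIQUE_ID = "reef01",
  habitat = c("slope", "slope", "flat"),
  cluster = c(1, 2, 1)
)

# Combine reef ID, habitat and cluster allocation into one identifier
clustered$site_id <- paste(
  clustered$UNIQUE_ID, clustered$habitat, clustered$cluster,
  sep = "_"
)

clustered$site_id
# "reef01_slope_1" "reef01_slope_2" "reef01_flat_1"
```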