Cluster points together using the adespatial::constr.hclust algorithm.
Source: R/constrained_hclust.R
constrained_hclust.Rd
Take a data frame of points containing additional_variable_cols
values and edges between points in the dataframe and cluster using the
constr.hclust algorithm. This clustering is performed bottom-up based on a
combined distance matrix of geographic and additional_variable_cols distances.
Only points connected by edges are able to cluster together. For additional
information see insert citation/link. If points contains more than 30,000
observations, interpolation is used to reduce RAM usage. This process
randomly samples 30,000 points for clustering and assigns clusters to the
remaining points using nearest neighbour interpolation.
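The sample-then-interpolate step described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the threshold is shrunk from 30,000 to a hypothetical `max_points` of 100, the sample is clustered with unconstrained `stats::hclust()` as a stand-in for `adespatial::constr.hclust()`, and the nearest neighbour search uses squared Euclidean distance on the coordinates, which is an assumption.

```r
# Illustrative sketch only; all object names here are hypothetical.
set.seed(1)
pts <- data.frame(X_standard = runif(500), Y_standard = runif(500))
max_points <- 100  # stands in for the 30,000-point threshold

# Randomly sample the points that will actually be clustered.
sample_idx <- sample(nrow(pts), max_points)
sampled <- pts[sample_idx, ]
remaining <- pts[-sample_idx, ]

# Stand-in clustering of the sample (unconstrained, unlike constr.hclust).
tree <- hclust(dist(sampled, method = "manhattan"), method = "ward.D2")
sampled_cluster <- cutree(tree, k = 5)

# For each remaining point, find the index of the closest sampled point.
nearest <- vapply(seq_len(nrow(remaining)), function(i) {
  d <- (sampled$X_standard - remaining$X_standard[i])^2 +
    (sampled$Y_standard - remaining$Y_standard[i])^2
  which.min(d)
}, integer(1))

# Sampled points keep their cluster; the rest inherit their neighbour's.
pts$cluster <- NA_integer_
pts$cluster[sample_idx] <- sampled_cluster
pts$cluster[-sample_idx] <- sampled_cluster[nearest]
```

Every point ends up with a cluster label, but only `max_points` rows ever enter the distance matrix, which is where the memory saving comes from.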
Usage
constrained_hclust(
points,
edges,
x_col = "X_standard",
y_col = "Y_standard",
additional_variable_cols = c("depth_standard"),
id_col = "UNIQUE_ID",
habitat_col = "habitat",
distance_method = "manhattan",
distance_alpha = 0.5,
beta = -1,
n_points = 204,
n_clust = (round(nrow(points)/n_points)),
method = "ward.D2"
)
Arguments
- points
data.frame. Contains values for X and Y coordinates, as well as additional_variable_cols.
- edges
matrix. Matrix containing edges between points in points.
- x_col
character. Name of the column holding X coordinates. Default = "X_standard".
- y_col
character. Name of the column holding Y coordinates. Default = "Y_standard".
- additional_variable_cols
character vector. Names of additional columns to contribute to the distance matrix. Default = c("depth_standard").
- id_col
character. Column holding ID value for the target reef (attached to the site_id values on output). Default = "UNIQUE_ID".
- habitat_col
character. Column holding unique habitat values (attached to id_col value and site_id values on output). Default = "habitat".
- distance_method
character. Distance matrix creation method. Default = "manhattan" (see dist()).
- distance_alpha
float numeric. Weighting applied to the additional_variable_cols distance matrix when combining it with geographic distances. A weighting of (1 - distance_alpha) is applied to the geographic distance matrix. Default = 0.5 (symmetric weighting).
- beta
float numeric. Beta parameter used by adespatial::constr.hclust. Parameter value is only used if method == "flexible". Default = -1.
- n_points
integer numeric. Desired number of points in resulting clusters. Used to calculate n_clust (number of output clusters). Value only used in n_clust specification. Default = 204.
- n_clust
integer numeric. Number of clusters in the output, i.e. the point at which the hierarchical clustering tree is cut. Default = (round(nrow(points) / n_points)) (dividing the habitat into clusters containing an average of approximately n_points points, 204 by default).
- method
character. Clustering method to be applied. See adespatial::constr.hclust() for more details. Default = "ward.D2".
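To make the interaction between distance_alpha, n_points and n_clust concrete, here is a hedged base R sketch. It assumes the two distance matrices are combined as a simple weighted sum with no further rescaling, which the text above suggests but does not confirm, and it uses unconstrained `stats::hclust()` as a stand-in for `adespatial::constr.hclust()`, so the edges constraint is not applied. The data and object names are made up for illustration.

```r
# Illustrative sketch only; not the package's implementation.
set.seed(42)
points <- data.frame(
  X_standard = runif(1020),
  Y_standard = runif(1020),
  depth_standard = runif(1020)
)
distance_alpha <- 0.5
n_points <- 204

# Separate distance matrices for coordinates and additional variables.
geo_dist <- dist(points[, c("X_standard", "Y_standard")], method = "manhattan")
add_dist <- dist(points[, "depth_standard", drop = FALSE], method = "manhattan")

# Assumed combination: alpha on the additional variables,
# (1 - alpha) on the geographic distances.
combined <- distance_alpha * add_dist + (1 - distance_alpha) * geo_dist

# Default number of clusters: 1020 / 204 = 5.
n_clust <- round(nrow(points) / n_points)

# Stand-in for the constrained clustering, cut at n_clust clusters.
tree <- hclust(combined, method = "ward.D2")
clusters <- cutree(tree, k = n_clust)
```

With distance_alpha = 0.5 both matrices contribute equally; values closer to 1 let the additional variables dominate, and values closer to 0 favour geographic proximity.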