Performs model validation to estimate a PAI model's predictive performance using k-fold cross-validation or design-based probability sampling.
Usage
assess_pai_model(
  gcp_data,
  pai_method,
  validation_type = "random",
  k_folds = 5,
  train_split_ratio = 0.8,
  n_strata = 4,
  seed = 123,
  ...
)
Arguments
- gcp_data
An sf object of homologous points, from read_gcps().
- pai_method
A character string specifying the algorithm to assess. One of: "helmert", "tps", "gam", "lm", "rf", "svmRadial", and "svmLinear".
- validation_type
A character string specifying the validation strategy. One of "random", "spatial", "probability", or "stratified".
- k_folds
An integer giving the number of folds for cross-validation. Only used when validation_type is "random" or "spatial". Defaults to 5.
- train_split_ratio
A numeric value between 0 and 1. The proportion of data for the training set. Used for "probability" and "stratified" types. Defaults to 0.8.
- n_strata
An integer specifying the number of strata to create for stratified sampling. Only used for validation_type = "stratified". Defaults to 4 (quartiles).
- seed
An integer for setting the random seed for reproducibility.
- ...
Additional arguments passed to the train_pai_model function.
Details
Model validation is crucial for understanding how well a model will generalize to new data. This function automates this process.
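To illustrate what automating validation involves, here is a minimal base R sketch of random k-fold cross-validation. The synthetic data, the lm() model, and the RMSE metric are assumptions chosen for illustration; they are not taken from the internals of assess_pai_model().

```r
# Minimal sketch of random k-fold cross-validation in base R.
# Data, model (lm) and metric (RMSE) are illustrative assumptions only.
set.seed(123)

# Synthetic points whose displacement dx depends linearly on position
n <- 100
pts <- data.frame(x = runif(n), y = runif(n))
pts$dx <- 2 * pts$x - pts$y + rnorm(n, sd = 0.1)

k_folds <- 5
# Assign every row to one of k folds of equal size, in random order
fold_id <- sample(rep(seq_len(k_folds), length.out = n))

# For each fold: fit on the other folds, predict the held-out fold
fold_rmse <- vapply(seq_len(k_folds), function(k) {
  train <- pts[fold_id != k, ]
  test  <- pts[fold_id == k, ]
  fit   <- lm(dx ~ x + y, data = train)
  pred  <- predict(fit, newdata = test)
  sqrt(mean((test$dx - pred)^2))
}, numeric(1))

# The cross-validated error is the average over the held-out folds
cv_rmse <- mean(fold_rmse)
```

Every point is held out exactly once, so the averaged error reflects performance on data the model did not see during fitting.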
Validation Types:
- random: Standard k-fold cross-validation.
- spatial: Spatial k-fold cross-validation.
- probability: Design-based validation using a single train/test split based on simple random sampling.
- stratified: Design-based validation using stratified random sampling. A single train/test split is performed. Strata are created based on the quantiles of the Euclidean distance of the error vectors (dx, dy), ensuring the validation set represents all error magnitudes proportionally.
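The stratified strategy described above can be sketched in base R as follows. The helper stratified_split() is hypothetical and not part of the package. The rounding of the per-stratum training size is likewise an assumption; the actual implementation may differ.

```r
# Sketch of a stratified train/test split on error magnitude (base R only).
# stratified_split() is a hypothetical helper, not a package function.
stratified_split <- function(dx, dy, n_strata = 4, train_split_ratio = 0.8,
                             seed = 123) {
  set.seed(seed)
  # Euclidean length of each error vector (dx, dy)
  magnitude <- sqrt(dx^2 + dy^2)
  # Quantile breakpoints define the strata (quartiles when n_strata = 4)
  breaks <- quantile(magnitude, probs = seq(0, 1, length.out = n_strata + 1))
  stratum <- cut(magnitude, breaks = unique(breaks), include.lowest = TRUE,
                 labels = FALSE)
  # Draw the same proportion of rows from every stratum for training
  train_idx <- unlist(lapply(split(seq_along(magnitude), stratum), function(i) {
    i[sample.int(length(i), size = round(train_split_ratio * length(i)))]
  }))
  data.frame(
    magnitude = magnitude,
    stratum   = stratum,
    is_train  = seq_along(magnitude) %in% train_idx
  )
}

# Synthetic error vectors, for illustration only
set.seed(1)
dx <- rnorm(200)
dy <- rnorm(200)
split_df <- stratified_split(dx, dy, n_strata = 4, train_split_ratio = 0.8)

# Share of training points within each stratum
train_share <- tapply(split_df$is_train, split_df$stratum, mean)
```

Because sampling happens within each stratum, small and large error magnitudes appear in the held-out set in the same proportion as in the full data set.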
Examples
if (FALSE) { # \dontrun{
  # --- 1. Create a demo data set ---
  demo_files <- create_demo_data(seed = 42)
  gcp_data <- read_gcps(gcp_path = demo_files$gcp_path)

  # --- 2. Assess with RANDOM k-fold CV ---
  random_assessment <- assess_pai_model(
    gcp_data, pai_method = "rf", validation_type = "random", k_folds = 5
  )
  print(random_assessment)

  # --- 3. Assess with SPATIAL k-fold CV ---
  spatial_assessment <- assess_pai_model(
    gcp_data, pai_method = "rf", validation_type = "spatial", k_folds = 5
  )
  print(spatial_assessment)

  # --- 4. Assess with PROBABILITY (simple random) sampling ---
  prob_assessment <- assess_pai_model(
    gcp_data,
    pai_method = "rf",
    validation_type = "probability",
    train_split_ratio = 0.75
  )
  print(prob_assessment)

  # --- 5. Assess with STRATIFIED probability sampling ---
  stratified_assessment <- assess_pai_model(
    gcp_data,
    pai_method = "rf",
    validation_type = "stratified",
    train_split_ratio = 0.75,
    n_strata = 4 # Use quartiles for stratification
  )
  print(stratified_assessment)
} # }