Getting Started with mapAI: A Complete Workflow
demo_data.Rmd
This vignette provides a complete, step-by-step workflow for using
the mapAI
package. For users new to the package, this is
the best place to start.
We will begin by generating a synthetic dataset that mimics a distorted historical map. This allows us to accurately measure how much our model improves the map’s accuracy. The workflow consists of two main parts:
- Positional Correction: A basic workflow to correct the geometry of a distorted map.
- Advanced Distortion Analysis: A deeper dive into quantifying and visualizing the nature of the distortion itself using techniques based on Tissot’s indicatrix theory.
# Load the necessary libraries
library(mapAI)
library(ggplot2)
library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.4.0; sf_use_s2() is TRUE
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(magrittr) # For the %>% pipe operator
Part 1: Positional Correction Workflow
Step 1: Generate and Load Synthetic Data
First, we use create_demo_data()
to create our test
case. We’ll choose the "complex"
distortion type as it
represents a challenging combination of errors. We then load these files
into R using the package’s reading functions.
# Create the shapefile and GCPs CSV in a temporary directory
demo_files <- create_demo_data(type = "complex", seed = 123)
#> -> Homologous points saved to: /tmp/RtmpqUIQXg/demo_gcps.csv
#> -> Distorted map saved to: /tmp/RtmpqUIQXg/demo_map.shp
# Load the GCPs (homologous points) from the demo file
gcp_data <- read_gcps(gcp_path = demo_files$gcp_path, crs = 3857)
# Load the vector map that needs correction from the demo file
map_to_correct <- read_map(shp_path = demo_files$shp_path, crs = 3857)
Step 2: Assess Initial Error and Train a Model
We calculate the initial 2D Root Mean Squared Error (RMSE) to get a
baseline for the positional error and then we use cross validation to
estimate the out-of-sample error of a Generalized Additive Model
(gam
). GAMs are well-suited for capturing the
smooth, non-linear distortions in our synthetic data.
# Calculate the initial error before correction
identity_rmse <- sqrt(mean(gcp_data$dx^2 + gcp_data$dy^2))
print(paste("Initial (Identity) 2D RMSE:", round(identity_rmse, 4)))
#> [1] "Initial (Identity) 2D RMSE: 5.3465"
# use cross validation to assess a GAM's model accuracy
cv_res <- assess_pai_model(gcp_data,"gam")
#> Starting 5 -fold random cross-validation for method: gam on 225 complete points.
#> Creating random folds...
#> Processing Fold 1 of 5 ...
#> Training 'gam' model...
#> Processing Fold 2 of 5 ...
#> Training 'gam' model...
#> Processing Fold 3 of 5 ...
#> Training 'gam' model...
#> Processing Fold 4 of 5 ...
#> Training 'gam' model...
#> Processing Fold 5 of 5 ...
#> Training 'gam' model...
#> Assessment complete.
print(cv_res)
#> Method ValidationType Mean_RMSE_2D SD_RMSE_2D
#> 1 gam random 0.7495636 0.06099847
# Train the GAM model using our GCP data
pai_model_gam <- train_pai_model(gcp_data, method = "gam")
#> Training 'gam' model...
Step 3: Apply Correction and Visualize the Result
We apply our trained model to the map grid and then visualize the result by overlaying the corrected grid on the original.
# Apply the trained model to the full map
corrected_map <- apply_pai_model(pai_model = pai_model_gam, map = map_to_correct)
#> Applying PAI model to map features...
#> Correction complete.
# For easy plotting, add a "status" column and combine the maps
map_to_correct$status <- "Original (Distorted)"
corrected_map$status <- "Corrected"
comparison_data <- rbind(map_to_correct[, "status"], corrected_map[, "status"])
# Create the final comparison plot
ggplot(comparison_data) +
geom_sf(aes(color = status, linetype = status), fill = NA, linewidth = 0.7) +
scale_color_manual(values = c("Original (Distorted)" = "grey50", "Corrected" = "red")) +
scale_linetype_manual(values = c("Original (Distorted)" = "dashed", "Corrected" = "solid")) +
labs(
title = "Map Correction Comparison",
subtitle = "Overlay of the original and corrected map grids"
) +
theme_minimal()
The plot clearly shows how the warped grid has been adjusted back to a
regular shape, demonstrating a successful correction.
Part 2: Advanced Distortion Analysis
Beyond simple correction, mapAI
provides powerful tools
to quantify and visualize the distortion itself. This is useful for
understanding the history of a map’s creation or identifying areas of
high uncertainty.
Step 4: Run the Distortion Analysis
We first create a regular grid of points for our
analysis. Then, we use the analyze_distortion()
function
with our trained gam_model
to calculate the detailed
distortion metrics at each point.
# Create a regular grid of points for the analysis
analysis_points <- sf::st_make_grid(gcp_data, n = c(25, 25)) %>%
sf::st_centroid() %>%
sf::st_sf()
# Run the analysis
distortion_results <- analyze_distortion(pai_model_gam, analysis_points)
# Glimpse the new columns added to our data
glimpse(distortion_results)
#> Rows: 625
#> Columns: 9
#> $ geometry <POINT [m]> POINT (1.987481 -1.918547), POINT (6.0986…
#> $ a <dbl> 1.0201617, 1.0132633, 1.0082489, 1.0056066, 1.0…
#> $ b <dbl> 0.9378401, 0.9356228, 0.9331371, 0.9299045, 0.9…
#> $ area_scale <dbl> 0.9567486, 0.9480322, 0.9408344, 0.9351181, 0.9…
#> $ log2_area_scale <dbl> -0.06378823, -0.07699201, -0.08798727, -0.09677…
#> $ max_shear <dbl> 2.409636, 2.283177, 2.217313, 2.241536, 2.30422…
#> $ max_angular_distortion <dbl> 0.08411216, 0.07969792, 0.07739883, 0.07824438,…
#> $ airy_kavrayskiy <dbl> 0.002258491, 0.002300784, 0.002428281, 0.002656…
#> $ theta_a <dbl> -0.01012249, -4.08984269, -8.30682561, -11.7028…
Step 5: Visualize Distortion Metrics
Now we can use plot_distortion_surface()
to visualize
these metrics.
Areal Distortion (log2σ) and Angular Distortion (2Ω)
These are two of the most fundamental distortion metrics.
log2_area_scale
shows where the map has been stretched or
shrunk, while max_angular_distortion_rad
shows where angles
have been deformed (shear).
# Plot for log2(sigma) - areal distortion
plot_area <- plot_distortion_surface(
distortion_results,
metric = "log2_area_scale",
diverging = TRUE
) +
labs(title = "Areal Distortion (log2σ)")
#> Regular grid detected. Creating a surface plot with geom_raster().
# Plot for 2*Omega - angular distortion
plot_shear <- plot_distortion_surface(
distortion_results,
metric = "max_angular_distortion",
palette = "magma"
) +
labs(title = "Angular Distortion (2Ω)")
#> Regular grid detected. Creating a surface plot with geom_raster().
plot_shear
#> Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted
#> ℹ Consider using `geom_tile()` instead.
#> Raster pixels are placed at uneven horizontal intervals and will be shifted
#> ℹ Consider using `geom_tile()` instead.
plot_area
#> Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted
#> ℹ Consider using `geom_tile()` instead.
#> Raster pixels are placed at uneven horizontal intervals and will be shifted
#> ℹ Consider using `geom_tile()` instead.
Airy-Kavrayskiy Measure
This is a combined measure of distortion that balances both areal and
angular errors. We can calculate it from the a
and
b
values produced by analyze_distortion
.
# Plot the Airy-Kavrayskiy metric
plot_distortion_surface(
distortion_results,
metric = "airy_kavrayskiy",
palette = "cividis"
) +
labs(title = "Airy-Kavrayskiy Combined Distortion Measure")
#> Regular grid detected. Creating a surface plot with geom_raster().
#> Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted
#> ℹ Consider using `geom_tile()` instead.
#> Raster pixels are placed at uneven horizontal intervals and will be shifted
#> ℹ Consider using `geom_tile()` instead.
Tissot’s Indicatrices
Finally, plot_indicatrices
provides the richest
visualization, showing the distortion ellipses themselves. The shape,
size, and orientation of each ellipse represent the local distortion
pattern.
# Plot the indicatrices at the locations of the analysis points
# We use a large scale_factor to make the ellipses clearly visible.
plot_indicatrices(
distortion_sf = distortion_results,
scale_factor = 1,
fill_color = "lightblue",
border_color = "black"
)
#> Generating indicatrix polygons at source locations...
This advanced analysis demonstrates how
mapAI
can be used
not only to fix historical maps, but to deeply understand their
geometric properties.