Usage

Here I show an example of the geo-rectification process using a photograph of NIES’ long-term monitoring taken at Tateyama Murodo-Sanso, Toyama prefecture, Japan.

# Loading requirements
from alproj.surface import get_colored_surface
from alproj.project import sim_image, reverse_proj, to_geotiff
from alproj.gcp import image_match, set_gcp, filter_gcp_distance
from alproj.optimize import CMAOptimizer, LsqOptimizer
import rasterio
import cv2

Data Preparation

You should prepare below before starting.

  • The target landscape photograph.

  • An orthorectificated airborne photograph.

  • A Digital Surface Model.

Both the airborne photograph and the DSM must cover hole the area where the target photograph covers. Both of them must be in the same planar Coordinate Reference System, e.g. Universal Transverse Mercator Coordinate System (UTM).

Loading Raster Data

First, load the airborne photograph and the DSM using rasterio.

res = 1.0 # Resolution in m
aerial = rasterio.open("airborne.tif")
dsm = rasterio.open("dsm.tif")

Define Initial Camera Parameters

Setting initial camera parameters for optimization.

  • x, y, z: A shooting point coordinate in the CRS of the aerial photograph / DSM. These can also be optimized to correct GPS errors.

  • fov: A Field of View in degree.

  • pan, tilt, roll: A set of Euler angles of the camera in degree.

  • a1 ~ s4: Distortion coefficients. See Algorithm for detail.

  • w, h: The width and height of the target image in pixel.

  • cx, cy: A coordinate of the principal point in pixel.

params = {"x":732731,"y":4051171, "z":2458, "fov":70, "pan":100, "tilt":0, "roll":0,\
          "a1":1, "a2":1, "k1":0, "k2":0, "k3":0, "k4":0, "k5":0, "k6":0, \
          "p1":0, "p2":0, "s1":0, "s2":0, "s3":0, "s4":0, \
          "w":5616, "h":3744, "cx":5616/2, "cy":3744/2}

Rendering a Simulated Landscape Image

To find a set of Ground Control Points, render a simulated landscape image with the aerial photograph, DSM, and the initial camera parameters.

First, get colored surface from aerial photograph and DSM.

distance = 3000 # Distance from shooting point in meters

vert, col, ind, offsets = get_colored_surface(aerial, dsm, shooting_point=params, distance=distance, res=res) # This takes some minutes.

Then you’ll get four np.arrays looks like below.

  • vert

    Vertex coordinates of each point relative to offsets. In x, z, y order.

    >>> vert
    array([[3.00000000e+03, 3.84212890e+02, 4.65500000e+03],
         [3.00100000e+03, 3.82377200e+02, 4.65500000e+03],
         [2.99900000e+03, 3.84474370e+02, 4.65400000e+03],
         ...,
         [2.23200000e+03, 9.98540000e-01, 0.00000000e+00],
         [2.23300000e+03, 5.04880000e-01, 0.00000000e+00],
         [2.23400000e+03, 0.00000000e+00, 0.00000000e+00]])
    
  • col

    Vertex colors in 0 to 1.

    >>> col
    array([[0.37647059, 0.35686275, 0.32156863],
         [0.36078431, 0.33333333, 0.30980392],
         [0.42352941, 0.40392157, 0.36078431],
         ...,
         [0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        ]])
    
  • ind

    The index shows which three points form a triangle.

    >>> ind
    array([[      0,       3,       4],
         [      0,       4,       1],
         [      1,       4,       5],
         ...,
         [7877844, 7878551, 7877845],
         [7877845, 7878551, 7878552],
         [7877845, 7878552, 7877846]], dtype=int64)
    
  • offsets

    Offset values for vertex coordinates. You need to pass this to sim_image and reverse_proj.

    >>> offsets
    array([7.31942032e+05, 2.15609204e+03, 4.04854197e+06])
    

Next, render a simulated landscape image. You can optionally use min_distance to mask pixels closer than a specified distance (useful for preventing mismatches with near-field objects).

sim = sim_image(vert, col, ind, params, offsets, min_distance=100)  # mask closer than 100m
cv2.imwrite("sim_initial.png", sim)

Every pixel in this image has geographic coordinates. Then you can get a table of image coordinates and geographic coordinates of it.

df = reverse_proj(sim, vert, ind, params, offsets)
>>> df
             u     v            x           y            z      B      G      R
2058832   3376   366  734200.3125  4050691.75  2988.827881  116.0  120.0  124.0
2058833   3377   366  734199.6875  4050691.75  2988.624268  106.0  110.0  113.0
2058834   3378   366  734198.7500  4050691.25  2988.337402   82.0   86.0   88.0
2058835   3379   366  734198.0000  4050691.25  2988.081543   70.0   75.0   78.0
2058836   3380   366  734197.3750  4050691.25  2987.862061   60.0   65.0   68.0
...        ...   ...          ...         ...          ...    ...    ...    ...
21026299  5611  3743  732740.3125  4051161.75  2453.355469  113.0  117.0  148.0
21026300  5612  3743  732740.3125  4051161.75  2453.355469  113.0  117.0  148.0
21026301  5613  3743  732740.3125  4051161.75  2453.355713  113.0  117.0  148.0
21026302  5614  3743  732740.3125  4051161.75  2453.355713  113.0  117.0  148.0
21026303  5615  3743  732740.3125  4051161.75  2453.355713  113.0  117.0  148.0

[17336750 rows x 8 columns]

Finding Ground Contorol Points

Then, you can add some Ground Contorol Points (GCPs) in the target image by matching target image and simulated image.

path_org = "target_image.jpg"
path_sim = "sim_initial.png"

Comparison of Matching Methods

The table below shows the comparison of all available methods on a 5616x3744 pixel image pair (CPU on M4 Pro MBP).

Built-in methods (no extra dependencies):

Method

Time

Matches

Notes

akaze

~1 sec

176

Fastest, fewer matches

sift

~4 sec

331

Good balance for simple cases

LightGlue-based methods (lightweight, handles full resolution):

Method

Time

Matches

Notes

sift-lightglue

~3 sec

518

SIFT features + LightGlue matcher

superpoint-lightglue

~4 sec

962

Recommended for most cases

aliked-lightglue

~6 sec

815

ALIKED features + LightGlue matcher

minima-superpoint-lightglue

~29 sec

973

MiniMa preprocessing + SuperPoint

Semi-dense methods:

Method

Time

Matches

Notes

xfeat-star

~14 sec

112

XFeat semi-dense matching

Dense matching methods (high match count, auto-resized to 640px):

Method

Time

Matches

Notes

tiny-roma

~2 sec

2047

Fast dense matching

loftr

~2 sec

2198

Good for low-texture regions

minima-loftr

~3 sec

2629

MiniMa + LoFTR

ufm

~10 sec

2048

UFM matching

rdd

~14 sec

1494

RDD matching

roma

~23 sec

2048

RoMa dense matching

master

~327 sec

3515

MASTER matching (large model)

minima-roma

GPU only

-

Best quality, large model requires CUDA

minima-roma-tiny

~2 sec (est.)

-

CPU-compatible variant of MiniMa-RoMa

Note: Dense matching methods (non-LightGlue) automatically resize images to 640px when resize is not specified to prevent out-of-memory errors. Keypoints are scaled back to original coordinates.

You can reproduce this comparison using the script scripts/compare_matching_methods.py.

SIFT result:

SuperPoint-LightGlue result:

MiniMa-RoMa result:

For most cases, SuperPoint-LightGlue provides a good balance between speed and robustness. Use SIFT when you cannot install additional dependencies, or MiniMa-RoMa when you need the highest matching quality for difficult image pairs.

Outlier Filtering

By default, image_match() applies Fundamental Matrix filtering (outlier_filter="fundamental"). You can also use Essential Matrix filtering when camera parameters are available:

# Essential Matrix filtering (recommended when params with fov is available)
match, plot = image_match(
    path_org, path_sim,
    method="minima-roma",
    outlier_filter="essential",  # "essential", "fundamental", or "none"
    params=params,               # camera params with fov, w, h (focal length computed automatically)
    threshold=10.0,              # MAGSAC threshold in pixels
    plot_result=True
)

Outlier filtering methods:

  • "fundamental": Fundamental Matrix with MAGSAC++ (default, no camera intrinsics required)

  • "essential": Essential Matrix with MAGSAC++ (recommended when params with fov is provided)

  • "none": No filtering. Use this when you plan to apply custom filtering later.

Spatial Thinning

When using dense matching methods like MiniMa-RoMa, matches may cluster in certain regions of the image. Use spatial thinning to ensure uniform distribution:

# Match with spatial thinning (keeps at most 1 point per 100x100 pixel region)
match, plot = image_match(
    path_org, path_sim,
    method="minima-roma",
    spatial_thin_grid=50,           # Grid cell size in pixels
    spatial_thin_selection="center", # "first", "random", or "center"
    device="cuda",
    plot_result=True
)

Selection methods:

  • "first": Keeps the first point by input order (fastest, deterministic)

  • "random": Random selection (use spatial_thin_random_state for reproducibility)

  • "center": Keeps the point closest to the cell center (best for uniform distribution)

Spatial thinning is applied AFTER geometric outlier filtering, so it samples from inliers only.

Distance-based GCP Filtering

After creating GCPs with set_gcp(), you can filter them based on 3D distance from the camera:

from alproj.gcp import image_match, set_gcp, filter_gcp_distance

# Create GCPs from matches
gcps = set_gcp(match, df)

# Filter: exclude points closer than 100m or farther than 2000m
gcps = filter_gcp_distance(gcps, params, min_distance=100, max_distance=2000)

This is useful for:

  • Excluding nearby foreground objects (min_distance)

  • Excluding distant features with poor depth accuracy (max_distance)

Important: Coordinates must be in a projected CRS (e.g., UTM) for accurate Euclidean distance. Using lat/lon directly will produce incorrect results.

All Available Methods

Built-in (no extra dependencies):

  • akaze: AKAZE local features

  • sift: SIFT local features

With vismatch package (pip install alproj[vismatch]):

Any method supported by vismatch (70+ models) can be used. Some popular choices:

  • sift-lightglue, superpoint-lightglue, disk-lightglue, aliked-lightglue, dedode-lightglue: LightGlue-based (lightweight, handles full resolution)

  • xfeat, xfeat-lightglue, xfeat-star: XFeat variants (fast and lightweight)

  • roma, romav2, tiny-roma, minima-roma: RoMa variants (dense matching)

  • loftr, eloftr, minima-loftr: LoFTR variants (good for low-texture regions)

  • ufm, rdd, master, duster, edm: Other dense matching methods

See the vismatch README for the full list.

>>> gcps
         u     v              x             y            z
0     1468  2751  733134.120287  4.051270e+06  2367.014130
1     2362  1271  733733.878100  4.051172e+06  2613.651184
2     5116  2709  732801.371752  4.051120e+06  2441.953796
3      644  2253  733323.715014  4.051434e+06  2401.383240
4     3944   953  733846.427660  4.050684e+06  2738.904907
...    ...   ...            ...           ...          ...
1122  2850  1748  733444.243822  4.051082e+06  2480.453796
1123  3950  1132  733707.379809  4.050744e+06  2655.842163
1124  1043  1969  733419.895678  4.051407e+06  2440.836182
1125   750  1765  733541.844896  4.051506e+06  2479.430786
1126  3454   824  733863.009447  4.050836e+06  2778.318420

[1127 rows x 5 columns]

Where u and v stands for the x and y axis coordinates in the image coordinate system.

Optimization of Camera Parameters

Finally, optimizing camera parameters using GCPs. Camera parameters are optimized by minimizing reproection errors. You can specify which parameters to be optimized, including camera position (x, y, z).

Two optimizers are available:

LsqOptimizer (Fast Local Refinement)

LsqOptimizer uses scipy’s least_squares with Trust Region Reflective algorithm. Much faster than CMA-ES but requires a good initial guess. Supports robust loss functions.

lsq_optimizer = LsqOptimizer(obj_points, img_points, params)
lsq_optimizer.set_target(["k1", "k2", "k3", "k4", "k5", "k6", "p1", "p2", "s1", "s2", "s3", "s4"])
params_optim, error = lsq_optimizer.optimize(
    method="trf",        # "trf", "dogbox", or "lm"
    loss="huber",        # "linear", "huber", "soft_l1", "cauchy", "arctan"
    f_scale=10.0,        # threshold for robust loss functions
    max_nfev=1000        # maximum number of function evaluations
)

Key parameters:

  • method: Algorithm to use (“trf” recommended, “lm” does not support bounds/robust loss)

  • loss: Loss function (“huber” and “soft_l1” are robust to outliers)

  • f_scale: Soft threshold for residuals (larger = more tolerance to outliers)

The optimized camera parameters reproduces the target image exactly.
```python
sim_optim = sim_image(vert, col, ind, params_optim, offsets)
cv2.imwrite("sim_optimized.png", sim_optim)

Reverse Projection

Now you can get geographic coordinates of each pixel of the target image.

original = cv2.imread("target_image.jpg")
georectified = reverse_proj(original, vert, ind, params_optim, offsets)
>>> georectified
             u     v              x             y            z      B      G      R
3030570   3546   539  734196.643725  4.050693e+06  2987.922119  193.0  153.0  128.0
3030571   3547   539  734195.270678  4.050693e+06  2987.445068  195.0  155.0  130.0
3030572   3548   539  734193.905932  4.050693e+06  2986.971313  192.0  154.0  124.0
3030573   3549   539  734192.899340  4.050693e+06  2986.625610  187.0  149.0  119.0
3030574   3550   539  734192.235033  4.050693e+06  2986.404175  186.0  149.0  115.0
...        ...   ...            ...           ...          ...    ...    ...    ...
20655647  5615  3677  732743.775072  4.051166e+06  2452.745117  130.0  174.0  191.0
20661260  5612  3678  732743.765063  4.051166e+06  2452.746643  115.0  161.0  179.0
20661261  5613  3678  732743.764086  4.051166e+06  2452.746887  115.0  161.0  179.0
20661262  5614  3678  732743.764086  4.051166e+06  2452.746887  119.0  165.0  182.0
20661263  5615  3678  732743.762987  4.051166e+06  2452.747192  124.0  170.0  187.0

[15678803 rows x 8 columns]

Exporting to GeoTIFF

You can convert the reverse projection output directly to a GeoTIFF using the built-in to_geotiff() function:

from alproj.project import to_geotiff

# Convert to GeoTIFF with automatic rasterization and interpolation
to_geotiff(
    georectified,
    "output.tif",
    resolution=1.0,           # Pixel resolution in coordinate units (e.g., meters)
    crs="EPSG:6690",          # Coordinate Reference System
    bands=["R", "G", "B"],    # Which columns to use as bands
    interpolate=True,         # Fill small gaps using focal statistics
    max_dist=1.0,             # Maximum interpolation distance
    agg_func="mean",          # Aggregation function: "mean", "median", "max", "min"
    nodata=255                # NoData value for missing pixels
)

You can also export to CSV for use with other tools:

georectified.to_csv("georectified.csv", index=False)

Result Plot