# Usage
Here I show an example of the geo-rectification process using a photograph of [NIES' long-term monitoring](https://db.cger.nies.go.jp/gem/ja/mountain/station.html?id=2) taken at Tateyama Murodo-Sanso, Toyama prefecture, Japan.

```python
# Loading requirements
from alproj.surface import get_colored_surface
from alproj.project import sim_image, reverse_proj, to_geotiff
from alproj.gcp import image_match, set_gcp, filter_gcp_distance
from alproj.optimize import CMAOptimizer, LsqOptimizer
import rasterio
import cv2
```
## Data Preparation
You should prepare below before starting.
- The target landscape photograph.
![](_static/ttym_2016.jpg)
- An orthorectificated airborne photograph.
![](_static/airborne.png)
- A Digital Surface Model.
![](_static/dem.png)

Both the airborne photograph and the DSM must cover hole the area where the target photograph covers. 
Both of them must be in the same planar Coordinate Reference System, e.g. Universal Transverse Mercator Coordinate System (UTM).

## Loading Raster Data
First, load the airborne photograph and the DSM using rasterio.
```python
res = 1.0 # Resolution in m
aerial = rasterio.open("airborne.tif")
dsm = rasterio.open("dsm.tif")
```

## Define Initial Camera Parameters
Setting initial camera parameters for optimization.
- x, y, z: A shooting point coordinate in the CRS of the aerial photograph / DSM. These can also be optimized to correct GPS errors.
- fov: A Field of View in degree.
- pan, tilt, roll: A set of Euler angles of the camera in degree.
- a1 ~ s4: Distortion coefficients. See [Algorithm](https://alproj.readthedocs.io/en/latest/overview.html#algorithm) for detail.
- w, h: The width and height of the target image in pixel.
- cx, cy: A coordinate of the principal point in pixel.
 
```python
params = {"x":732731,"y":4051171, "z":2458, "fov":70, "pan":100, "tilt":0, "roll":0,\
          "a1":1, "a2":1, "k1":0, "k2":0, "k3":0, "k4":0, "k5":0, "k6":0, \
          "p1":0, "p2":0, "s1":0, "s2":0, "s3":0, "s4":0, \
          "w":5616, "h":3744, "cx":5616/2, "cy":3744/2}
```

## Rendering a Simulated Landscape Image
To find a set of Ground Control Points, render a simulated landscape image with the aerial photograph, DSM, and the initial camera parameters.

First, get colored surface from aerial photograph and DSM.
```python
distance = 3000 # Distance from shooting point in meters

vert, col, ind, offsets = get_colored_surface(aerial, dsm, shooting_point=params, distance=distance, res=res) # This takes some minutes.
```
Then you'll get four `np.array`s looks like below.
- vert

  Vertex coordinates of each point relative to offsets. In x, z, y order.
  ```
  >>> vert
  array([[3.00000000e+03, 3.84212890e+02, 4.65500000e+03],
       [3.00100000e+03, 3.82377200e+02, 4.65500000e+03],
       [2.99900000e+03, 3.84474370e+02, 4.65400000e+03],
       ...,
       [2.23200000e+03, 9.98540000e-01, 0.00000000e+00],
       [2.23300000e+03, 5.04880000e-01, 0.00000000e+00],
       [2.23400000e+03, 0.00000000e+00, 0.00000000e+00]])
  ```
- col  
  
  Vertex colors in 0 to 1.
  ```
  >>> col
  array([[0.37647059, 0.35686275, 0.32156863],
       [0.36078431, 0.33333333, 0.30980392],
       [0.42352941, 0.40392157, 0.36078431],
       ...,
       [0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        ]])
  ```
- ind
  
  The index shows which three points form a triangle.
  ```
  >>> ind
  array([[      0,       3,       4],
       [      0,       4,       1],
       [      1,       4,       5],
       ...,
       [7877844, 7878551, 7877845],
       [7877845, 7878551, 7878552],
       [7877845, 7878552, 7877846]], dtype=int64)
    ```
- offsets

  Offset values for vertex coordinates. You need to pass this to `sim_image` and `reverse_proj`.
  ```
  >>> offsets
  array([7.31942032e+05, 2.15609204e+03, 4.04854197e+06])
  ```
Next, render a simulated landscape image. You can optionally use `min_distance` to mask pixels closer than a specified distance (useful for preventing mismatches with near-field objects).
```python
sim = sim_image(vert, col, ind, params, offsets, min_distance=100)  # mask closer than 100m
cv2.imwrite("sim_initial.png", sim)
```
![](_static/initial.png)

Every pixel in this image has geographic coordinates. Then you can get a table of image coordinates and geographic coordinates of it.
```python
df = reverse_proj(sim, vert, ind, params, offsets)
```
```
>>> df
             u     v            x           y            z      B      G      R
2058832   3376   366  734200.3125  4050691.75  2988.827881  116.0  120.0  124.0
2058833   3377   366  734199.6875  4050691.75  2988.624268  106.0  110.0  113.0
2058834   3378   366  734198.7500  4050691.25  2988.337402   82.0   86.0   88.0
2058835   3379   366  734198.0000  4050691.25  2988.081543   70.0   75.0   78.0
2058836   3380   366  734197.3750  4050691.25  2987.862061   60.0   65.0   68.0
...        ...   ...          ...         ...          ...    ...    ...    ...
21026299  5611  3743  732740.3125  4051161.75  2453.355469  113.0  117.0  148.0
21026300  5612  3743  732740.3125  4051161.75  2453.355469  113.0  117.0  148.0
21026301  5613  3743  732740.3125  4051161.75  2453.355713  113.0  117.0  148.0
21026302  5614  3743  732740.3125  4051161.75  2453.355713  113.0  117.0  148.0
21026303  5615  3743  732740.3125  4051161.75  2453.355713  113.0  117.0  148.0

[17336750 rows x 8 columns]
```

## Finding Ground Contorol Points
Then, you can add some Ground Contorol Points (GCPs) in the target image by matching target image and simulated image.

```python
path_org = "target_image.jpg"
path_sim = "sim_initial.png"
```

### Recommended Matching Methods

#### 1. SIFT (Lightweight, No Extra Dependencies)
[SIFT](https://docs.opencv.org/3.4/da/df5/tutorial_py_sift_intro.html) is a classic feature matching method that works with the core modules only. It's lightweight and suitable for images with similar resolution and good texture.

```python
match, plot = image_match(path_org, path_sim, method="sift", plot_result=True)
cv2.imwrite("matched.png", plot)
gcps = set_gcp(match, df)
```

#### 2. SuperPoint-LightGlue (Robust to Resolution Differences, Lightweight)
[SuperPoint](https://arxiv.org/abs/1712.07629) with [LightGlue](https://arxiv.org/abs/2306.13643) provides robust matching even when the simulated image and real photograph have different resolutions. It's relatively lightweight and fast.

Requires the `vismatch` package:
```bash
pip install alproj[vismatch]
```

```python
match, plot = image_match(path_org, path_sim, method="superpoint-lightglue", plot_result=True)
cv2.imwrite("matched.png", plot)
gcps = set_gcp(match, df)
```

#### 3. MiniMa-RoMa (High Performance, Computationally Heavy)
[MiniMa-RoMa](https://arxiv.org/abs/2305.15404) is a dense matching method that achieves excellent performance on challenging image pairs. It's computationally heavy but provides the best matching quality for difficult cases.

Requires the `vismatch` package:
```bash
pip install alproj[vismatch]
```

```python
match, plot = image_match(path_org, path_sim, method="minima-roma", plot_result=True)
cv2.imwrite("matched.png", plot)
gcps = set_gcp(match, df)
```

### Comparison of Matching Methods

The table below shows the comparison of all available methods on a 5616x3744 pixel image pair (CPU on M4 Pro MBP).

**Built-in methods (no extra dependencies):**

| Method | Time | Matches | Notes |
|--------|------|---------|-------|
| akaze | ~1 sec | 176 | Fastest, fewer matches |
| sift | ~4 sec | 331 | Good balance for simple cases |

**LightGlue-based methods (lightweight, handles full resolution):**

| Method | Time | Matches | Notes |
|--------|------|---------|-------|
| sift-lightglue | ~3 sec | 518 | SIFT features + LightGlue matcher |
| superpoint-lightglue | ~4 sec | 962 | Recommended for most cases |
| aliked-lightglue | ~6 sec | 815 | ALIKED features + LightGlue matcher |
| minima-superpoint-lightglue | ~29 sec | 973 | MiniMa preprocessing + SuperPoint |

**Semi-dense methods:**

| Method | Time | Matches | Notes |
|--------|------|---------|-------|
| xfeat-star | ~14 sec | 112 | XFeat semi-dense matching |

**Dense matching methods (high match count, auto-resized to 640px):**

| Method | Time | Matches | Notes |
|--------|------|---------|-------|
| tiny-roma | ~2 sec | 2047 | Fast dense matching |
| loftr | ~2 sec | 2198 | Good for low-texture regions |
| minima-loftr | ~3 sec | 2629 | MiniMa + LoFTR |
| ufm | ~10 sec | 2048 | UFM matching |
| rdd | ~14 sec | 1494 | RDD matching |
| roma | ~23 sec | 2048 | RoMa dense matching |
| master | ~327 sec | 3515 | MASTER matching (large model) |
| minima-roma | GPU only | - | Best quality, large model requires CUDA |
| minima-roma-tiny | ~2 sec (est.) | - | CPU-compatible variant of MiniMa-RoMa |

**Note:** Dense matching methods (non-LightGlue) automatically resize images to 640px when `resize` is not specified to prevent out-of-memory errors. Keypoints are scaled back to original coordinates.

You can reproduce this comparison using the script `scripts/compare_matching_methods.py`.

**SIFT result:**
![](_static/matched_sift.png)

**SuperPoint-LightGlue result:**
![](_static/matched_superpoint_lightglue.png)

**MiniMa-RoMa result:**
![](_static/matched_minima_roma.png)

For most cases, **SuperPoint-LightGlue** provides a good balance between speed and robustness. Use **SIFT** when you cannot install additional dependencies, or **MiniMa-RoMa** when you need the highest matching quality for difficult image pairs.

### Outlier Filtering

By default, `image_match()` applies Fundamental Matrix filtering (`outlier_filter="fundamental"`).
You can also use Essential Matrix filtering when camera parameters are available:

```python
# Essential Matrix filtering (recommended when params with fov is available)
match, plot = image_match(
    path_org, path_sim,
    method="minima-roma",
    outlier_filter="essential",  # "essential", "fundamental", or "none"
    params=params,               # camera params with fov, w, h (focal length computed automatically)
    threshold=10.0,              # MAGSAC threshold in pixels
    plot_result=True
)
```

**Outlier filtering methods:**
- `"fundamental"`: Fundamental Matrix with MAGSAC++ (default, no camera intrinsics required)
- `"essential"`: Essential Matrix with MAGSAC++ (recommended when `params` with `fov` is provided)
- `"none"`: No filtering. Use this when you plan to apply custom filtering later.

### Spatial Thinning

When using dense matching methods like MiniMa-RoMa, matches may cluster in certain regions of the image. Use spatial thinning to ensure uniform distribution:

```python
# Match with spatial thinning (keeps at most 1 point per 100x100 pixel region)
match, plot = image_match(
    path_org, path_sim,
    method="minima-roma",
    spatial_thin_grid=50,           # Grid cell size in pixels
    spatial_thin_selection="center", # "first", "random", or "center"
    device="cuda",
    plot_result=True
)
```

![](_static/matched_minima_roma_thined.png)

**Selection methods:**
- `"first"`: Keeps the first point by input order (fastest, deterministic)
- `"random"`: Random selection (use `spatial_thin_random_state` for reproducibility)
- `"center"`: Keeps the point closest to the cell center (best for uniform distribution)

Spatial thinning is applied AFTER geometric outlier filtering, so it samples from inliers only.

### Distance-based GCP Filtering

After creating GCPs with `set_gcp()`, you can filter them based on 3D distance from the camera:

```python
from alproj.gcp import image_match, set_gcp, filter_gcp_distance

# Create GCPs from matches
gcps = set_gcp(match, df)

# Filter: exclude points closer than 100m or farther than 2000m
gcps = filter_gcp_distance(gcps, params, min_distance=100, max_distance=2000)
```

This is useful for:
- Excluding nearby foreground objects (min_distance)
- Excluding distant features with poor depth accuracy (max_distance)

**Important:** Coordinates must be in a projected CRS (e.g., UTM) for accurate Euclidean distance. Using lat/lon directly will produce incorrect results.

### All Available Methods
**Built-in (no extra dependencies):**
- **akaze**: [AKAZE](https://docs.opencv.org/3.4/d0/de3/citelist.html#CITEREF_ANB13) local features
- **sift**: [SIFT](https://docs.opencv.org/3.4/da/df5/tutorial_py_sift_intro.html) local features

**With vismatch package (pip install alproj[vismatch]):**

Any method supported by [vismatch](https://github.com/gmberton/vismatch) (70+ models) can be used. Some popular choices:

- **sift-lightglue**, **superpoint-lightglue**, **disk-lightglue**, **aliked-lightglue**, **dedode-lightglue**: LightGlue-based (lightweight, handles full resolution)
- **xfeat**, **xfeat-lightglue**, **xfeat-star**: XFeat variants (fast and lightweight)
- **roma**, **romav2**, **tiny-roma**, **minima-roma**: RoMa variants (dense matching)
- **loftr**, **eloftr**, **minima-loftr**: LoFTR variants (good for low-texture regions)
- **ufm**, **rdd**, **master**, **duster**, **edm**: Other dense matching methods

See the [vismatch README](https://github.com/gmberton/vismatch) for the full list.

```
>>> gcps
         u     v              x             y            z
0     1468  2751  733134.120287  4.051270e+06  2367.014130
1     2362  1271  733733.878100  4.051172e+06  2613.651184
2     5116  2709  732801.371752  4.051120e+06  2441.953796
3      644  2253  733323.715014  4.051434e+06  2401.383240
4     3944   953  733846.427660  4.050684e+06  2738.904907
...    ...   ...            ...           ...          ...
1122  2850  1748  733444.243822  4.051082e+06  2480.453796
1123  3950  1132  733707.379809  4.050744e+06  2655.842163
1124  1043  1969  733419.895678  4.051407e+06  2440.836182
1125   750  1765  733541.844896  4.051506e+06  2479.430786
1126  3454   824  733863.009447  4.050836e+06  2778.318420

[1127 rows x 5 columns]
```
Where `u` and `v` stands for the x and y axis coordinates in the image coordinate system.

## Optimization of Camera Parameters
Finally, optimizing camera parameters using GCPs.
Camera parameters are optimized by minimizing [reproection errors](https://support.pix4d.com/hc/en-us/articles/202559369-Reprojection-error).
You can specify which parameters to be optimized, including camera position (x, y, z).

Two optimizers are available:

### CMAOptimizer (Recommended for Global Search)

`CMAOptimizer` uses [CMA-ES](https://github.com/CyberAgent/cmaes) evolution strategy.
Good for difficult cases when the initial guess is far from optimal.
Supports Huber loss (`f_scale` parameter) for robustness against outliers.

```python
obj_points = gcps[["x","y","z"]]
img_points = gcps[["u","v"]]

# Phase 1: Optimize position, orientation, fov, and aspect ratio
cma_optimizer = CMAOptimizer(obj_points, img_points, params)
cma_optimizer.set_target(["x", "y", "z", "fov", "pan", "tilt", "roll", "a1", "a2"])
params_optim, error = cma_optimizer.optimize(
    generation=300,
    sigma=1.0,
    population_size=50,
    f_scale=10.0  # Huber loss threshold in pixels (robust to outliers)
)
print("Error:", error)
```

After Phase 1, you can perform image matching again with the improved parameters and optimize distortion parameters:

```python
# Phase 2: Optimize distortion parameters with refined matches
cma_optimizer = CMAOptimizer(gcps[["x","y","z"]], gcps[["u","v"]], params_optim)
cma_optimizer.set_target(["k1", "k2", "k3", "k4", "k5", "k6", "p1", "p2", "s1", "s2", "s3", "s4"])
params_optim, error = cma_optimizer.optimize(
    generation=300,
    sigma=1.0,
    population_size=50,
    f_scale=10.0
)
```

**Key parameters:**
- `generation`: Number of generations to run (more = better convergence, slower)
- `sigma`: Initial standard deviation in normalized [0, 1] space
- `population_size`: Number of candidate solutions per generation
- `f_scale`: Huber loss threshold in pixels (if None, uses RMSE)
- `bound_widths`: Width from initial value for each parameter (default: ±45° for angles, ±30m for position, ±0.2 for distortion)

### LsqOptimizer (Fast Local Refinement)

`LsqOptimizer` uses scipy's `least_squares` with Trust Region Reflective algorithm.
Much faster than CMA-ES but requires a good initial guess. Supports robust loss functions.

```python
lsq_optimizer = LsqOptimizer(obj_points, img_points, params)
lsq_optimizer.set_target(["k1", "k2", "k3", "k4", "k5", "k6", "p1", "p2", "s1", "s2", "s3", "s4"])
params_optim, error = lsq_optimizer.optimize(
    method="trf",        # "trf", "dogbox", or "lm"
    loss="huber",        # "linear", "huber", "soft_l1", "cauchy", "arctan"
    f_scale=10.0,        # threshold for robust loss functions
    max_nfev=1000        # maximum number of function evaluations
)
```

**Key parameters:**
- `method`: Algorithm to use ("trf" recommended, "lm" does not support bounds/robust loss)
- `loss`: Loss function ("huber" and "soft_l1" are robust to outliers)
- `f_scale`: Soft threshold for residuals (larger = more tolerance to outliers)

```
The optimized camera parameters reproduces the target image exactly.
```python
sim_optim = sim_image(vert, col, ind, params_optim, offsets)
cv2.imwrite("sim_optimized.png", sim_optim)
```
![](_static/optimized.png)

## Reverse Projection

Now you can get geographic coordinates of each pixel of the target image.
```python
original = cv2.imread("target_image.jpg")
georectified = reverse_proj(original, vert, ind, params_optim, offsets)
```

```
>>> georectified
             u     v              x             y            z      B      G      R
3030570   3546   539  734196.643725  4.050693e+06  2987.922119  193.0  153.0  128.0
3030571   3547   539  734195.270678  4.050693e+06  2987.445068  195.0  155.0  130.0
3030572   3548   539  734193.905932  4.050693e+06  2986.971313  192.0  154.0  124.0
3030573   3549   539  734192.899340  4.050693e+06  2986.625610  187.0  149.0  119.0
3030574   3550   539  734192.235033  4.050693e+06  2986.404175  186.0  149.0  115.0
...        ...   ...            ...           ...          ...    ...    ...    ...
20655647  5615  3677  732743.775072  4.051166e+06  2452.745117  130.0  174.0  191.0
20661260  5612  3678  732743.765063  4.051166e+06  2452.746643  115.0  161.0  179.0
20661261  5613  3678  732743.764086  4.051166e+06  2452.746887  115.0  161.0  179.0
20661262  5614  3678  732743.764086  4.051166e+06  2452.746887  119.0  165.0  182.0
20661263  5615  3678  732743.762987  4.051166e+06  2452.747192  124.0  170.0  187.0

[15678803 rows x 8 columns]
```

## Exporting to GeoTIFF

You can convert the reverse projection output directly to a GeoTIFF using the built-in `to_geotiff()` function:

```python
from alproj.project import to_geotiff

# Convert to GeoTIFF with automatic rasterization and interpolation
to_geotiff(
    georectified,
    "output.tif",
    resolution=1.0,           # Pixel resolution in coordinate units (e.g., meters)
    crs="EPSG:6690",          # Coordinate Reference System
    bands=["R", "G", "B"],    # Which columns to use as bands
    interpolate=True,         # Fill small gaps using focal statistics
    max_dist=1.0,             # Maximum interpolation distance
    agg_func="mean",          # Aggregation function: "mean", "median", "max", "min"
    nodata=255                # NoData value for missing pixels
)
```

You can also export to CSV for use with other tools:
```python
georectified.to_csv("georectified.csv", index=False)
```

Result Plot

![](_static/ortho.png)