mapv10 Generator

This document covers the Rust generator process: how to invoke it, how the pipeline orchestrator threads stages and products together, what each numbered stage actually computes, and where the determinism and precision boundaries live. The high-level system diagram, the workspace layout, the scale-preset table, and the renderer-side contract live in architecture.md. This file does not duplicate that material; it focuses on generator internals. The 4-step wave dispatch protocol that governs changes to this stack lives in wave-protocol.md; the visual regression fixtures and per-preset perf budgets live in scenarios.md; the recipe for adding new generator stages or wiring new products into the renderer lives in extending.md.

Build and Run

The generator is a Rust 2021-edition crate at examples/map/mapv10/generator/ with mapv10-generator as the binary target. Build it directly with cargo:

cargo build --release --manifest-path examples/map/mapv10/generator/Cargo.toml

The CLI is parsed by generator/src/main.rs. All flags are optional; running with no flags falls through to defaults.

| Flag | Type | Default | Purpose |
| --- | --- | --- | --- |
| --output <dir> | path | (none) | Write the run directly to this directory |
| --run-id <id> | string | derived from --output basename or generated | Override the run id baked into manifest.json |
| --force | bool | false | Allow --output to replace an existing directory |
| --scale-preset <id> | enum | province-slice (main.rs) | One of province-slice, regional-slice, realm-slice, continent |
| --seed <u64> | u64 | 20260502 (config.rs) | Deterministic seed |
| --world-width-km <f64> | f64 | preset value | Override preset width |
| --world-height-km <f64> | f64 | preset value | Override preset height |
| --raster-width <usize> | usize | preset value | Override heightfield width in cells |
| --raster-height <usize> | usize | preset value | Override heightfield height in cells |

When --output <dir> is supplied, the run is written directly into that directory. prepare_output at main.rs first checks existence: if the directory exists and --force is not set the build aborts; if --force is set the directory is removed and recreated.

When --output is omitted, the binary writes the run into a working subdirectory of ../artifacts/mapv10/ named .building-<unix-millis>-seed-<n>/ (main.rs), then atomically renames it to run-<unix-millis>-seed-<n>-<scale-preset>/ after every stage has written successfully (main.rs). This means an interrupted run never appears at the canonical name; only fully written runs are visible. --force has no effect in this default-path mode because the destination name is unique per invocation.
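The stage-then-rename behaviour can be sketched with the standard library alone. This is an illustration, not the code in main.rs: the function name write_run_atomically and the write_stages callback are hypothetical, and only the two directory name patterns are taken from the text above.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes a run into a hidden staging directory, then renames it into place.
/// `write_stages` is a hypothetical stand-in for the real pipeline.
fn write_run_atomically<F>(
    artifacts_root: &Path,
    unix_millis: u128,
    seed: u64,
    scale_preset: &str,
    write_stages: F,
) -> io::Result<PathBuf>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    let staging = artifacts_root.join(format!(".building-{}-seed-{}", unix_millis, seed));
    let published =
        artifacts_root.join(format!("run-{}-seed-{}-{}", unix_millis, seed, scale_preset));

    fs::create_dir_all(&staging)?;
    // If any stage fails, the error propagates here and the canonical
    // run-... name is never created.
    write_stages(&staging)?;
    // Renaming within one filesystem replaces the name in a single step.
    fs::rename(&staging, &published)?;
    Ok(published)
}
```

A caller would pass a closure that writes every stage into the staging path; a failure inside the closure leaves only the .building-... directory behind.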

The viewer ships a thin Node script at viewer/scripts/bootstrap-continent-fixture.mjs that wraps the cargo invocation and the post-generate Valenar validation. Running:

npm run fixture:continent --prefix examples/map/mapv10/viewer

invokes cargo run --release -- --scale-preset continent --output viewer/public/continent-lod6 --run-id continent-lod6 --force (bootstrap-continent-fixture.mjs), then calls validateValenarExport(runRoot) to confirm stage 15 emitted schema-valid valenar/world-<seed>.json and valenar/world-<seed>.mesh.json products. architecture.md § Workspace Layout documents how viewer/public/<run-id>/ is served at /mapv10/runs/<run-id>/ by the dev (5443) and preview (4279) servers.

Pipeline Architecture

The orchestrator entry point is build_products(config, run_dir) at pipeline.rs. It instantiates a single BuildRegistry (pipeline.rs) that accumulates the run's outputs:

  • config: GeneratorConfig — the resolved scale preset and overrides.
  • products: Vec<ProductRef> — every truth product written to disk.
  • previews: Vec<PreviewRef> — every PNG preview from stage 14.
  • stages: Vec<StageManifestRef> — per-stage stage-manifest.json references.
  • validation_reports: Vec<(String, ValidationReport)> — keyed by <id>-<key>.

Stages run sequentially on the main thread. Each stage is wrapped in the local StageWrite struct (pipeline.rs):

struct StageWrite<'a> {
    id: u8,
    key: &'a str,
    name: &'a str,
    description: &'a str,
    dependencies: Vec<String>,
    products: Vec<ProductRef>,
    previews: Vec<PreviewRef>,
    validation: ValidationReport,
    contract: serde_json::Value,
}

add_stage (pipeline.rs) folds the wrapper into the registry: it extends the running products and previews vectors, appends the validation report keyed by <id>-<key>, then calls write_stage_files which lays out the per-stage subdirectory <NN-key>/ and emits a stage-index.json describing that stage's declared products, previews, dependencies, validation status, and contract value. Validation runs per stage but failures only append to validation_reports; the orchestrator does not abort on a failed stage report.
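The folding step can be pictured with a reduced model. The types below are simplified placeholders (strings instead of ProductRef and PreviewRef, a bool instead of ValidationReport), and the two-digit zero padding of the directory name is an assumption read off the <NN-key>/ pattern rather than confirmed from pipeline.rs.

```rust
/// Reduced stand-in for BuildRegistry; field types are simplified placeholders.
#[derive(Debug, Default)]
struct Registry {
    products: Vec<String>,
    previews: Vec<String>,
    validation_reports: Vec<(String, bool)>,
}

/// Reduced stand-in for StageWrite.
struct Stage<'a> {
    id: u8,
    key: &'a str,
    products: Vec<String>,
    previews: Vec<String>,
    validation_ok: bool,
}

impl Registry {
    /// Folds one stage into the registry and returns the stage directory name.
    fn add_stage(&mut self, stage: Stage<'_>) -> String {
        self.products.extend(stage.products);
        self.previews.extend(stage.previews);
        // A failed report is recorded, not raised: the run keeps going.
        let report_key = format!("{}-{}", stage.id, stage.key);
        self.validation_reports.push((report_key, stage.validation_ok));
        // Assumed layout: two-digit stage id, then the stage key.
        format!("{:02}-{}", stage.id, stage.key)
    }
}
```

The point of the sketch is the control flow: a stage whose validation fails still contributes its products and previews, and the failure is visible only through validation_reports.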

After the final stage has been added, write_run_envelope (pipeline.rs, defined in artifacts.rs) writes the run-level manifest.json with the mapv10-artifacts-v1 schema version (lib.rs), the consolidated stage index, and the generation metadata.

The pipeline itself is single-threaded. The only intra-pipeline parallelism is internal to stage 3 (erosion), which uses rayon double-buffered row-parallel updates inside the inner iteration loop. Stages are not run in parallel with each other.

Stage Reference

The pipeline emits 16 stages numbered 0..15. The full stage table lives in architecture.md § Generator Pipeline; this section describes what each stage actually computes.

Stage 0 — config

Validates the resolved GeneratorConfig and serializes it as the canonical config.json truth product (stages/config.rs).

  • Technique: parameter validation. No algorithm.
  • Key parameters: scale preset, seed, world bounds, raster dims, elevation/slope ranges.
  • Algorithm: validate_config checks the preset is in the known set, that bounds and raster dims are positive, and that derived province / Location count bands are well-formed.
  • Products: config.json.
  • Depends on: nothing.

Stage 1 — continent

Generates the closed land polygon, the coastline polyline, and the sea region shell with the land as a hole (stages/continent.rs).

  • Technique: parametric ellipse with three sinusoidal lobe harmonics. Hand-authored, not noise-based, not tectonic.
  • Key parameters: center_x = world_width_km * 0.5, center_y = world_height_km * 0.52, radius_x = world_width_km * 0.39, radius_y = world_height_km * 0.37, samples = 40, phase = sin(seed as f64) (continent.rs).
  • Algorithm: 40 sample points at angle = TAU * i / 40. The per-angle radius multiplier is 1 + 0.10 * sin(3a + phase) + 0.07 * cos(5a + phase * 0.37) + 0.04 * sin(9a + phase * 1.9). Points are clamped to [0.04 W, 0.96 W] x [0.04 H, 0.96 H] and the ring is closed with an explicit final point equal to the first.
  • Products: continentPolygons.json, coastlines.json, seaRegions.json.
  • Depends on: stage 0.
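The ring construction is small enough to sketch in full. The constants come from the parameters listed above; how the multiplier is applied to the two radii (scaling both rx and ry along cos and sin of the angle) is an assumption, and continent_ring is a hypothetical name rather than the function in continent.rs.

```rust
use std::f64::consts::TAU;

/// Returns a closed ring of 41 points (40 samples plus a repeat of the first).
fn continent_ring(world_width_km: f64, world_height_km: f64, seed: u64) -> Vec<(f64, f64)> {
    const SAMPLES: usize = 40;
    let (w, h) = (world_width_km, world_height_km);
    let (cx, cy) = (w * 0.5, h * 0.52);
    let (rx, ry) = (w * 0.39, h * 0.37);
    let phase = (seed as f64).sin();

    let mut ring: Vec<(f64, f64)> = (0..SAMPLES)
        .map(|i| {
            let a = TAU * i as f64 / SAMPLES as f64;
            // Three lobe harmonics perturb the base ellipse radius.
            let lobes = 1.0
                + 0.10 * (3.0 * a + phase).sin()
                + 0.07 * (5.0 * a + phase * 0.37).cos()
                + 0.04 * (9.0 * a + phase * 1.9).sin();
            // Assumed application of the multiplier to both radii.
            let x = (cx + a.cos() * rx * lobes).clamp(0.04 * w, 0.96 * w);
            let y = (cy + a.sin() * ry * lobes).clamp(0.04 * h, 0.96 * h);
            (x, y)
        })
        .collect();

    // Close the ring explicitly with a copy of the first point.
    let first = ring[0];
    ring.push(first);
    ring
}
```

Because the multiplier stays within 1 ± 0.21, the shape is always a single star-convex blob around the centre; the clamp only matters near the world edge.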

Stage 2 — geography-graph

Generates the deterministic ridge chain and basin polygons that bias the heightfield (stages/geography_graph.rs).

  • Technique: hand-authored parametric chain plus two parametric ellipses. No noise, no procedural sampling.
  • Key parameters: 6 ridge nodes (geography_graph.rs); each at t = i/5, x = W*(0.18 + t*0.66), y = H*(0.25 + t*0.42 + sin(t*TAU)*0.07), elevation_bias = 1050 + 260 * sin(t*PI). 5 ridge edges connecting consecutive nodes with width_km = W*(0.07 - i*0.003).max(0.035). 2 basins at fixed world fractions: (W*0.29, H*0.63) and (W*0.68, H*0.72) (geography_graph.rs), with depression bias 130 m and 190 m respectively.
  • Algorithm: nodes and basins are emitted directly from the closed-form parametric definitions; basin polygons are the 20-sample closed ellipses from closed_ellipse(center, rx, ry, 20).
  • Products: ridgeGraph.json, basins.json.
  • Depends on: stage 1.
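The closed-form ridge definitions translate directly into code. The sketch below evaluates the node and edge formulas quoted above; the RidgeNode struct and the function names are hypothetical, and the real types in geography_graph.rs may carry more fields.

```rust
use std::f64::consts::{PI, TAU};

/// Hypothetical reduced node type; only the three quoted quantities are kept.
#[derive(Debug, Clone, Copy)]
struct RidgeNode {
    x_km: f64,
    y_km: f64,
    elevation_bias: f64,
}

/// Six nodes at t = i / 5 for i in 0..6.
fn ridge_nodes(w: f64, h: f64) -> Vec<RidgeNode> {
    (0..6)
        .map(|i| {
            let t = i as f64 / 5.0;
            RidgeNode {
                x_km: w * (0.18 + t * 0.66),
                y_km: h * (0.25 + t * 0.42 + (t * TAU).sin() * 0.07),
                elevation_bias: 1050.0 + 260.0 * (t * PI).sin(),
            }
        })
        .collect()
}

/// Five edge widths; the max() floor never binds for i in 0..5.
fn ridge_edge_widths_km(w: f64) -> Vec<f64> {
    (0..5)
        .map(|i| w * (0.07 - i as f64 * 0.003).max(0.035))
        .collect()
}
```

The chain runs from 18% to 84% of the world width, and the sine term in elevation_bias makes the middle of the chain the highest part.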

Stage 3 — heightfield

Computes the eroded elevation raster and its derivatives. This is the most expensive stage, the source of every downstream raster, and the only stage with internal parallelism (stages/heightfield/mod.rs, stages/heightfield/erosion.rs).

  • Technique: rotated-gradient fBm noise generation (Perlin-style with per-octave rotation) over a continental macro-relief template, biased by ridge and basin influence fields, followed by Mei-Decaudin-Hu virtual-pipe hydraulic erosion (Mei et al. 2007) with Šťava-Beneš sediment transport (Šťava et al. 2008) and isotropic talus thermal relaxation also from Šťava 2008.

  • Key parameters (from erosion.rs):

    • hydraulic_iterations = 4_000
    • thermal_every = 10 (every Nth hydraulic iteration runs one thermal pass)
    • dt = 0.02 seconds
    • rainfall_rate = 0.012 m/s
    • pipe_area_base = 20.0 m^2 (linearly scaled by cell pitch)
    • gravity = 9.81 m/s^2
    • capacity_kc = 0.10
    • dissolving_ks = 0.5
    • deposition_kd = 1.0
    • evaporation_ke = 0.015 1/s
    • min_water_for_capacity = 0.01 m
    • talus_angle_radians = PI * 33 / 180 (33 degrees)
    • thermal_max_move_per_step_meters = 1.0
    • dx_reference_meters = 1.0
  • Algorithm: the noise pass at mod.rs walks every cell at (col + 0.5, row + 0.5) * cell_size and combines: inland_uplift = 35 + smoothstep(12, 280, coast_distance) * 390, continental_macro_relief (a TERRAIN_MACRO_NOISE_OCTAVES fBm scaled by interior smoothstep with directional tilts and lowland features), ridge influence (per-edge gaussian field jagged by RIDGE_JAGGED_NOISE_OCTAVES fBm plus per-node gaussian peaks), basin influence (subtracted), and detail relief (TERRAIN_DETAIL_NOISE_OCTAVES scaled by 58 m). Off-land cells get a shelf falloff of -160 - min(coast_distance * 11, 850) m. Every cell is clamped into elevation_range_meters = [-1200, 3200] before erosion runs.

    Erosion runs in-place on the height raster (erosion.rs). Each of the 4000 iterations: rain, flux update, water update, sediment transport, and (every 10th iteration) thermal relaxation. The flux update uses the canonical pipe-model Q = max(prev + dt * A * g * dh / l, 0) with absorbing boundaries on the world edge, then volume-conservation rescales outflow so the total per-step transfer never exceeds the cell's available water. State is held in two read/write buffers per field (read_water/write_water, read_flux/write_flux, read_sediment/write_sediment) swapped via mem::swap at end-of-iteration. Inner loops use rayon par_chunks_mut(width).enumerate() so each row computes against the immutable read_* buffers in parallel without races. After erosion the height field is re-clamped to elevation_range_meters (mod.rs) and slope and normals are computed by compute_slope (Horn 1981 central-difference slope magnitude) and compute_normals (centered-difference normals encoded as RG16).

  • Products: height.f32.bin, slope.f32.bin, normal.rg16.bin, sediment.f32.bin, flowAccumulation.f32.bin.

  • Depends on: stages 1, 2.
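The flux update and the volume-conservation rescale are the core of the pipe model, so a single-cell sketch may help. This is not the code in erosion.rs: it handles one cell in isolation, omits the double buffering and the rayon row loop, and the function name and argument layout are hypothetical.

```rust
/// Outflow through the four virtual pipes of one cell.
/// `height_diff[k]` is this cell's water surface minus neighbour k's surface.
fn update_outflow(
    prev_flux: [f32; 4],
    height_diff: [f32; 4],
    water_depth: f32,
    dt: f32,
    pipe_area: f32,
    gravity: f32,
    pipe_length: f32,
    cell_area: f32,
) -> [f32; 4] {
    let mut flux = [0.0_f32; 4];
    for k in 0..4 {
        // Q = max(prev + dt * A * g * dh / l, 0): water only flows downhill.
        flux[k] =
            (prev_flux[k] + dt * pipe_area * gravity * height_diff[k] / pipe_length).max(0.0);
    }

    // Volume conservation: never move more water than the cell holds.
    let requested_volume = flux.iter().sum::<f32>() * dt;
    let available_volume = water_depth * cell_area;
    if requested_volume > available_volume && requested_volume > 0.0 {
        let scale = available_volume / requested_volume;
        for f in flux.iter_mut() {
            *f *= scale;
        }
    }
    flux
}
```

With the listed defaults (dt = 0.02, pipe area 20, g = 9.81) and a 1 m head over a 1 m pipe, a previously still cell gains a flux of about 3.924 in one step, provided it holds enough water to supply it.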

Stage 4 — water

Generates rivers and lakes (stages/water.rs).

  • Technique: hand-authored network. Lake polygons are closed-ellipse fits over basin centroids. River nodes and edges are explicit fixed-position lookups from world fractions and ridge-graph node indices.
  • Key parameters: 7 river nodes at fixed positions (water.rs): two mountain sources tied to ridge nodes 1 and 4, two lake confluences, one mid-river confluence at (W*0.46, H*0.55), two sea mouths at (W*0.105, H*0.67) and (W*0.70, H*0.91). 5 river edges with hand-set widths 0.62, 0.86, 0.54, 0.74, 0.92 km and Strahler-style orders 2, 3, 2, 3, 4 (water.rs). Lake radii are world fractions (water.rs): 0.045 W x 0.036 H for lake 0, 0.058 W x 0.047 H for lake 1.
  • Algorithm: lakes are 28-sample closed_ellipse rings around basin centroids. Surface elevation is sampled from the eroded heightfield at the centroid. Rivers are the explicit edge specs above; centerlines are direct straight segments between node points (no path-finding, no D8 routing).
  • Products: riverGraph.json, riverCenterlines.json, lakePolygons.json, waterMask.u8.bin, riverWidth.f32.bin.
  • Depends on: stages 1, 2, 3.

Stage 5 — biomes-materials

Classifies every land cell into a biome label and emits two RGBA8 material-weight bands plus binary forest and wetland masks (stages/biomes_materials.rs).

  • Technique: deterministic per-cell classifier driven by the eroded height, slope, water mask, an ecological fBm noise field, and distance-to-coast / distance-to-water terms.
  • Key parameters: ECOLOGICAL_NOISE_OCTAVES (4 octaves at 320/170/92/48 km wavelengths, biomes_materials.rs). Biome IDs 0..=6: sea, plains, forest, highlands, snow, freshwater, wetland.
  • Algorithm: per-cell, the classifier picks a biome label by threshold on elevation, slope, ecological noise, and proximity to water and coast, then emits soft membership weights into bands A and B. Band A channels are grass/plains (R), rock/highland (G), forest canopy (B), and snow or wetland (A — the GPU NEAREST biome sample picks between the two palettes per fragment). Band B channels are sand/coast (R), bare earth (G), ice/glacier (B), riverbank mud (A). All eight membership channels are continuous smooth values so they survive mean-downsampling through the LOD pyramid without labyrinth artefacts.
  • Products: biome.u8.bin, materialWeights.rgba8.bin, materialWeightsB.rgba8.bin, forestMask.u8.bin, wetlandMask.u8.bin.
  • Depends on: stages 1, 3, 4.

Stage 6 — political

Generates the realm, province, and Location polygon hierarchy plus the per-cell ID rasters and the neighbor graph (stages/political.rs).

The polygon stage emits every realm / province / location with a deliberately empty name field; the dedicated political_naming stage (see "Naming", below) runs immediately after political polygons are emitted and fills those names in from the per-biome procedural namer. The pipeline order is biomes_materials → political::generate (polygon shapes only, names empty) → political_naming::run (samples biome raster, fills names), so the namer always has both the polygons and the biome classification available without the polygon stage needing to know anything about biomes. The legacy 120-name location_name pool and the "Province N" placeholder fallback have been deleted entirely; no fallback path lives in the polygon stage.

  • Technique: Halton-sequence candidate seeding, Mitchell best-candidate selection (Mitchell 1991 with deterministic jitter) for blue-noise spacing, single-pass Lloyd relaxation, and half-plane Voronoi cell clipping over the bounding polygon.

  • Key parameters: candidate pool size (count * 16).max(count + 32) with up to target_candidates * 80 Halton draws (political.rs). Lloyd relaxation iterations = 1 (political.rs).
  • Algorithm: relaxed_seed_points (political.rs) runs blue_noise_seed_points to draw Halton(2,3) candidates inside the bounding polygon, picks the best-candidate by squared distance to the existing selection (with a small splitmix64 jitter so ties break deterministically), then iterates Lloyd relaxation by computing the centroid of each Voronoi cell and replacing the seed if the centroid lies inside the container. voronoi_cell_polygons (political.rs) is naive O(N^2) half-plane clipping: for each seed, it starts with the open container ring and clips against every other seed using the perpendicular bisector. The clip equation is

    2 * (other.x - seed.x) * x + 2 * (other.y - seed.y) * y
    = other.x^2 + other.y^2 - seed.x^2 - seed.y^2

    (political.rs); a point is "inside" (kept) when a*x + b*y - c <= 1e-9. This is performed for the realm, then per-realm provinces, then per-province Locations. Province and Location ID rasters are filled by point-in-polygon classification at (col + 0.5, row + 0.5) * cell_size cell centres (political.rs raster ID fill helpers). The neighbor graph (political.rs) walks the ID rasters in scan order collecting raster-adjacency edges into a BTreeSet for deterministic ordering.

  • Products: realmPolygons.json, provincePolygons.json, locationPolygons.json, provinceId.u32.bin, locationId.u32.bin, neighborGraph.json, provinceColorSeeds.json.

  • Depends on: stages 0, 1, 3.
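One bisector clip can be sketched as a single Sutherland-Hodgman pass. The coefficients and the 1e-9 tolerance follow the clip equation above; the function names and the treatment of the ring as open (no repeated closing point) are assumptions for illustration, not the signatures in political.rs.

```rust
type Point = (f64, f64);

/// Coefficients (a, b, c) of the perpendicular bisector a*x + b*y = c.
fn bisector(seed: Point, other: Point) -> (f64, f64, f64) {
    let a = 2.0 * (other.0 - seed.0);
    let b = 2.0 * (other.1 - seed.1);
    let c = other.0 * other.0 + other.1 * other.1 - seed.0 * seed.0 - seed.1 * seed.1;
    (a, b, c)
}

/// Keeps the part of an open convex ring that lies on the seed's side.
fn clip_to_seed_side(ring: &[Point], seed: Point, other: Point) -> Vec<Point> {
    let (a, b, c) = bisector(seed, other);
    let side = |p: Point| a * p.0 + b * p.1 - c;
    let mut out = Vec::with_capacity(ring.len() + 1);
    for i in 0..ring.len() {
        let cur = ring[i];
        let next = ring[(i + 1) % ring.len()];
        let (s_cur, s_next) = (side(cur), side(next));
        let cur_in = s_cur <= 1e-9;
        let next_in = s_next <= 1e-9;
        if cur_in {
            out.push(cur);
        }
        if cur_in != next_in {
            // The edge crosses the bisector: emit the intersection point.
            let t = s_cur / (s_cur - s_next);
            out.push((cur.0 + t * (next.0 - cur.0), cur.1 + t * (next.1 - cur.1)));
        }
    }
    out
}
```

Folding this function over every other seed yields the Voronoi cell for one seed, which is where the O(N^2) cost comes from: N seeds, each clipped against N - 1 bisectors.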

Stage 7 — routes

Generates the location-to-location connection graph, road network, canonical route graph, route centerlines, and crossing anchors where routes intersect rivers (stages/routes.rs).

  • Technique: pairwise candidate construction over neighbor graph edges, deterministic cost ranking, and selected centerline emission with crossing detection against river centerlines.
  • Algorithm: build_route_nodes materializes one node per location plus province hubs. Candidate edges come from the stage 6 neighbor graph; each candidate carries a deterministic cost combining great-arc distance, terrain cost multiplier from the heightfield slope along the path, and the count of river crossings. The selected route graph is connected by construction.
  • Products: locationConnections.json, roadNetwork.json, routeGraph.json, routeCenterlines.json, crossingAnchors.json.
  • Depends on: stages 3, 4, 6.

Stage 8 — map-features

Emits map feature anchors, footprints, and label anchors (stages/map_features.rs).

  • Technique: deterministic anchor placement over the political hierarchy and routes plus zoom-band metadata for the renderer's label fade and collision system.
  • Algorithm: generate_labels walks realms, provinces, Locations, water, and routes, emitting one LabelAnchor per entity with a zoom-band rank derived from the entity kind. Feature footprints are closed polygon rings tied to declared anchor IDs.
  • Products: mapFeatureAnchors.json, mapFeatureFootprints.json, labelAnchors.json.
  • Depends on: stages 4, 6, 7.

Stage 9 — influence

Builds generic influence truth products, with corruption as the first registered preset (stages/influence.rs).

  • Technique: typed influence registry + deterministic source anchors + per-type intensity masks. Influence never mutates the base biome/material truth products; it emits derived effective visual/material products for rendering and inspection.
  • Algorithm: source anchors are selected from generated map features (wetland, mountain pass, lake shore) with a deterministic Location fallback. Corruption intensity combines smooth radial falloff, organic edge noise, route drainage, slope gain, and water damping into influenceMask.corruption.u8.bin. influenceTypeMask stores 0 for none and the registered type id for active influence. Effective visual/material masks are derived from the base biome, material weights, forest mask, and wetland mask.
  • Products: influence/influenceTypes.json, influence/influenceSources.json, influence/influenceRules.json, influence/influenceMask.corruption.u8.bin, influence/influenceTypeMask.u8.bin, effective/effectiveVisualBiome.u8.bin, effective/effectiveMaterialWeights.rgba8.bin, effective/effectiveMaterialWeightsB.rgba8.bin, effective/effectiveForestMask.u8.bin, effective/effectiveWetlandMask.u8.bin.
  • Depends on: stages 3, 4, 5, 6, 7, 8.

Stage 10 — borders-sdf

Bakes a continent-wide signed-distance field over province IDs plus a 24-bit nearest-province ID channel, encoded as RGBA8 (stages/borders_sdf.rs).

  • Technique: 1+JFA: the Jump Flooding Algorithm (Rong & Tan, 2006) with one corrective pass at step = 1 appended after the standard log2-step passes to restore full Voronoi accuracy.
  • Key parameters: BORDER_SDF_RADIUS_CELLS = 127 (borders_sdf.rs) — the saturation radius of the distance channel.
  • Algorithm: seed_border_cells marks every cell whose province ID differs from at least one of its 4 von Neumann neighbours as a JFA seed at distance zero, carrying the OTHER side's province ID. The main loop computes step = max(width, height).next_power_of_two() / 2 and halves step each pass, running jfa_pass over the current/next buffers and swapping. After the final step-1 pass of the standard schedule, one additional jfa_pass(width, height, 1, ...) corrective pass runs (1+JFA, borders_sdf.rs). The encoded RGBA8 channels are: R = clamp(d, -127, 127) + 128 (signed distance in source cells, positive inside province, negative outside), G = low byte of nearest-province numericId, B = middle byte, A = high byte. The 24-bit ID encoding supports up to 16,777,215 unique provinces; nearest-id zero across all three bytes means "no nearest within search radius".
  • Products: borderSdf.rgba8.bin.
  • Depends on: stage 6.
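The RGBA8 texel layout can be sketched as a pair of encode and decode helpers. The channel assignment follows the description above; rounding the distance to the nearest whole cell before clamping is an assumption, and the function names are hypothetical.

```rust
const BORDER_SDF_RADIUS_CELLS: i32 = 127;

/// Packs a signed distance (in source cells) and a 24-bit province id.
fn encode_border_texel(signed_distance_cells: f32, nearest_province_id: u32) -> [u8; 4] {
    debug_assert!(nearest_province_id <= 0x00FF_FFFF, "id must fit in 24 bits");
    // Assumed: round to the nearest whole cell before saturating.
    let d = signed_distance_cells.round() as i32;
    let r = (d.clamp(-BORDER_SDF_RADIUS_CELLS, BORDER_SDF_RADIUS_CELLS) + 128) as u8;
    [
        r,
        (nearest_province_id & 0xFF) as u8,         // G: low byte
        ((nearest_province_id >> 8) & 0xFF) as u8,  // B: middle byte
        ((nearest_province_id >> 16) & 0xFF) as u8, // A: high byte
    ]
}

/// Recovers (distance in cells, nearest province id); id 0 means "none".
fn decode_border_texel(texel: [u8; 4]) -> (i32, u32) {
    let distance = texel[0] as i32 - 128;
    let id = texel[1] as u32 | (texel[2] as u32) << 8 | (texel[3] as u32) << 16;
    (distance, id)
}
```

With this layout the R channel spans 1..=255, so a value of 128 marks a cell sitting exactly on a border.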

Stage 11 — tile-pyramid

Slices the source rasters into per-LOD per-tile assets with skirts plus the per-tile vector / semantic / mesh manifests (stages/tile_pyramid.rs).

  • Technique: per-channel downsampling with a 1-cell skirt. Continuous channels (height, slope, RGBA8 material weights, forest/wetland coverage masks, SDF distance) use box-mean. Categorical channels (biome u8, province ID u32, location ID u32, SDF nearest-ID GBA bytes) use mode (plurality) downsampling. Slope is re-computed per LOD via Horn 1981 central-difference rather than mean-downsampled. The Toksvig closed-form roughness prefilter (Han 2007 / Olano-Baker 2010) is computed per LOD into a u8 sidecar. Stage 11 also emits the generated RGBA8 closeDetailNormal sidecar; z6/z7 use closeDetailScale = 8 so close-detail raster SSE is backed by authored generated truth rather than a viewer-only noise shader.
  • Key parameters: BORDER_CELLS = 1 (tile_pyramid.rs) — one row of skirt sampled from the SOURCE raster's neighbour cells, not from the next tile's already-downsampled bytes. The output tile extent is (W + 2N) x (H + 2N) per channel.
  • Algorithm: zoom levels come from the preset's LOD ladder (tile_pyramid.rs). For each zoom, the world is divided into tile_count_x x tile_count_y axis-aligned tiles (tile_pyramid.rs). For each tile, a RasterWindow identifies the source-cell range, then the per-channel writers sample with a half-output-cell offset into the source raster and emit the padded extent. Mode downsampling counts source cell values inside each output cell footprint and picks the plurality; it is the only sound choice for a u32 ID raster because mean produces meaningless intermediate IDs. Toksvig roughness encodes alpha = sqrt(2 * (1 - len(mean(N))) / len(mean(N))) as clamp(round(alpha * 255), 0, 255) per output texel (tile_pyramid.rs); the prefilter reads the next-finer LOD's unit normals because variances must add. The z7 normalRoughness sidecar is emitted all-zero because there is no finer LOD to prefilter from at source resolution.
  • Products: tiles/tile-pyramid.json, tiles/tile-coordinate-index.json, tiles/raster-tiles/index.json, tiles/vector-tiles/index.json, tiles/semantic-tiles/index.json, tiles/mesh-tiles/index.json, required z-row family shards under tiles/<family>-tiles/z<z>/y<y>.json, tiles/semantic-display-policy.json, plus per-tile binary assets under tiles/<channel>/z<z>/<x>/<y>.bin. The coordinate index owns generated tile bounds/hierarchy/error truth; family manifest indexes own shard byte lengths so browser and Node proof paths stream row shards instead of reading a multi-GB monolithic tile manifest. Base forestMask and wetlandMask are included as runtime raster tile sidecars for inspector payloads; the viewer does not fetch their full-resolution root products during default run load.
  • Depends on: stages 3, 5, 6, 9, 10.
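Two of the per-texel reductions above are compact enough to sketch: plurality downsampling for ID channels and the Toksvig roughness encode. Both helpers are illustrative rather than copied from tile_pyramid.rs. The tie-break (lowest ID wins) and the lower clamp on the mean-normal length are assumptions added to keep the sketch deterministic and free of division by zero.

```rust
use std::collections::BTreeMap;

/// Plurality vote over the source cells under one output texel.
/// Assumed tie-break: the lowest id wins, which keeps the result deterministic.
fn mode_of(values: &[u32]) -> Option<u32> {
    let mut counts: BTreeMap<u32, usize> = BTreeMap::new();
    for &v in values {
        *counts.entry(v).or_insert(0) += 1;
    }
    let mut best: Option<(u32, usize)> = None;
    // BTreeMap iterates ids in ascending order; only a strictly larger
    // count replaces the current winner.
    for (id, n) in counts {
        if best.map_or(true, |(_, best_n)| n > best_n) {
            best = Some((id, n));
        }
    }
    best.map(|(id, _)| id)
}

/// alpha = sqrt(2 * (1 - len) / len), stored as round(alpha * 255) in a u8.
fn toksvig_roughness_u8(mean_normal: [f32; 3]) -> u8 {
    let [x, y, z] = mean_normal;
    // Assumed guard: clamp the length away from zero and cap it at one.
    let len = (x * x + y * y + z * z).sqrt().clamp(1e-6, 1.0);
    let alpha = (2.0 * (1.0 - len) / len).sqrt();
    (alpha * 255.0).round().clamp(0.0, 255.0) as u8
}
```

A perfectly flat footprint averages to a unit-length normal and encodes as 0, while a footprint whose normals disagree strongly shortens the mean and saturates at 255.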

Stage 12 — meshes

Triangulates terrain, water, and route geometry into f32-LE position + u32-LE index .mesh.bin files plus the family manifests (stages/meshes.rs).

  • Technique: per-tile heightmap-to-grid triangulation for terrain. Earcutting (via the earcutr crate) for sea regions and lake polygons, with land polygons as holes for the sea mesh. Quad-strip ribbon construction for routes from polyline centerlines.
  • Algorithm: terrain tiles iterate tile_coordinates(config) and emit a regular grid of triangulated quads spanning the tile's source window at the tile's sample_step. Vertex elevations are read from the eroded height.f32.bin and converted from metres to kilometres via * 0.001 at write time (meshes.rs mesh vertex write path). Each mesh asset records its originKm = PointKm::new(tile.bounds_km.min_x, tile.bounds_km.min_y) (meshes.rs); vertex positions are stored tile-local so the f32 upload stays inside the precision budget.
  • Products: meshes/mesh-manifest.json, meshes/terrain-meshes.json, meshes/water-meshes.json, meshes/route-ribbons.json, plus per-asset .mesh.bin blobs.
  • Depends on: stages 1, 3, 4, 7, 11.
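The tile-local vertex write can be illustrated with one helper. The subtraction against originKm and the * 0.001 metre-to-kilometre conversion come from the text; the axis order of the output triple and the function name are assumptions made for the sketch.

```rust
/// Converts a world-space sample into a tile-local f32 vertex.
/// Assumed axis order: [x_km, y_km, elevation_km].
fn tile_local_vertex(
    world_x_km: f64,
    world_y_km: f64,
    elevation_m: f32,
    origin_km: (f64, f64),
) -> [f32; 3] {
    [
        // Subtract in f64 first so the f32 only has to hold a small offset.
        (world_x_km - origin_km.0) as f32,
        (world_y_km - origin_km.1) as f32,
        // Metres to kilometres at write time.
        elevation_m * 0.001,
    ]
}
```

Doing the subtraction before the narrowing cast is the design point: an f32 has a 24-bit significand, so storing a few kilometres of offset keeps far more fractional precision than storing thousands of kilometres of absolute position.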

Stage 13 — Readiness

Computes the readiness report covering memory, cache, artifact-size, coordinate-precision, and streaming budgets, plus the continent preset draft when running a non-continent preset (stages/readiness.rs).

  • Technique: aggregation over the registry of products, previews, tiles, and mesh assets emitted by stages 0..12.
  • Key parameters: F32_UNIT_ROUNDOFF_FACTOR = 1.0 / 8_388_608.0 (readiness.rs), MAX_ALLOWED_F32_ROUNDOFF_KM = 0.005 (readiness.rs). Per-preset budgets come from the preset table at config.rs.
  • Algorithm: artifact size aggregates byte length over all products and previews; the streaming report counts overview vs detail mesh tiles; the coordinate precision report folds local_bounds_km over every mesh asset to find the maximum tile-local extent and reports max_local_extent_km * F32_UNIT_ROUNDOFF_FACTOR against the 0.005 km ceiling.
  • Products: readiness/readiness-report.json, readiness/continent-preset-draft.config.json.
  • Depends on: stages 11, 12.
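The coordinate-precision check reduces to one multiplication and one comparison. The two constants are the ones quoted above (1 / 8_388_608 is 2^-23); the function names and the flat slice of extents are hypothetical simplifications of the fold over mesh assets.

```rust
const F32_UNIT_ROUNDOFF_FACTOR: f64 = 1.0 / 8_388_608.0; // 2^-23
const MAX_ALLOWED_F32_ROUNDOFF_KM: f64 = 0.005;

/// Worst-case f32 round-off for the largest tile-local extent, in km.
fn worst_case_roundoff_km(local_extents_km: &[f64]) -> f64 {
    let max_extent = local_extents_km.iter().copied().fold(0.0_f64, f64::max);
    max_extent * F32_UNIT_ROUNDOFF_FACTOR
}

/// True when every mesh asset stays inside the precision ceiling.
fn within_precision_budget(local_extents_km: &[f64]) -> bool {
    worst_case_roundoff_km(local_extents_km) <= MAX_ALLOWED_F32_ROUNDOFF_KM
}
```

As a worked figure, a 512 km tile-local extent gives 512 / 8_388_608 ≈ 0.000061 km, about 6 cm, well under the 5 m ceiling; the ceiling is only reached at a local extent of roughly 41,943 km.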

Stage 14 — previews

Renders human-readable PNGs derived from the typed truth products (stages/previews.rs). Previews are derived artifacts only; they do not feed any downstream stage.

  • Technique: per-channel PNG encoder calls (png crate).
  • Algorithm: each preview reads its source raster from the registry and emits an 8-bit PNG colourized for the channel kind (heat ramp for height/slope/sediment/flow accumulation; categorical palette for biome and political IDs; binary mask shading for forest/wetland; layered overlay for the political composite).
  • Products: previews/height-preview.png, previews/slope-preview.png, previews/normal-preview.png, previews/sediment-preview.png, previews/flow-accumulation-preview.png, previews/water-preview.png, previews/biome-preview.png, previews/splat-preview.png, previews/forest-preview.png, previews/wetland-preview.png, previews/province-id-preview.png, previews/location-id-preview.png, previews/political-preview.png, previews/influence-corruption-preview.png, previews/influence-type-preview.png, previews/effective-visual-biome-preview.png, previews/effective-splat-preview.png.
  • Depends on: stage 8 (declared dependency); reads outputs of stages 3, 4, 5, 6, 9.

Stage 15 — valenar-worlddata

Projects the generator's typed truth into the Valenar import JSON plus a mesh manifest, and seals both with a shared SHA-256 content hash (stages/valenar_worlddata.rs).

  • Technique: deterministic projection over the political polygons, neighbor graph, heightfield, water, biomes, and routes.
  • Algorithm: builds the snake_case ValenarWorldDocument with regions, areas, provinces, locations, and anchors keyed by stable numeric IDs derived from stage 6, computes the SHA-256 of the canonical JSON encoding of the world payload, then writes the same hash into both the world and mesh documents so the viewer's Valenar export validator can confirm the pair was emitted from the same generator run. World file path: valenar/world-<seed>.json; mesh manifest path: valenar/world-<seed>.mesh.json (valenar_worlddata.rs).
  • Products: valenar/world-<seed>.json, valenar/world-<seed>.mesh.json. The schemas live under examples/map/mapv10/schema/valenar-world.schema.json and examples/map/mapv10/schema/valenar-world-mesh.schema.json.
  • Depends on: stages 6, 7, 8, 12, 14.

For the field-level export schema, unit ranges, validator surface, and content-hash construction, see ./export-contract.md.

Naming

Political naming runs as a dedicated stage immediately after political::generate, and therefore after biomes_materials, in the pipeline (pipeline.rs, political_naming::run). It walks every realm, province, and location in the polygon-stage Political product, samples the biome raster at each polygon's label anchor, and emits a procedurally generated per-biome name into the polygon's name field. The upstream political::generate deliberately leaves every name as String::new() so the naming stage is the single source of truth for political name strings.

Architecture

The namer follows a Stellaris-shaped per-region grammar (stages/naming.rs): each biome owns a stem bank and a suffix bank, and a deterministic splitmix64-mixed seed selects one of each plus an optional rare connector. Reference points: Civilization VI per-civ city name lists, CK3 per-culture authored token lists, EU4 / EU5 per-culture historical province names, Stellaris per-species and per-region token banks with format grammars, Old World's "uniqueness wins over naturalness" rule with ordinal disambiguators.

Composition grammar:

name = <stem> + [<connector>] + <suffix> [+ <disambiguator>]

Connectors (-of-, -on-, -by-) inject at ~30% probability per slot seeded by the namer mix. Disambiguators (" Lower", " Cross", " Hold", " Mark", " Watch", " Reach", " Old", " High") layer on only after the namer has exhausted re-rolls within the primary bank.

NamingContext

The namer is a pure function of NamingContext in generator/src/stages/naming.rs (see stages::naming::NamingContext). The struct carries the world seed, realm / province / location numeric IDs, the local location_index_within_province, the biome tag for the slot, and a reserved culture_tag: Option<&str> slot for future per-culture variation (currently always None). No global state and no non-determinism-prone hashing — every emission is reproducible from the same context.

Capacity guarantee

Each biome carries ≈80 stems × ≈40 suffixes ≈ 3,200 base candidates; across six biome tables that totals ≈19,200 base candidates. The namer's splitmix64 re-roll loop (BASE_REROLL_ATTEMPTS = 4096 per disambiguator layer) plus the connector multiplier (4 variants including "no connector") and disambiguator layers (1 base + 8 disambiguators = 9 layers) lifts the effective realm-scoped capacity to well over 160,000 distinct outputs. The namer_capacity_exceeds_continent_max_locations unit test in stages::naming::tests drives 160,000 synthetic contexts across 1,000 provinces × 6 biomes and asserts ≥ 160,000 unique strings.

Collision policy

On every emission the namer:

  1. Composes a candidate from the seed via compose_name.
  2. If the candidate collides with an already-emitted name in the realm-scoped BTreeSet, re-rolls the seed via splitmix64 up to BASE_REROLL_ATTEMPTS times within the same bank.
  3. If the primary bank cannot produce a unique candidate, the namer walks the DISAMBIGUATORS pool (8 entries), each layer running another BASE_REROLL_ATTEMPTS re-rolls of <base><disambiguator>.
  4. If every layer collides, the namer panic!s with the realm / province / location / biome context. There is no silent re-issue and no placeholder fallback.

This panic-on-exhaustion contract is the no-fallback semantic — the same one that governs every other required mapv10 artifact.
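The four-step policy can be sketched with toy banks. The constants BASE_REROLL_ATTEMPTS and the eight disambiguators are taken from the text; the three-stem and two-suffix banks, the way compose_name picks from them, and the emit_unique_name signature are hypothetical and much smaller than the real per-biome tables. Connectors are omitted.

```rust
use std::collections::BTreeSet;

const BASE_REROLL_ATTEMPTS: usize = 4096;
const DISAMBIGUATORS: [&str; 8] = [
    " Lower", " Cross", " Hold", " Mark", " Watch", " Reach", " Old", " High",
];
// Hypothetical toy banks; the real banks hold roughly 80 stems and 40 suffixes.
const STEMS: [&str; 3] = ["Ash", "Bram", "Cor"];
const SUFFIXES: [&str; 2] = ["ford", "wick"];

/// Standard splitmix64 finalizer.
fn splitmix64(mut z: u64) -> u64 {
    z = z.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Toy composition: low bits pick the stem, high bits pick the suffix.
fn compose_name(seed: u64) -> String {
    let stem = STEMS[(seed % STEMS.len() as u64) as usize];
    let suffix = SUFFIXES[((seed >> 32) % SUFFIXES.len() as u64) as usize];
    format!("{}{}", stem, suffix)
}

/// Emits a name not yet present in `used`, or panics when every layer collides.
fn emit_unique_name(seed: u64, used: &mut BTreeSet<String>) -> String {
    // Layer 0 is the bare base name; layers 1..=8 append a disambiguator.
    for layer in std::iter::once("").chain(DISAMBIGUATORS.iter().copied()) {
        let mut s = seed;
        for _ in 0..BASE_REROLL_ATTEMPTS {
            let candidate = format!("{}{}", compose_name(s), layer);
            if used.insert(candidate.clone()) {
                return candidate;
            }
            s = splitmix64(s); // re-roll within the same layer
        }
    }
    panic!("name bank exhausted for seed {}", seed);
}
```

With these toy banks there are only six base names, so a seventh emission into the same set falls through to the first disambiguator layer instead of reusing a name.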

Pipeline split

The polygon stage (political::generate) and the naming stage (political_naming::run) divide responsibility cleanly:

| Concern | Owner |
| --- | --- |
| Realm / province / location polygon shapes | political::generate |
| numeric_id, province_numeric_id, realm_numeric_id assignment | political::generate |
| province_id / location_id rasters | political::generate |
| neighbor_graph adjacency | political::generate |
| province_color_seeds LUT | political::generate |
| Realm / province / location name strings | political_naming::run |
| Per-biome routing of names via biome raster sample | political_naming::run |
| Realm-scoped collision detection and disambiguation | political_naming::run |
| Panic on disambiguator exhaustion | political_naming::run |

The order in pipeline.rs is biomes_materials::generate → political::generate → political_naming::run → validate_political_products. The polygon validator and every downstream consumer (map_features, valenar_worlddata, the viewer's LabelAnchors consumer) read names that already reflect the naming stage's output.

Noise Stack

All procedural noise in the generator routes through one primitive: rotated_gradient_fbm at noise.rs. Each octave is sampled with its own seed, seed ^ splitmix64((octave + 1).wrapping_mul(0x9e3779b97f4a7c15)), and the fBm sum is amplitude-normalized to [-1, +1] by dividing by the sum of the per-octave amplitudes. The lattice gradient noise itself selects one of 8 fixed unit gradients, (±1, 0), (0, ±1), or (±sqrt(2)/2, ±sqrt(2)/2), via lattice_hash(seed, ix, iy) and uses the quintic fade 6t^5 - 15t^4 + 10t^3 (noise.rs).
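A minimal sketch of the pieces named above: the eight fixed gradients, the quintic fade, the per-octave seed, and the amplitude-normalized sum. Several parts are assumptions rather than the contents of noise.rs: the lattice_hash body, the Octave struct, mapping a wavelength to a sample frequency by dividing the position by wavelength_km, and the splitmix64 body (the standard SplitMix64 step). The per-octave rotation is deliberately omitted, so this is a plain fBm, not rotated_gradient_fbm.

```rust
use std::f64::consts::FRAC_1_SQRT_2 as D;

// The eight fixed unit gradients: four axis-aligned, four diagonal.
const GRADIENTS: [(f64, f64); 8] = [
    (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0),
    (D, D), (-D, D), (D, -D), (-D, -D),
];

// Hypothetical octave record; values below follow the detail-octave description.
struct Octave { wavelength_km: f64, amplitude: f64 }
const DETAIL: [Octave; 3] = [
    Octave { wavelength_km: 140.0, amplitude: 1.00 },
    Octave { wavelength_km: 74.0, amplitude: 0.46 },
    Octave { wavelength_km: 38.0, amplitude: 0.19 },
];

/// Standard SplitMix64 step (assumed to match the generator's helper).
fn splitmix64(x: u64) -> u64 {
    let mut z = x.wrapping_add(0x9e37_79b9_7f4a_7c15);
    z = (z ^ (z >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);
    z ^ (z >> 31)
}

/// Hypothetical lattice hash; only the signature comes from the text.
fn lattice_hash(seed: u64, ix: i64, iy: i64) -> u64 {
    splitmix64(seed ^ splitmix64((ix as u64) ^ splitmix64(iy as u64)))
}

/// Quintic fade 6t^5 - 15t^4 + 10t^3 in Horner form.
fn fade(t: f64) -> f64 { t * t * t * (t * (t * 6.0 - 15.0) + 10.0) }

fn lerp(a: f64, b: f64, t: f64) -> f64 { a + t * (b - a) }

fn gradient_noise_2d(seed: u64, x: f64, y: f64) -> f64 {
    let (x0, y0) = (x.floor(), y.floor());
    let (ix, iy) = (x0 as i64, y0 as i64);
    let (fx, fy) = (x - x0, y - y0);
    // Dot product of a corner's gradient with the offset from that corner.
    let dot = |cx: i64, cy: i64, dx: f64, dy: f64| {
        let (gx, gy) = GRADIENTS[(lattice_hash(seed, cx, cy) % 8) as usize];
        gx * dx + gy * dy
    };
    let n00 = dot(ix, iy, fx, fy);
    let n10 = dot(ix + 1, iy, fx - 1.0, fy);
    let n01 = dot(ix, iy + 1, fx, fy - 1.0);
    let n11 = dot(ix + 1, iy + 1, fx - 1.0, fy - 1.0);
    let (u, v) = (fade(fx), fade(fy));
    lerp(lerp(n00, n10, u), lerp(n01, n11, u), v)
}

/// Amplitude-normalized sum over octaves, each with its own derived seed.
fn fbm(seed: u64, x_km: f64, y_km: f64, octaves: &[Octave]) -> f64 {
    let (mut sum, mut amplitude_sum) = (0.0, 0.0);
    for (i, o) in octaves.iter().enumerate() {
        let octave_seed = seed ^ splitmix64((i as u64 + 1).wrapping_mul(0x9e37_79b9_7f4a_7c15));
        sum += o.amplitude
            * gradient_noise_2d(octave_seed, x_km / o.wavelength_km, y_km / o.wavelength_km);
        amplitude_sum += o.amplitude;
    }
    sum / amplitude_sum
}

fn main() {
    println!("{}", fbm(20260502, 512.0, 307.5, &DETAIL));
}
```

Dividing by the amplitude sum is what keeps the result inside [-1, +1] regardless of how many octaves a table carries, so callers can scale by a physical magnitude without re-deriving the range per table.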

Four named octave tables drive the procedural surfaces:

  • TERRAIN_MACRO_NOISE_OCTAVES — 4 octaves at 940 / 520 / 280 / 150 km wavelengths with amplitudes 1.00 / 0.58 / 0.31 / 0.16 (heightfield/mod.rs). Used by continental_macro_relief to shape the cross-continent uplift / lowland template.
  • TERRAIN_DETAIL_NOISE_OCTAVES — 3 octaves at 140 / 74 / 38 km wavelengths with amplitudes 1.00 / 0.46 / 0.19 (heightfield/mod.rs). Used as the per-cell detail relief that gets multiplied by 58 m on land and as the per-node ridge-peak jitter modulator.
  • RIDGE_JAGGED_NOISE_OCTAVES — 3 octaves at 360 / 190 / 96 km wavelengths with amplitudes 1.00 / 0.52 / 0.24 (heightfield/mod.rs). Used as the along-edge jagging factor for ridge influence.
  • ECOLOGICAL_NOISE_OCTAVES — 4 octaves at 320 / 170 / 92 / 48 km wavelengths with amplitudes 1.00 / 0.55 / 0.30 / 0.16 (biomes_materials.rs). Used by the biome and material classifier as the ecological variation field.

The per-octave rotation (cos_theta, sin_theta) is what makes the output non-axis-separable: a noise test (noise.rs) explicitly verifies that horizontal-band and vertical-band differences span more than 0.01 across a 24-sample sweep.

What the noise stack does NOT do: there is no domain warping anywhere in the stack — sample positions feed straight into gradient_noise_2d per octave with no displacement. There is no ridged-multifractal — no abs() on octave outputs, no 1 - abs(noise) ridge folding. The fBm is pure amplitude-normalized summation.

Coordinate and Precision Boundary

The generator works in f64 km horizontally and f64 metres vertically inside pipeline.rs::build_products and every stage. The hand-off to the renderer is at stages/meshes.rs::push_vertex (meshes.rs) where each MeshBuffer position is downcast to f32 after rebasing to the mesh asset's originKm:

self.positions.push([
    (vertex.world_x - self.tile.bounds_km.min_x) as f32,
    vertex.elevation_km as f32,
    (vertex.world_y - self.tile.bounds_km.min_y) as f32,
]);

Elevation is converted from metres to kilometres at the same write site by reading self.height[idx] as f64 * 0.001 into elevation_km (meshes.rs). Vertex positions are tile-local in kilometres on all three axes; the renderer reconstructs world-space by adding MeshAssetRef.originKm.
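The rebase-then-downcast step and its inverse on the renderer side can be sketched in isolation. The struct and function names below (OriginKm, Vertex, to_tile_local, to_world) are hypothetical and do not mirror the types in meshes.rs; only the arithmetic follows the text.

```rust
/// Hypothetical stand-in for the tile origin (MeshAssetRef.originKm).
struct OriginKm { min_x: f64, min_y: f64 }

/// Hypothetical generator-side vertex: f64 km horizontally, f32 metres vertically.
struct Vertex { world_x: f64, world_y: f64, elevation_m: f32 }

/// Rebase to the tile origin in f64, then downcast each axis to f32.
fn to_tile_local(v: &Vertex, origin: &OriginKm) -> [f32; 3] {
    let elevation_km = v.elevation_m as f64 * 0.001; // metres to kilometres
    [
        (v.world_x - origin.min_x) as f32,
        elevation_km as f32,
        (v.world_y - origin.min_y) as f32,
    ]
}

/// Renderer-side reconstruction: add the origin back in f64.
fn to_world(local: [f32; 3], origin: &OriginKm) -> [f64; 3] {
    [
        local[0] as f64 + origin.min_x,
        local[1] as f64,
        local[2] as f64 + origin.min_y,
    ]
}

fn main() {
    let origin = OriginKm { min_x: 2325.0, min_y: 1181.25 };
    let v = Vertex { world_x: 2362.5, world_y: 1200.25, elevation_m: 1500.0 };
    let local = to_tile_local(&v, &origin);
    println!("local = {:?}, world = {:?}", local, to_world(local, &origin));
}
```

Subtracting the origin before the cast is the important ordering: the f32 value then only has to represent a tile-local offset of a few tens of kilometres rather than a continent-scale coordinate.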

The f32 unit-roundoff budget at readiness.rs is 1 / 8_388_608 per kilometre with a max_allowed_roundoff_km = 0.005. The stage 13 readiness report folds local_bounds_km over every mesh asset to compute the maximum tile-local extent and verifies max_extent * (1 / 8_388_608) <= 0.005. For a continent-preset tile with worst-case local extent of roughly 38 km (continent raster spans 2 400 km / 64 tiles in x at z5 plus skirts), the reported f32 roundoff sits around 0.0000045 km — three orders of magnitude inside the budget.
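The budget check reduces to one multiplication and one comparison. The sketch below reproduces that arithmetic with the two constants quoted above; the function names and the shape of the bounds input are hypothetical and are not taken from readiness.rs.

```rust
// Constants as quoted in the text; names are hypothetical.
const F32_ROUNDOFF_PER_KM: f64 = 1.0 / 8_388_608.0; // 2^-23
const MAX_ALLOWED_ROUNDOFF_KM: f64 = 0.005;

/// Largest absolute tile-local coordinate over all mesh bounds.
/// Each entry is a hypothetical [x, y, z] extent in kilometres.
fn max_local_extent_km(bounds: &[[f64; 3]]) -> f64 {
    bounds.iter().flat_map(|b| b.iter().copied()).fold(0.0, f64::max)
}

fn roundoff_km(max_extent_km: f64) -> f64 {
    max_extent_km * F32_ROUNDOFF_PER_KM
}

fn within_budget(max_extent_km: f64) -> bool {
    roundoff_km(max_extent_km) <= MAX_ALLOWED_ROUNDOFF_KM
}

fn main() {
    let extent = max_local_extent_km(&[[37.5, 4.2, 37.5], [38.0, 3.9, 38.0]]);
    println!("extent {extent} km, roundoff {:e} km, ok = {}",
        roundoff_km(extent), within_budget(extent));
}
```

Under these constants the check would only start failing once a tile-local extent approaches roughly 41,900 km, so the budget is effectively a guard against meshes that were never rebased rather than a tight constraint on tile size.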

For the high-level coordinate convention (top-left origin, Y-down rows, kilometres horizontal, metres elevation), see architecture.md § Coordinate System And Units.

Determinism Contract

The generator is byte-deterministic for any fixed (scale_preset, seed, raster_width, raster_height, world_width_km, world_height_km) tuple within a single architecture and Rust toolchain. The contract is enforced by a registered test.

Seed flow. The CLI --seed becomes config.seed: u64 (config.rs). Each stage XORs the base seed with a fixed salt before drawing noise. Examples from the heightfield stage: detail relief uses config.seed ^ 0xa6d0_84c5_31d2_4ef1 (heightfield/mod.rs); shelf relief uses config.seed ^ 0x54f8_d80f_f56d_1135 (heightfield/mod.rs); macro relief uses config.seed ^ 0x25d3_8a91_501c_f4d7 (heightfield/mod.rs). The relaxed_seed_points helper in stage 6 uses 0x7b58_7f5a_9915_2089 for provinces and a per-province 0xa28d_6c2f_3e15_9c01 ^ province.numeric_id as u64 for Locations (political.rs seed-point relaxation calls).
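The salting pattern is a plain XOR, sketched below with the salt values quoted above. The constant and function names are hypothetical, and the Location helper reproduces only the salt expression given in the text; whether the base seed is folded in at that call site is not stated here.

```rust
// Salt values quoted in the text; constant names are hypothetical.
const DETAIL_RELIEF_SALT: u64 = 0xa6d0_84c5_31d2_4ef1;
const SHELF_RELIEF_SALT: u64 = 0x54f8_d80f_f56d_1135;
const MACRO_RELIEF_SALT: u64 = 0x25d3_8a91_501c_f4d7;
const LOCATION_SALT: u64 = 0xa28d_6c2f_3e15_9c01;

/// Derive an independent noise seed for one stage from the base seed.
fn stage_seed(base_seed: u64, salt: u64) -> u64 {
    base_seed ^ salt
}

/// Per-province salt for Location seed points, as written in the text.
fn location_salt(province_numeric_id: u32) -> u64 {
    LOCATION_SALT ^ province_numeric_id as u64
}

fn main() {
    let seed = 20260502u64; // default --seed
    println!("detail {:#018x}", stage_seed(seed, DETAIL_RELIEF_SALT));
    println!("shelf  {:#018x}", stage_seed(seed, SHELF_RELIEF_SALT));
    println!("macro  {:#018x}", stage_seed(seed, MACRO_RELIEF_SALT));
    println!("province 7 {:#018x}", location_salt(7));
}
```

Because each salt is a fixed constant, two stages never share a noise seed for the same base seed, and changing --seed shifts every stage together without any stage needing to know about the others.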

RNG sources. Two: splitmix64 (a fixed-constant 64-bit hash defined identically in noise.rs and political.rs) for hashing seeds and lattice coordinates, and halton(index, base) quasi-random sequences for blue-noise candidate placement (political.rs). No OS RNG, no rand crate, no floating-point clock reads.
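For reference, a Halton sequence value is the radical inverse of the index in the given base. The sketch below is the textbook construction; the integer types, the use of bases 2 and 3 for the two axes, and the candidate_km helper are assumptions, not details confirmed from political.rs.

```rust
/// Radical inverse of `index` in `base`: the textbook Halton construction.
fn halton(index: u64, base: u64) -> f64 {
    let mut fraction = 1.0;
    let mut result = 0.0;
    let mut i = index;
    while i > 0 {
        fraction /= base as f64;
        result += fraction * (i % base) as f64;
        i /= base;
    }
    result
}

/// Hypothetical helper: a 2D candidate point from bases 2 and 3.
fn candidate_km(index: u64, width_km: f64, height_km: f64) -> (f64, f64) {
    (halton(index, 2) * width_km, halton(index, 3) * height_km)
}

fn main() {
    for i in 1..=5 {
        println!("{:?}", candidate_km(i, 100.0, 100.0));
    }
}
```

The sequence depends only on the index and base, which is what makes it compatible with the determinism contract: there is no generator state to carry between calls.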

Parallelism and ordering. The only intra-stage parallelism is in stage 3 erosion. Each iteration uses par_chunks_mut(width).enumerate(): every parallel task writes to a disjoint output row from the immutable read-buffer view, so the parallel reduction is associative-by-construction, not order-dependent. Stage 6's neighbor graph collects edges into a BTreeSet<(String, String, String)> (political.rs) so the final Vec<NeighborEdge> ordering is the alphabetical ordering of the relation/A/B triples regardless of how the scan-order walk discovered them.
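The ordering guarantee from the BTreeSet can be shown with a small sketch. The relation string, the id-to-name map, the canonical orientation of each pair, and the ordered_edges name are all illustrative assumptions; only the BTreeSet<(String, String, String)> shape comes from the text.

```rust
use std::collections::{BTreeSet, HashMap};

/// Collect discovered adjacencies into a sorted, de-duplicated edge list.
/// The HashMap is used for keyed lookup only and is never iterated.
fn ordered_edges(
    discovered: &[(u32, u32)],
    ids: &HashMap<u32, String>,
) -> Vec<(String, String, String)> {
    let mut set = BTreeSet::new();
    for &(a, b) in discovered {
        // Indexing panics on a missing id, so a bad lookup cannot pass silently.
        let (name_a, name_b) = (&ids[&a], &ids[&b]);
        // Assumed canonical orientation so (a, b) and (b, a) collapse to one edge.
        let (lo, hi) = if name_a <= name_b { (name_a, name_b) } else { (name_b, name_a) };
        set.insert(("adjacent".to_string(), lo.clone(), hi.clone()));
    }
    // BTreeSet iterates in sorted order, independent of insertion order.
    set.into_iter().collect()
}

fn main() {
    let ids: HashMap<u32, String> = [(1, "loc-b"), (2, "loc-a"), (3, "loc-c")]
        .into_iter()
        .map(|(k, v)| (k, v.to_string()))
        .collect();
    for edge in ordered_edges(&[(3, 1), (1, 2), (2, 3)], &ids) {
        println!("{edge:?}");
    }
}
```

Feeding the same adjacencies in a different discovery order yields the same Vec, which is the property the scan-order walk relies on.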

Why byte-identical. Every value written to a product is an f32/f64/u8/u32 derived deterministically from the (seed, config, position) triple. No timestamps appear in any product (the generation timestamp lives only in manifest.json, written by write_run_envelope outside build_products).

Determinism test. same_config_produces_same_core_product_bytes at pipeline.rs builds two runs with identical GeneratorConfig::default() into separate temp directories and asserts that the on-disk byte length and content match exactly for each of: config, height, slope, normal, sediment, flowAccumulation, valenarWorld, valenarWorldMesh.

Potential break points. The determinism test asserts byte identity within a single in-process build on a single machine; it does not assert byte identity across CPU architectures or Rust toolchain versions, so a binary built with a different LLVM floating-point fast-math setting could in principle produce different f32 cells. Inside add_raster_adjacency_edges and add_adjacency_edge (political.rs) the lookup uses HashMap<u32, String>; this is currently safe because the iterator that consumes the graph reads BTreeSet edges (not the HashMap itself) and the HashMap is only used for keyed lookup, but a future code change that consumed ids.iter() for output ordering would silently break determinism.

Limitations

The generator works end to end for the four scale presets and emits schema-valid products. The list below records substantive simplifying choices that limit what can be inferred from the outputs. Every item is technical debt.

  1. River network is hand-authored. Stage 4 emits 7 fixed river nodes, 5 fixed edges, and straight-segment centerlines from world-fraction positions and ridge-graph node indices (water.rs). Stage 3 emits flowAccumulation.f32.bin as a diagnostic raster, but no stage consumes it. Connecting the actual residual water column from the pipe-model erosion to a D8 watershed router and emitting an inferred river graph is the natural next step but it is not yet wired.
  2. Basin positions are hardcoded fractions. The two basins are placed at (W*0.29, H*0.63) and (W*0.68, H*0.72) (geography_graph.rs); the seed has no effect on basin placement, only on the noise that fills around them. A real basin discovery pass would scan the eroded heightfield for closed depressions.
  3. Continent outline is parametric, not noise- or tectonic-driven. Stage 1 uses a single ellipse plus three sinusoidal lobe harmonics modulated by sin(seed as f64) (continent.rs). The outline barely changes between seeds. No noise displacement, no plate-boundary simulation, no archipelago support beyond the single closed land polygon.
  4. Voronoi clipping is O(N^2 * V) across stage 6. The half-plane clipper at political.rs walks every other seed for every cell and re-clips the cell ring against each bisector, so each cell costs O(N * V) for N seeds and V ring vertices. For the continent preset (~7 000 Locations across ~1 000 provinces) this is the dominant cost of stage 6 by a wide margin. A Fortune sweep-line pass (Fortune 1986) is the canonical replacement.
  5. Lloyd relaxation runs once. Stage 6 calls relaxed_seed_points with iterations = 1 (political.rs seed-point relaxation calls). One Lloyd pass leaves visible anisotropy in cell shapes; production-quality Voronoi atlases typically run 3-8.
  6. No domain warping in the noise stack. Sample positions feed straight into gradient_noise_2d (noise.rs). No noise(x + n(x,y), y + n(x,y)) substitution that would break the octave-aligned grid look.
  7. No ridged-multifractal. rotated_gradient_fbm sums signed octaves; there is no abs() step that would create the sharp ridge-line look characteristic of ridged terrain. The "ridges" in mapv10 come from the explicit ridge graph in stage 2, not from a ridged-noise step.
  8. No mesh simplification or LOD-adaptive geometry. Terrain meshes are uniform grids per tile at the physical LOD's sample_step (tile_pyramid.rs). There is no quadtree subdivision, no error-driven simplification (Quadric Error Metrics, Garland-Heckbert 1997), no chunked LOD with stitched borders.
  9. No T-junction resolution between adjacent LOD tiles. The stage 11 BORDER_CELLS = 1 skirt resolves bilinear texture-sampling seams (tile_pyramid.rs) but it does not address geometry skirts at LOD boundaries. Where a terrain mesh at one z rung sits next to a mesh at the neighbouring rung, the shared edge will show T-junctions; production engines emit explicit skirt geometry or stitch quads.
  10. Map features and label anchors are deterministic but not placement-quality-aware. Stage 8 emits one anchor per entity with zoom-band metadata, but it does not resolve label collision, along-spline label placement, or anchor displacement around rendered geometry — that work lives in the renderer and the runtime label system, not the generator.
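To make the cost in item 4 concrete, the sketch below clips a convex ring against the perpendicular bisector between two seeds and repeats that for every other seed. It is a Sutherland-Hodgman-style illustration written for this document, with hypothetical names throughout; it is not the clipper in political.rs.

```rust
type P = (f64, f64);

/// Keep the part of a convex ring that is at least as close to `site` as to `other`.
fn clip_to_bisector(ring: &[P], site: P, other: P) -> Vec<P> {
    let (nx, ny) = (other.0 - site.0, other.1 - site.1);
    let (mx, my) = ((site.0 + other.0) * 0.5, (site.1 + other.1) * 0.5);
    // Signed side of the bisector: <= 0 is the `site` half-plane.
    let side = |p: P| (p.0 - mx) * nx + (p.1 - my) * ny;
    let mut out = Vec::with_capacity(ring.len() + 1);
    for i in 0..ring.len() {
        let (a, b) = (ring[i], ring[(i + 1) % ring.len()]);
        let (sa, sb) = (side(a), side(b));
        if sa <= 0.0 {
            out.push(a);
        }
        // The edge strictly crosses the bisector: emit the intersection point.
        if (sa < 0.0 && sb > 0.0) || (sa > 0.0 && sb < 0.0) {
            let t = sa / (sa - sb);
            out.push((a.0 + t * (b.0 - a.0), a.1 + t * (b.1 - a.1)));
        }
    }
    out
}

/// One cell: N - 1 clips of O(V) each. All cells together: O(N^2 * V).
fn voronoi_cell(bounds: &[P], sites: &[P], index: usize) -> Vec<P> {
    let mut ring = bounds.to_vec();
    for (j, &other) in sites.iter().enumerate() {
        if j != index && !ring.is_empty() {
            ring = clip_to_bisector(&ring, sites[index], other);
        }
    }
    ring
}

/// Shoelace area of a simple ring.
fn area(ring: &[P]) -> f64 {
    let n = ring.len();
    let twice: f64 = (0..n)
        .map(|i| ring[i].0 * ring[(i + 1) % n].1 - ring[(i + 1) % n].0 * ring[i].1)
        .sum();
    twice.abs() * 0.5
}

fn main() {
    let square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)];
    let sites = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)];
    for i in 0..sites.len() {
        let cell = voronoi_cell(&square, &sites, i);
        println!("cell {i}: {} vertices, area {}", cell.len(), area(&cell));
    }
}
```

The inner loop in voronoi_cell is where the quadratic term comes from: every cell visits every other seed even though only a handful of neighbours actually shape its boundary, which is the work a sweep-line or spatially indexed approach avoids.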

Source / Symbol Map

| Topic | File | Symbol / section |
| --- | --- | --- |
| CLI entry, output-path resolution | generator/src/main.rs | main, parse_args, prepare_output |
| GeneratorConfig, scale-preset table | generator/src/config.rs | GeneratorConfig, ScalePreset, scale_preset_config |
| Pipeline orchestrator | generator/src/pipeline.rs | build_products |
| StageWrite + add_stage | generator/src/pipeline.rs | StageWrite, add_stage |
| Determinism test | generator/src/pipeline.rs | same_config_produces_same_core_product_bytes |
| Continent outline | generator/src/stages/continent.rs | generate_continent and helpers |
| Ridge graph + basins | generator/src/stages/geography_graph.rs | generate_geography_graph and basin helpers |
| Heightfield noise stack | generator/src/stages/heightfield/mod.rs | octave tables and generate_heightfield |
| Erosion physics | generator/src/stages/heightfield/erosion.rs | ErosionParams, erode_heightfield |
| Water | generator/src/stages/water.rs | generate_water |
| Biomes / materials | generator/src/stages/biomes_materials.rs | ecology octave table and generate_biomes_materials |
| Political (Voronoi + adjacency) | generator/src/stages/political.rs | seed relaxation, Voronoi clipping, adjacency builders |
| Procedural namer (per-biome banks, grammar, disambiguation) | generator/src/stages/naming.rs | NamingContext, BiomeNameTable, LocationNamer, ProvinceNamer, RealmNamer, compose_name |
| Political naming stage (biome sampling, name fill) | generator/src/stages/political_naming.rs | run, sample_biome_at |
| Routes | generator/src/stages/routes.rs | generate_routes |
| Map features and labels | generator/src/stages/map_features.rs | generate_map_features, generate_labels |
| Borders 1+JFA | generator/src/stages/borders_sdf.rs | generate_border_sdf, JFA passes |
| Tile pyramid + Toksvig | generator/src/stages/tile_pyramid.rs | tile constants, tile slicing, Toksvig prefilter |
| Meshes (originKm, m to km) | generator/src/stages/meshes.rs | mesh manifest structs, vertex write path, route ribbon build |
| Readiness budgets | generator/src/stages/readiness.rs | readiness budget constants and report builder |
| Previews | generator/src/stages/previews.rs | preview writer dispatch |
| Valenar export | generator/src/stages/valenar_worlddata.rs | file-name helpers and generate projection |
| Noise primitive | generator/src/noise.rs | rotated_gradient_fbm, splitmix64 |