12 changes: 6 additions & 6 deletions docs/advanced/snapshot-restore.md
@@ -25,10 +25,10 @@ Typical uses:

The same entry points serve two storage modes — in-memory for fast
intra-run stash, and on-disk for persistent restart / postprocessing.
You pick by giving (or not giving) a ``file=``. The existing
``mesh.write_timestep()`` / ``read_timestep()`` and
``mesh.write_checkpoint()`` / ``MeshVariable.read_checkpoint()``
paths are unchanged and continue to serve their existing roles (see
You pick by giving (or not giving) a ``file=``. Mesh-variable output uses
``mesh.write_timestep()`` with explicit payload flags; reload uses either
``MeshVariable.read_timestep()`` for coordinate/KDTree remap or
``MeshVariable.read_checkpoint()`` for PETSc-native reload (see
"Choosing between paths" at the bottom).

## The API
@@ -245,8 +245,8 @@ classes. For a Python-side summary use
|---|---|
| Backtrack a few steps inside a running script (RK staging, adaptive Δt, predictor–corrector probes) | ``save_state()`` → token |
| Persist whole-model state across runs (crash recovery, bisection studies, full restart) | ``save_state(file=…)`` / ``load_state(file=…)`` |
| Restart from a previous run on a *different* rank count or remap onto a *different* resolution | ``mesh.write_timestep()`` / ``MeshVariable.read_timestep()`` (KDTree remap) |
| Efficient same-rank restart writing only specific variables for postprocessing | ``mesh.write_checkpoint()`` / ``MeshVariable.read_checkpoint()`` (PETSc DMPlex per-variable) |
| Restart from a previous run on a *different* rank count or remap onto a *different* resolution | ``mesh.write_timestep(...)`` / ``MeshVariable.read_timestep()`` (coordinate/KDTree remap) |
| Efficient same-rank restart writing only specific variables for postprocessing | ``mesh.write_timestep(petsc_reload=True)`` / ``MeshVariable.read_checkpoint()`` (PETSc DMPlex per-variable) |
| Visualisation for ParaView (XDMF + per-step HDF5) | ``mesh.write_timestep(create_xdmf=True)`` |

## Related
61 changes: 30 additions & 31 deletions docs/developer/design/in_memory_checkpoint_design.md
@@ -49,33 +49,33 @@ benefits:
paths, which surfaces gaps and bugs that disk-only checkpointing
exposes only at quarterly-test cadence.

## Two checkpoint paths today (audit-confirmed)

The audit (read-only investigation, current `main`) confirmed two
distinct on-disk paths in `src/underworld3/`:

**Path A — PETScSection-based (native checkpoint).**
- `Mesh.write_checkpoint(...)` at `discretisation/discretisation_mesh.py:1892–1953`.
- Uses `dm.sectionView(viewer, subdm)` and `dm.globalVectorView(viewer, subdm, var._gvec)`
through a `PETSc.ViewerHDF5()`.
- Captures mesh topology, deformed coordinates, MV DOF values, and
swarm-variable values via the `_meshVar` proxy.
- Lower-level; bound to the producing DM.

**Path B — write_timestep (visualization-oriented).**
- `Mesh.write_timestep(...)` at `discretisation/discretisation_mesh.py:1750–1830`
and `Swarm.write_timestep(...)` at `swarm.py:3726–3810`.
- Per-variable HDF5 + XDMF for ParaView; mixes `PETSc.ViewerHDF5()`
with direct `h5py` writes (`swarm.py:1772–1850`).
- More flexible — can be re-loaded at different resolution /
decomposition; bulkier; the user-facing visualisation pipeline.

**For in-memory snapshot, Path A is the conceptual model.** Restore
goes back to the same DM, so the resolution/decomposition flexibility
of Path B is unneeded. But the in-memory backend will not actually use
the HDF5 Viewer — it will copy section structure + global-vector data
directly into numpy arrays. Same conceptual capture, different
mechanism.
## Two mesh-variable output paths

This design note predates the unified timestep writer. The current public API
keeps the same two reload semantics, but exposes new output through
`Mesh.write_timestep(...)`:

**Coordinate-remap path.**
- Write with `Mesh.write_timestep(...)`.
- Read selected variables with `MeshVariable.read_timestep(...)`.
- Uses `/fields` coordinate/value datasets and coordinate/KDTree remapping.
- Can load data onto a different mesh or MPI decomposition.

**PETSc-native reload path.**
- Write with `Mesh.write_timestep(..., petsc_reload=True)`.
- Read selected variables with `MeshVariable.read_checkpoint(...)`.
- Uses PETSc DMPlex topology, section, vector, and `PetscSF` metadata.
- Intended for exact same-mesh finite-element vector reload.

`Mesh.write_checkpoint(...)` is retained as a compatibility wrapper for older
scripts. New code should use `Mesh.write_timestep(..., petsc_reload=True)` for
PETSc-native reload output.

**For in-memory snapshot, the PETSc-native path is the conceptual model.**
Restore goes back to the same DM, so the resolution/decomposition flexibility
of coordinate remap is unneeded. But the in-memory backend does not use the
HDF5 Viewer — it copies section structure and vector data directly into numpy
arrays. Same conceptual capture, different mechanism.

## What state must be captured

@@ -352,7 +352,7 @@ section.
In rough dependency order:

1. **Backend abstraction layer.** Extract the "capture" logic from the
"store" logic in `Mesh.write_checkpoint()`. Introduce a
output-storage logic used for PETSc-native reload. Introduce a
`CheckpointBackend` protocol (e.g., `save_section`, `save_vector`,
`save_metadata`, `load_section`, `load_vector`, `load_metadata`).
Shaped from day one to support both backends — the in-memory case
@@ -365,9 +365,8 @@ In rough dependency order:
- `InMemoryBackend` — dict of numpy arrays. Trivial once the
abstraction exists.
- `OnDiskFullStateBackend` — single monolithic HDF5 file. Shares
PETSc-state serialisation with the existing `write_checkpoint`
path (already HDF5); adds Python-state serialisation as HDF5
attributes/groups.
PETSc-state serialisation with the existing PETSc-native reload path
(already HDF5); adds Python-state serialisation as HDF5 attributes/groups.
3. **Adopt the serialisation contract for new solver-internal code.**
Decision: option (C) — state as first-class dataclass (see "General
serialisation contract" section). Any new algorithm-helper class
85 changes: 49 additions & 36 deletions docs/developer/design/petsc-dmplex-checkpoint-reload-plan.md
@@ -15,27 +15,30 @@ the failure mode is already known.

## Objective

Provide an exact PETSc DMPlex checkpoint reload path for UW3 mesh variables.
This path is intended for restart and large-scale postprocessing, not
visualisation.
Provide an exact PETSc DMPlex reload path for UW3 mesh variables. This path is
intended for restart and large-scale postprocessing. It is now exposed through
the standard `write_timestep(..., petsc_reload=True)` output method; legacy
`write_checkpoint()` calls remain supported as a compatibility wrapper.

Target workflow:

```python
mesh.write_checkpoint(
mesh.write_timestep(
"checkout",
index=0,
outputPath=str(output_dir),
meshVars=[v_soln, p_soln],
index=0,
create_xdmf=False,
petsc_reload=True,
)
```

Default output:

```text
checkout.mesh.00000.h5
checkout.Velocity.00000.h5
checkout.Pressure.00000.h5
checkout.mesh.Velocity.00000.h5
checkout.mesh.Pressure.00000.h5
```

Reload workflow:
@@ -45,46 +48,58 @@ mesh = uw.discretisation.Mesh("checkout.mesh.00000.h5")
v_soln = uw.discretisation.MeshVariable("Velocity", mesh, mesh.dim, degree=2)
p_soln = uw.discretisation.MeshVariable("Pressure", mesh, 1, degree=1)

v_soln.read_checkpoint("checkout.Velocity.00000.h5", data_name="Velocity")
p_soln.read_checkpoint("checkout.Pressure.00000.h5", data_name="Pressure")
v_soln.read_checkpoint("checkout.mesh.Velocity.00000.h5", data_name="Velocity")
p_soln.read_checkpoint("checkout.mesh.Pressure.00000.h5", data_name="Pressure")
```

The reload path must not use `KDTree` remapping. It must restore FE data through
PETSc DMPlex topology, section, vector, and `PetscSF` metadata.

## Existing Output Methods
## Output Payloads

UW3 has two related but different output paths.
`mesh.write_timestep(...)` is the standard output method. It can write either
or both output payloads.

| Method | Purpose | Reload method | Strength | Limitation |
| Payload | Writer option | Reload method | Strength | Limitation |
| --- | --- | --- | --- | --- |
| `mesh.write_timestep(...)` | Visualisation and flexible field remap | `MeshVariable.read_timestep(...)` | Writes XDMF and vertex-field data; can map data onto a different mesh | Uses coordinate/KDTree remapping; memory-heavy for large meshes and high MPI counts |
| `mesh.write_checkpoint(...)` | Restart and exact postprocessing | `MeshVariable.read_checkpoint(...)` | Uses PETSc DMPlex section/vector metadata; avoids KDTree | Not a visualisation output; no XDMF or vertex-field datasets |
| XDMF/remap | `create_xdmf=True` | `MeshVariable.read_timestep(...)` | Writes XDMF and vertex-field data; can map data onto a different mesh | Uses coordinate/KDTree remapping; memory-heavy for large meshes and high MPI counts |
| PETSc reload | `petsc_reload=True` | `MeshVariable.read_checkpoint(...)` | Uses PETSc DMPlex section/vector metadata; avoids KDTree | Requires compatible PETSc mesh metadata |

The benchmark scripts should use `write_timestep()` when they need
visualisation files, and `write_checkpoint()` when they need restart-safe,
memory-efficient postprocessing.
For unified output, use `write_timestep(..., create_xdmf=True,
petsc_reload=True)`. For restart-safe, memory-efficient postprocessing without
XDMF, use `write_timestep(..., create_xdmf=False, petsc_reload=True)`.

## Implemented Design

### Checkpoint Writing
### PETSc Reload Writing

`Mesh.write_checkpoint(...)` writes PETSc DMPlex HDF5 storage version `3.0.0`.
`Mesh.write_timestep(..., petsc_reload=True)` writes PETSc DMPlex reload
metadata into the standard per-variable timestep HDF5 files. The legacy
`Mesh.write_checkpoint(...)` compatibility wrapper also writes PETSc DMPlex
reload metadata, but uses the older checkpoint filename layout.

The mesh file is named:

```text
<base>.mesh.<index>.h5
```

With the default `separate_variable_files=True`, each mesh variable is written
to its own checkpoint file:
With `write_timestep(..., petsc_reload=True)`, each mesh variable is written to
its own timestep-layout file:

```text
<base>.mesh.<variable>.<index>.h5
```

The legacy `write_checkpoint(...)` wrapper writes one file per variable by
default:

```text
<base>.<variable>.<index>.h5
```
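
The filename layouts described in this section can be summarised with a small helper. This is only an illustrative sketch, not part of UW3: the function name `reload_filenames` and the `layout` labels are hypothetical, and the five-digit zero-padded index is an assumption inferred from the `checkout.mesh.00000.h5` example.

```python
def reload_filenames(base, variable, index, layout="timestep"):
    """Return (mesh_file, variable_file) for one of the documented layouts.

    Hypothetical helper for illustration only; it builds strings and does
    not call any UW3 or PETSc API.
    """
    # Assumption: the index is zero-padded to five digits, as in the
    # example name "checkout.mesh.00000.h5".
    idx = f"{index:05d}"
    mesh_file = f"{base}.mesh.{idx}.h5"

    if layout == "timestep":
        # write_timestep(..., petsc_reload=True): <base>.mesh.<variable>.<index>.h5
        variable_file = f"{base}.mesh.{variable}.{idx}.h5"
    elif layout == "legacy":
        # legacy write_checkpoint(...): <base>.<variable>.<index>.h5
        variable_file = f"{base}.{variable}.{idx}.h5"
    elif layout == "legacy-combined":
        # legacy write_checkpoint(..., separate_variable_files=False):
        # all variables share <base>.checkpoint.<index>.h5
        variable_file = f"{base}.checkpoint.{idx}.h5"
    else:
        raise ValueError(f"unknown layout: {layout!r}")

    return mesh_file, variable_file


if __name__ == "__main__":
    for layout in ("timestep", "legacy", "legacy-combined"):
        print(layout, reload_filenames("checkout", "Velocity", 0, layout))
```

A helper like this may be useful in postprocessing scripts that need to locate files written under either the timestep layout or the legacy checkpoint layout.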

With `separate_variable_files=False`, all variables are written into:
With the legacy `write_checkpoint(..., separate_variable_files=False)` mode,
all variables are written into:

```text
<base>.checkpoint.<index>.h5
@@ -174,11 +189,11 @@ mpirun -np 2 ./uw python -m pytest \
## Spherical Benchmark Validation

The motivating case is spherical benchmark postprocessing at high MPI counts.
The old `write_timestep()` / `read_timestep()` path can build large KDTree
mapping structures during reload. At `1/128` this used nearly the full 4.5 TB
The coordinate-remap `read_timestep()` path can build large KDTree mapping
structures during reload. At `1/128` this used nearly the full 4.5 TB
allocation on Gadi.

The checkpoint method avoids KDTree reload and preserves velocity/pressure
The PETSc reload method avoids KDTree reload and preserves velocity/pressure
metrics to roundoff. Boundary stress metrics require the benchmark to recover
stress consistently after reload. In the spherical benchmark this is handled by
projecting the six deviatoric-stress components and then forming `sigma_rr`.
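
One way to quantify "to roundoff" is the relative difference between the two reload paths. The sketch below uses the `v_l2_norm` and `p_l2_norm` values reported in the metric tables of this document; the helper name `relative_difference` and the dictionary layout are illustrative choices, not part of the benchmark scripts.

```python
# Norms copied from the metric tables in this document.
# Each pair is (coordinate-remap reload, PETSc reload).
metrics = {
    ("1/128", "v_l2_norm"): (1.4319274480265082e-06, 1.4319274480231255e-06),
    ("1/128", "p_l2_norm"): (5.985841567394967e-04, 5.985841567395382e-04),
    ("1/64", "v_l2_norm"): (1.1662200663950889e-05, 1.1662200663957042e-05),
    ("1/64", "p_l2_norm"): (2.7573367818459473e-03, 2.7573367818460497e-03),
}


def relative_difference(a, b):
    """Symmetric relative difference between two nonzero values."""
    return abs(a - b) / max(abs(a), abs(b))


rel_diffs = {key: relative_difference(*pair) for key, pair in metrics.items()}

for (resolution, name), rel in rel_diffs.items():
    print(f"{resolution} {name}: relative difference = {rel:.2e}")
```

For these four values the relative differences all fall below `1e-11`, which is the sense in which the two paths agree on velocity and pressure norms.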
@@ -187,10 +202,10 @@ projecting the six deviatoric-stress components and then forming `sigma_rr`.

| Resolution | Method | NCPUs | Walltime | Memory used | Status |
| --- | --- | ---: | ---: | ---: | --- |
| `1/64` | `write_timestep/read_timestep` | 144 | `00:03:43` | `211.27 GB` | completed |
| `1/64` | `write_checkpoint/read_checkpoint` | 144 | `00:02:41` | `233.67 GB` | completed |
| `1/128` | `write_timestep/read_timestep` | 1152 | `00:13:55` | `3.92 TB` | completed near memory limit |
| `1/128` | `write_checkpoint/read_checkpoint` | 1152 | `00:03:57` | `1.83 TB` | completed |
| `1/64` | `write_timestep/read_timestep` remap | 144 | `00:03:43` | `211.27 GB` | completed |
| `1/64` | `write_timestep(petsc_reload=True)/read_checkpoint` | 144 | `00:02:41` | `233.67 GB` | completed |
| `1/128` | `write_timestep/read_timestep` remap | 1152 | `00:13:55` | `3.92 TB` | completed near memory limit |
| `1/128` | `write_timestep(petsc_reload=True)/read_checkpoint` | 1152 | `00:03:57` | `1.83 TB` | completed |

The `1/128` checkpoint reload reduced memory by about `2.09 TB` and walltime by
about `3.5x` for the postprocessing run.
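
The two summary figures follow directly from the `1/128` rows of the table above. A short check, with variable names chosen here purely for illustration:

```python
from datetime import timedelta


def parse_walltime(text):
    """Convert an 'HH:MM:SS' walltime string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# Values copied from the 1/128 rows of the resource table.
remap = {"walltime": parse_walltime("00:13:55"), "memory_tb": 3.92}
petsc_reload = {"walltime": parse_walltime("00:03:57"), "memory_tb": 1.83}

# Memory saved by the PETSc reload path, in TB.
memory_saved_tb = remap["memory_tb"] - petsc_reload["memory_tb"]

# Walltime ratio; dividing two timedeltas yields a plain float.
speedup = remap["walltime"] / petsc_reload["walltime"]

print(f"memory saved: {memory_saved_tb:.2f} TB")
print(f"walltime ratio: {speedup:.2f}x")
```

The walltime ratio is 835 s / 237 s, which rounds to the quoted `3.5x`.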
@@ -199,7 +214,7 @@ about `3.5x` for the postprocessing run.

`1/128` spherical Thieulot benchmark:

| Metric | `write_timestep/read_timestep` | `write_checkpoint/read_checkpoint` |
| Metric | `write_timestep/read_timestep` remap | `write_timestep(petsc_reload=True)/read_checkpoint` |
| --- | ---: | ---: |
| `v_l2_norm` | `1.4319274480265082e-06` | `1.4319274480231255e-06` |
| `p_l2_norm` | `5.985841567394967e-04` | `5.985841567395382e-04` |
@@ -216,7 +231,7 @@ projection after checkpoint reload.

`1/64` spherical Thieulot benchmark:

| Metric | `write_timestep/read_timestep` | `write_checkpoint/read_checkpoint` |
| Metric | `write_timestep/read_timestep` remap | `write_timestep(petsc_reload=True)/read_checkpoint` |
| --- | ---: | ---: |
| `v_l2_norm` | `1.1662200663950889e-05` | `1.1662200663957042e-05` |
| `p_l2_norm` | `2.7573367818459473e-03` | `2.7573367818460497e-03` |
@@ -236,11 +251,9 @@ projection after checkpoint reload.

The checkpoint reload implementation is ready for review when:

- `write_checkpoint()` writes PETSc DMPlex HDF5 storage version `3.0.0`.
- `write_checkpoint()` supports `outputPath`.
- `write_checkpoint()` defaults to one checkpoint file per variable.
- `write_checkpoint(..., separate_variable_files=False)` still supports a
combined variable checkpoint file.
- `write_timestep(..., petsc_reload=True)` writes PETSc DMPlex reload metadata.
- `write_timestep(..., petsc_reload=True)` supports `outputPath`.
- legacy `write_checkpoint()` remains available as a compatibility wrapper.
- `MeshVariable.read_checkpoint(...)` reloads through PETSc metadata, not
coordinate/KDTree remapping.
- scalar, vector, and discontinuous variables roundtrip in tests.