vault admin_change can corrupt records if finalize fails mid-rotation

## Problem

`VaultBackendImpl::admin_change` (`crates/mqdb-vault/src/backend.rs:540-648`) rotates the passphrase in three writes:

1. `migration_start` — install new salt + new check, plus old check/salt as resume markers, set `vault_migration_status: pending, vault_migration_mode: re_encrypt`.
2. `batch_vault_re_encrypt` — decrypt each record with `old_crypto`, re-encrypt with `new_crypto`, write back.
3. `migration_done` — clear `vault_migration_status`, `vault_migration_mode`, `vault_old_check`, `vault_old_salt`.

After PR #65 surfaces step-3 failures, a real corruption path becomes visible: if step 2 succeeds and step 3 fails, the identity is left with `status: pending, mode: re_encrypt`, both checks present, and records encrypted under the **new** passphrase.

On the user's next unlock with the new passphrase, `resume_pending_migration` (`crates/mqdb-vault/src/ops.rs:122-196`) fires:

```rust
let old_crypto = VaultCrypto::derive(passphrase, &old_salt);  // passphrase is now the NEW one
batch_vault_re_encrypt(db, ownership, id, &old_crypto, crypto).await;
```

`old_crypto` here is derived from the **new** passphrase plus the **old** salt — it cannot decrypt records that are already encrypted with `new_crypto`. The decrypt step produces garbage, then encrypts that garbage with `new_crypto` and writes it back. **Data corruption.**

`admin_disable` has the same structural risk: if the post-batch identity update (`backend.rs:519-526`) fails after `batch_vault_operation` has decrypted records, identity still reports `vault_enabled: true` while records are plaintext. Next unlock will derive a key, `verify_check_token` succeeds against the stale `vault_check`, and reads will treat plaintext as ciphertext.

## Reproduction

Inject a failure on the `migration_done` `update_entity` call (e.g., DB transient error or `Database::update` returning `Err(Internal)`) after `batch_vault_re_encrypt` completes. The user then unlocks normally with the new passphrase. Verify subsequent reads of owned records return garbage.

## Proposed fix

The resume path assumes the passphrase passed to it can decrypt records under \`old_salt\`. That is only true while \`status == pending\` *and* the user is supplying the old passphrase. Possible approaches:

1. **Atomic finalize**: write \`migration_done\` *before* the last record is re-encrypted is unsafe; instead, make migration_done idempotent and retry it under a bounded loop before returning success. If retry exhausts, return Internal and leave the records re-encrypted but the markers stale — the next unlock with the **new** passphrase should then *detect* this state (records re-encrypted, status pending) and just clear the markers, not run the resume.
2. **Detect-and-skip**: on resume, attempt to decrypt a probe record with \`old_crypto\` before calling \`batch_vault_re_encrypt\`. If decrypt fails the integrity check, skip the resume and clear markers.
3. **Two-phase commit on the identity**: split the markers so \`migration_done\` only clears \`vault_migration_status\`, but \`vault_old_check\`/\`vault_old_salt\` are cleared in a separate idempotent step. The resume path can then use \`vault_old_check\` to *verify* that the supplied passphrase still matches the old salt before running re-encrypt.

Approach (2) is the cheapest and probably correct: \`VaultCrypto::verify_check_token\` on \`vault_old_check\` with the supplied passphrase + old_salt would tell us whether the resume is legitimate. If not, we're in the half-finalized state — just clear the markers.

## Scope

- Identified during review of PR #65 (https://github.com/LabOverWire/MQDB/pull/65).
- Predates #65; this PR only made the failure observable (was previously silent).
- Same shape exists in \`admin_disable\` finalize step.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

vault admin_change can corrupt records if finalize fails mid-rotation #66

Problem

Reproduction

Proposed fix

Scope

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

vault admin_change can corrupt records if finalize fails mid-rotation #66

Description

Problem

Reproduction

Proposed fix

Scope

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions