
feat(error): every runtime error carries line and source info#214

Merged
davydog187 merged 5 commits into main from fix/error-line-source-info
May 8, 2026

@davydog187

Threaded line/source info reaches every runtime error message

Plan: .agents/plans/A18-error-line-source-info.md

Goal

Every Lua runtime error a user sees should include the source file and
line number where the offending operation lives. Today the executor
threads line through every CPS dispatch, the compiler emits
{:source_line, _, _} markers, and the exception structs carry
:line / :source / :call_stack fields — but most raise sites in
the VM omit those fields, and the public Lua.RuntimeException wrapper
re-raises with only the formatted message string, dropping the
structured fields entirely.

Before

iex> Lua.eval!(Lua.new(), "local z = nil\nz()")
** (Lua.RuntimeException) Lua runtime error: ... attempt to call a nil value
   # message has no line, e.line is nil, e.source is nil

After

iex> Lua.eval!(Lua.new(), "local z = nil\nz()", source: "demo.lua")
** (Lua.RuntimeException) Lua runtime error: ... at demo.lua:1: attempt to call a nil value
   # e.line == 1, e.source == "demo.lua"

Success criteria

  • mix test passes (1577 tests, 51 properties, 52 doctests, 0 failures — count went up by 7 with the new tests).
  • mix test test/lua/error_messages_test.exs passes (18/18).
  • New tests: every arithmetic / concat / index / call / assert TypeError or AssertionError raised during execution has non-nil :line and matching :source.
  • New test: Lua.eval! with a source: opt threads that name to proto.source so errors say at script.lua:N:.
  • mix test --only lua53 still 5/29 (no regression).
  • Manual smoke: Lua.eval!(Lua.new(), "local z = nil\nz()", source: "demo.lua") produces a message containing at demo.lua:1:.

Changes

 .agents/plans/A18-error-line-source-info.md       | 212 ++++++++++++++++++++
 .agents/plans/A19-error-line-info-native-funcs.md |  90 +++++++++
 lib/lua.ex                                        |  33 ++--
 lib/lua/runtime_exception.ex                      |  21 +-
 lib/lua/vm/executor.ex                            | 231 +++++++++++++++++-----
 test/lua/error_messages_test.exs                  | 104 ++++++++++
 test/support/lua_test_case.ex                     |   6 +-
 7 files changed, 622 insertions(+), 75 deletions(-)

Implementation:

  • Lua.RuntimeException gains :line, :source, :call_stack fields and copies them off VM exceptions in the catchall exception/1 clause.
  • Lua.eval! accepts a source: option (default "<eval>") and forwards it to Lua.Compiler.compile/2.
  • The eval rescues now pass the original VM exception (not just its message) to Lua.RuntimeException so structured fields survive.
  • Inside the executor, a private with_context/4 wrapper catches TypeError/RuntimeError/AssertionError from helper calls and re-raises them with :line / :source / :call_stack filled in from the surrounding dispatch. Applied to arithmetic, bitwise, concat, compare, length, negate, table indexing, and native-function call dispatch.
  • test/support/lua_test_case.ex passes the suite filename as source: so triage gets at pm.lua:7: instead of <eval>:7:.
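
The first and third bullets (structured fields copied off the VM exception instead of only its message) can be sketched in isolation. This is a minimal illustration, not the library's code: `Sketch.TypeError` and `Sketch.RuntimeException` are hypothetical stand-ins, and the message format is simplified.

```elixir
defmodule Sketch.TypeError do
  # Hypothetical stand-in for a VM-level exception that already carries position data.
  defexception [:message, :line, :source, call_stack: []]
end

defmodule Sketch.RuntimeException do
  # Hypothetical stand-in for the public wrapper exception.
  defexception [:message, :line, :source, call_stack: []]

  # Catch-all clause for any exception struct: keep the formatted message
  # and copy the structured fields across when the inner exception has them.
  @impl true
  def exception(%{__exception__: true} = inner) do
    %__MODULE__{
      message: "Lua runtime error: " <> Exception.message(inner),
      line: Map.get(inner, :line),
      source: Map.get(inner, :source),
      call_stack: Map.get(inner, :call_stack) || []
    }
  end

  # Plain-string clause: no position data is available, so the fields stay nil.
  def exception(message) when is_binary(message) do
    %__MODULE__{message: "Lua runtime error: " <> message}
  end
end

inner =
  Sketch.TypeError.exception(
    message: "attempt to call a nil value",
    line: 1,
    source: "demo.lua"
  )

wrapped = Sketch.RuntimeException.exception(inner)
IO.inspect({wrapped.line, wrapped.source})
# {1, "demo.lua"}
```

Passing the exception struct rather than its message is what lets callers match on `e.line` and `e.source` instead of parsing the message text.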

Discoveries

Implementation chose strategy (b) from the plan — a single executor-level wrapper rather than per-helper signature changes. TCO on do_execute/8 is preserved (the wrap only guards helper calls, not the outer recursion).
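
A rough sketch of what an executor-level wrapper of this shape could look like, assuming a toy instruction list and a hypothetical `Sketch.VMError`. The real `with_context/4` signature and the `do_execute/8` loop are not shown in this description, so the argument order and names here are illustrative only.

```elixir
defmodule Sketch.VMError do
  # Hypothetical VM exception with optional position fields.
  defexception [:message, :line, :source, call_stack: []]
end

defmodule Sketch.Executor do
  # Runs a helper call and, if it raises a VM error, re-raises the same
  # error with line/source/call_stack filled in from the dispatch context.
  def with_context(line, source, call_stack, fun) when is_function(fun, 0) do
    fun.()
  rescue
    e in Sketch.VMError ->
      enriched = %{e | line: e.line || line, source: e.source || source, call_stack: call_stack}
      reraise enriched, __STACKTRACE__
  end

  # A helper that knows nothing about source positions.
  def add(a, b) when is_number(a) and is_number(b), do: a + b

  def add(_a, _b) do
    raise Sketch.VMError, message: "attempt to perform arithmetic on a non-number value"
  end

  # Toy dispatch loop. Only the helper call sits inside the wrapper;
  # the recursive call stays in tail position.
  def run([], acc, _source), do: acc

  def run([{:add, operand, line} | rest], acc, source) do
    acc = with_context(line, source, [], fn -> add(acc, operand) end)
    run(rest, acc, source)
  end
end

try do
  Sketch.Executor.run([{:add, 1, 1}, {:add, "x", 2}], 0, "demo.lua")
rescue
  e in Sketch.VMError -> IO.inspect({e.line, e.source})
  # {2, "demo.lua"}
end
```

The cost of this shape is one closure allocation and one `try` frame per wrapped helper call.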

Out-of-scope items surfaced during implementation:

  • Compiler source_line emission has off-by-ones in some cases. New smoke tests (local x = 1\nlocal s = "hello"\nprint(s * x)) report line 2 when the operation is on line 3. The line-tracking infrastructure works; the compiler emits its source_line markers a beat early. Logged for follow-up — not blocking, since "non-nil line" is what A18 commits to.
  • Stdlib raise sites (string.upper(nil), table.insert bad arg, etc.) still raise from helper paths that bypass the executor wrap. A19 (also added in this PR as a status: blocked plan) covers those.

Verification

mix format
mix compile --warnings-as-errors
mix test                  # 1577 / 0 failures
mix test --only lua53     # 5/29, unchanged

Out of scope (intentional)

  • assert(), error(), and other stdlib raises whose helpers don't have line in scope. Covered by A19 (drafted, blocked on this).
  • Improving call_stack content beyond what's already plumbed.
  • Changing the formatted error layout. We only ensure the data is populated.
  • Source-mapping past macro expansion or load() chunks.

davydog187 added 5 commits May 7, 2026 19:16
Adds A18 (line/source info on every runtime error) and the follow-up
A19 (native function raise sites). A18 is in-progress.

Lua.RuntimeException now exposes :line, :source, and :call_stack as
structured fields, so consumers can pattern-match on them instead of
string-scraping the formatted message. Lua.eval! threads a source:
option through to the compiler (default "<eval>") so error messages
say 'at script.lua:N:' instead of '-no-source-'.

Inside the executor, a single with_context/4 wrapper catches type and
runtime errors raised from helper functions (safe_add, concat_coerce,
table indexing, native function calls, etc.) and re-raises them with
:line / :source / :call_stack populated from the surrounding
dispatch context. The wrap is non-tail-position around helper calls
only — the outer do_execute/8 recursion remains a tail call, so
TCO is preserved.

The Lua 5.3 suite test runner now passes the test file basename as
source:, so suite triage gets 'at pm.lua:7:' instead of '<eval>:7:'.

Plan: A18

PR #214 opened. Plan now records discoveries, files touched, and the
out-of-scope items that surfaced during implementation (compiler
source_line off-by-one, stdlib helper raises deferred to A19).

…-call process dict

Initial A18 implementation used a with_context/4 closure wrapper around
every fallible opcode body, catching whatever the helpers raised and
re-attaching line/source via a re-raise. Benched at ~9% slowdown on
fib(30), which is unacceptable for a 1.0 library.

Replaced with a hybrid that runs at ~2-3% slowdown:

1. In-executor helpers (safe_add, safe_compare_lt, concat_coerce,
   to_integer!, index_value, etc.) take line/source as args. The
   metamethod-dispatch closures already captured args pre-A18, so
   adding two more captures is essentially free on the BEAM. The
   helpers' fast paths are unchanged.

2. The :call opcode's {:native_func, fun} dispatch (and call_value/5's
   native variant) stash the calling line/source in the process
   dictionary before invoking the callback. After the call returns
   (or raises), the previous values are restored. Native callbacks
   read via Lua.VM.Executor.current_position/0. This is the bridge
   for stdlib raises like assert/error that have no other way to
   know the calling Lua line.

3. Lua.VM.Executor.execute/5 saves and restores any prior process-dict
   position around the run, so re-entrancy (a callback that itself
   calls Lua.eval!) and isolation between sequential top-level calls
   both work without leaking source positions.

The :source_line opcode does NOT touch the process dict — that would
fire ~5M times in fib(30) and cost ~5%. The native-call boundary is
the only place Process.put runs, and it runs at most once per native
invocation (rare relative to opcode dispatch).

stdlib's lua_assert and lua_error now read Executor.current_position/0
and attach line/source to the raised exception, completing the path
for the most common stdlib raise sites.

Plan: A18

A18's plan file now reflects the Hybrid C approach (after the
with_context wrapper benched at ~9% slower). A19 is updated to focus
on the remaining stdlib bad-arg raises; assert/error are already
covered by A18 via the native-call boundary process-dict bridge,
so A19's status moves from blocked to ready.
@davydog187

Performance revision

Following review feedback that the initial `with_context` wrapper would add overhead, I benchmarked the approach and confirmed: ~9% slowdown on `fib(30)` in the benchee harness. That's not acceptable for a 1.0 library where perf is part of the contract.

Investigated alternatives:

| Approach | fib(30) slowdown | Notes |
| --- | --- | --- |
| `with_context` (closure + try/rescue) per op | ~9% | Original PR; too slow |
| Process dict at every `:source_line` opcode | ~7% | One `Process.put` per Lua statement is too frequent |
| Hybrid C: extra args + process dict at native boundary | ~2-3% | What I shipped |
| Inline try/rescue per opcode (no closure) | ~5% on a single opcode (compounds across all opcodes) | Less invasive but adds up |

Switched to Hybrid C:

  1. In-executor helpers (`safe_add`, `safe_compare_lt`, `concat_coerce`, `to_integer!`, `index_value`, etc.) take `line, source` as args. The metamethod-dispatch closures already captured args pre-A18, so adding two more captures is essentially free on the BEAM.
  2. Process dict only at the `:native_func` call boundary (~3 sites). Stash calling line/source before invoking the callback, restore after. `assert`, `error`, and other stdlib raise sites read via `Lua.VM.Executor.current_position/0`.
  3. Save/restore in `Lua.VM.Executor.execute/5` for re-entrancy and isolation between sequential calls.
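
To illustrate item 1, here is a minimal sketch of a helper that takes `line, source` as plain arguments. The names `Sketch.ArithError` and `Sketch.Helpers` are hypothetical, and the sketch ignores string-to-number coercion and metamethod dispatch, which the real helpers would have to handle.

```elixir
defmodule Sketch.ArithError do
  # Hypothetical exception carrying the position it was raised from.
  defexception [:message, :line, :source]
end

defmodule Sketch.Helpers do
  # Fast path: the guard matches and line/source are never touched.
  def safe_add(a, b, _line, _source) when is_number(a) and is_number(b), do: a + b

  # Slow path: the position is already in scope, so no wrapper or
  # rescue is needed to attach it to the error.
  def safe_add(a, b, line, source) do
    bad = if is_number(a), do: b, else: a

    raise Sketch.ArithError,
      message: "attempt to perform arithmetic on a #{type_name(bad)} value",
      line: line,
      source: source
  end

  defp type_name(nil), do: "nil"
  defp type_name(v) when is_binary(v), do: "string"
  defp type_name(v) when is_boolean(v), do: "boolean"
  defp type_name(_), do: "non-numeric"
end

IO.inspect(Sketch.Helpers.safe_add(1, 2, 3, "demo.lua"))
# 3

try do
  Sketch.Helpers.safe_add(1, nil, 3, "demo.lua")
rescue
  e in Sketch.ArithError -> IO.inspect({e.line, e.source, e.message})
end
```

Compared with a rescue-and-re-raise wrapper, the only added work on the success path is passing two extra arguments.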

The `:source_line` opcode is not touched — writing to process dict on every Lua statement was the dominant cost in the all-process-dict variant.

Re-entrancy is handled: nested `Lua.eval!` (e.g. an Elixir callback that itself calls into Lua) saves and restores the outer position via `try/after` in `execute/5`. Sequential top-level calls don't leak either.
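
The stash/restore pattern at the native-call boundary, including the nested case, might look roughly like this. `Sketch.Position`, its process-dictionary key, and `call_native/4` are hypothetical; only the name `current_position/0` echoes the function mentioned above, and its body here is an assumption.

```elixir
defmodule Sketch.Position do
  # Hypothetical process-dictionary key for the calling Lua position.
  @key :sketch_lua_position

  # Read by native callbacks that need to know where they were called from.
  def current_position, do: Process.get(@key)

  # Stash the caller's position, run the callback, and restore whatever was
  # there before, whether the callback returns or raises.
  def call_native(fun, args, line, source) when is_function(fun, 1) do
    previous = Process.put(@key, {line, source})

    try do
      fun.(args)
    after
      restore(previous)
    end
  end

  defp restore(nil), do: Process.delete(@key)
  defp restore(previous), do: Process.put(@key, previous)
end

# A native callback in the style of assert: it has no line argument of its
# own, so it reads the position stashed by the caller.
native_assert = fn [value] ->
  {line, source} = Sketch.Position.current_position()
  if value, do: value, else: raise(ArgumentError, "at #{source}:#{line}: assertion failed!")
end

try do
  Sketch.Position.call_native(native_assert, [false], 7, "pm.lua")
rescue
  e in ArgumentError -> IO.puts(e.message)
  # at pm.lua:7: assertion failed!
end

IO.inspect(Sketch.Position.current_position())
# nil
```

Because `Process.put/2` returns the previously stored value, a nested call sees its own position and the outer one reappears once it finishes.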

See commit `3e6827f` for the implementation diff and the updated plan file for the full discoveries section.

@davydog187 davydog187 merged commit b965332 into main May 8, 2026
4 checks passed
@davydog187 davydog187 deleted the fix/error-line-source-info branch May 8, 2026 13:36
davydog187 added a commit that referenced this pull request May 8, 2026
…rf tracks

Rebuild .agents/plans/ around the four 1.0 priorities: near-full Lua 5.3
suite passing, world-class error messages, world-class DX/docs, and perf
parity with Luerl.

- Flip A18 review→merged (PR #214 already on main).
- Repurpose A13 as the 1.0 final cut, with explicit suite/perf/errors/
  docs/DX gates blocking the release.
- A20–A24: cluster triage plans for the 24 currently-failing suite
  files, grouped by failure family (sandbox refusals, runtime type
  errors, VM-level errors, metamethod/control-flow assertions, stdlib/
  data-structure assertions). Each spawns A2{0..4}{a,b,…} fix plans.
- A25: implement string.pack / unpack / packsize.
- A26: error message quality pass — render audit + gallery fixtures.
- A27–A32: DX/docs track — Inspect protocol, iex polish, mix tasks,
  examples/, README rewrite, public API docstring audit.
- A33–A35: perf track — Luerl gap analysis with benchee, profiling
  with fprof + eflambe, regression CI to lock parity in.