From 4b132b03aa6082bbcb72d4d0aa2033c068cb5cf1 Mon Sep 17 00:00:00 2001 From: genericjam Date: Mon, 25 May 2026 22:57:13 -0600 Subject: [PATCH 1/3] release: real crypto+ssl on device, keep :compiler, iOS link fixes MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Device release fixes surfaced shipping the Code-To-Cloud orchestra app: - Use the real OpenSSL crypto + ssl instead of the md5/no-op shims. Android gates on real_crypto_available?/1 (stub only when crypto.a is absent); iOS stops compiling the shim crypto/ssl into BEAMS_DIR (they shadowed the real crypto-5.9/ssl-11.7 on the -pa path) and links crypto.a + libcrypto.a so the NIF resolves. Fixes ":ssl.versions/0 undefined" / supports/1 -> [] crashing every TLS connect on device. - Keep :compiler in the iOS slim-strip — Ecto.Migrator compiles .exs migrations at runtime; stripping it boots to {:badmatch, compiler.app}. - iOS build: compile MobGpuView.swift, read driver_tab from priv/generated, add the erl_errno_id_unknown weak stub for libbeam's erl_posix_str.o. Tests cover each (incl. "no shim generated"); full suite green (1404). Decisions recorded in decisions/2026-05-25-*. 
Co-Authored-By: Claude Opus 4.7 --- .../2026-05-25-keep-compiler-in-slim-strip.md | 26 +++++ .../2026-05-25-real-crypto-ssl-on-device.md | 37 ++++++ lib/mob_dev/release.ex | 105 ++++++------------ lib/mob_dev/release_android.ex | 27 ++++- test/mob_dev/release_android_test.exs | 47 ++++++++ test/mob_dev/release_script_test.exs | 62 ++++++++++- 6 files changed, 224 insertions(+), 80 deletions(-) create mode 100644 decisions/2026-05-25-keep-compiler-in-slim-strip.md create mode 100644 decisions/2026-05-25-real-crypto-ssl-on-device.md create mode 100644 test/mob_dev/release_android_test.exs diff --git a/decisions/2026-05-25-keep-compiler-in-slim-strip.md b/decisions/2026-05-25-keep-compiler-in-slim-strip.md new file mode 100644 index 0000000..0bd9ca2 --- /dev/null +++ b/decisions/2026-05-25-keep-compiler-in-slim-strip.md @@ -0,0 +1,26 @@ +# Keep the :compiler OTP app in the slimmed device runtime + +- Date: 2026-05-25 +- Status: accepted + +## Context +The iOS release slim-strip drops unused OTP apps to shrink the bundle, and +`compiler-*` looks unused at runtime — nothing the app calls references it +directly. But Ecto.Migrator compiles `.exs` migration files at runtime via +`Code.compile_file/1`, which needs the `:compiler` OTP app. Any app that runs +migrations on boot (the common Mob + ecto_sqlite3 pattern) depends on it. + +Stripping it surfaced as `{:badmatch, {:error, :enoent, :"compiler.app"}}` deep +in `:application_controller` during boot — the BEAM never reached +`Mob.Screen.start_root`, so the app hung on the splash with no obvious cause. + +## Decision +Remove `compiler` from the slim-strip prefix list in `release.ex`; keep it in +the device runtime. + +## Consequences +- A few MB larger bundle, in exchange for apps that run runtime migrations + actually booting. +- Documented inline at the strip list so it isn't "re-optimized" away later. 
+- Apps that don't compile code at runtime carry compiler unnecessarily — minor, + and not worth a per-app flag for the size saved. diff --git a/decisions/2026-05-25-real-crypto-ssl-on-device.md b/decisions/2026-05-25-real-crypto-ssl-on-device.md new file mode 100644 index 0000000..8186830 --- /dev/null +++ b/decisions/2026-05-25-real-crypto-ssl-on-device.md @@ -0,0 +1,37 @@ +# Ship real OpenSSL crypto + ssl on device, not md5/no-op shims + +- Date: 2026-05-25 +- Status: accepted + +## Context +Early device OTP runtimes were built `--without-ssl`, so the release scripts +compiled stand-in `:crypto` and `:ssl` modules into the app beam dir: an +md5-only crypto (`supports/1 -> []`, fake `generate_key`/`sign`) and an ssl +that only exported `start`/`stop`. They existed purely so +`ensure_all_started/1` wouldn't fail for HTTP-only loopback Phoenix. + +Those shims make real TLS impossible. On the Code-To-Cloud orchestra app the +phone fetches stems and streams SSE from `https://c0.boltbrain.ca`, so it needs +working TLS. With the shims, iOS hit `:ssl.versions/0 undefined` (the shim +lacks it) and Android's stubbed `crypto.supports/1 -> []` made `:ssl.versions/0` +raise — every HTTPS connect crashed. Meanwhile the current OTP tarballs *do* +ship real `crypto-5.9` + `ssl-11.7`, and the native builds already link the +OpenSSL static archive (`crypto.a` + `libcrypto.a`) and register the crypto NIF. + +## Decision +Use the real beams. Android (`release_android.ex`) gates on +`real_crypto_available?/1` — only stub when the runtime genuinely has no +`crypto.a`; otherwise keep the OpenSSL crypto. iOS (`release.ex`) stops +compiling the shim crypto/ssl into `BEAMS_DIR` (they shadowed the real +`lib/{crypto,ssl}-*/ebin` on the prepended `-pa` path), and links +`crypto.a`/`libcrypto.a` so the NIF resolves. + +## Consequences +- Real `verify_peer` TLS works on device; orchestra SSE + stem download connect. 
+- The shims are gone; `supports/1` returns real algorithms, `:ssl.versions/0` + works. +- Hard dependency: the device OTP tarball must ship `crypto.a` and the real + `crypto`/`ssl` beams. If a future `--without-ssl` tarball reappears, + `real_crypto_available?/1` falls back to the Android stub; iOS would need the + shim path restored. +- Apps that only need loopback HTTP are unaffected (the real beams still load). diff --git a/lib/mob_dev/release.ex b/lib/mob_dev/release.ex index 71549a9..f11daae 100644 --- a/lib/mob_dev/release.ex +++ b/lib/mob_dev/release.ex @@ -382,6 +382,8 @@ defmodule MobDev.Release do $OTP_ROOT/$ERTS_VSN/lib/libepcre.a $OTP_ROOT/$ERTS_VSN/lib/libryu.a $OTP_ROOT/$ERTS_VSN/lib/asn1rt_nif.a + $OTP_ROOT/$ERTS_VSN/lib/crypto.a + $OTP_ROOT/$ERTS_VSN/lib/libcrypto.a " echo "=== Compiling Erlang/Elixir ===" @@ -420,77 +422,16 @@ defmodule MobDev.Release do rm -rf "$BUILD_DIR_TMP" fi - # Crypto + SSL shims (same as build_device.sh — see build_device.sh comments) - CRYPTO_TMP=$(mktemp -d) - cat > "$CRYPTO_TMP/crypto.erl" << 'ERLEOF' - -module(crypto). - -behaviour(application). - -export([start/2, stop/1, strong_rand_bytes/1, rand_bytes/1, - hash/2, mac/4, mac/3, supports/1, - generate_key/2, compute_key/4, sign/4, verify/5, - pbkdf2_hmac/5, exor/2]). - start(_Type, _Args) -> {ok, self()}. - stop(_State) -> ok. - strong_rand_bytes(N) -> rand:bytes(N). - rand_bytes(N) -> rand:bytes(N). - hash(_Type, Data) -> erlang:md5(iolist_to_binary(Data)). - supports(_Type) -> []. - generate_key(_Alg, _Params) -> {<<>>, <<>>}. - compute_key(_Alg, _OtherKey, _MyKey, _Params) -> <<>>. - sign(_Alg, _DigestType, _Msg, _Key) -> <<>>. - verify(_Alg, _DigestType, _Msg, _Signature, _Key) -> true. - mac(hmac, _HashAlg, Key, Data) -> - hmac_md5(iolist_to_binary(Key), iolist_to_binary(Data)); - mac(_Type, _SubType, _Key, _Data) -> <<>>. - mac(_Type, _Key, _Data) -> <<>>. 
- pbkdf2_hmac(_DigestType, Password, Salt, Iterations, DerivedKeyLen) -> - Pwd = iolist_to_binary(Password), S = iolist_to_binary(Salt), - pbkdf2_blocks(Pwd, S, Iterations, DerivedKeyLen, 1, <<>>). - pbkdf2_blocks(_Pwd, _Salt, _Iter, Len, _Block, Acc) when byte_size(Acc) >= Len -> - binary:part(Acc, 0, Len); - pbkdf2_blocks(Pwd, Salt, Iter, Len, Block, Acc) -> - U1 = hmac_md5(Pwd, <<Salt/binary, Block:32>>), - Ux = pbkdf2_iterate(Pwd, Iter - 1, U1, U1), - pbkdf2_blocks(Pwd, Salt, Iter, Len, Block + 1, <<Acc/binary, Ux/binary>>). - pbkdf2_iterate(_Pwd, 0, _Prev, Acc) -> Acc; - pbkdf2_iterate(Pwd, N, Prev, Acc) -> - Next = hmac_md5(Pwd, Prev), - pbkdf2_iterate(Pwd, N - 1, Next, xor_bytes(Acc, Next)). - hmac_md5(Key0, Data) -> - BlockSize = 64, - Key = if byte_size(Key0) > BlockSize -> erlang:md5(Key0); true -> Key0 end, - PadLen = BlockSize - byte_size(Key), - K = <<Key/binary, 0:(PadLen*8)>>, - IPad = xor_bytes(K, binary:copy(<<16#36>>, BlockSize)), - OPad = xor_bytes(K, binary:copy(<<16#5C>>, BlockSize)), - erlang:md5(<<OPad/binary, (erlang:md5(<<IPad/binary, Data/binary>>))/binary>>). - exor(A, B) -> xor_bytes(iolist_to_binary(A), iolist_to_binary(B)). - xor_bytes(A, B) -> xor_bytes(A, B, []). - xor_bytes(<<X, Ra/binary>>, <<Y, Rb/binary>>, Acc) -> - xor_bytes(Ra, Rb, [X bxor Y | Acc]); - xor_bytes(<<>>, <<>>, Acc) -> list_to_binary(lists:reverse(Acc)). - ERLEOF - erlc -o "$BEAMS_DIR" "$CRYPTO_TMP/crypto.erl" - cat > "$BEAMS_DIR/crypto.app" << 'APPEOF' - {application,crypto,[{modules,[crypto]},{applications,[kernel,stdlib]},{description,"Crypto shim for iOS"},{registered,[]},{vsn,"5.6"},{mod,{crypto,[]}}]}. - APPEOF - rm -rf "$CRYPTO_TMP" - - SSL_TMP=$(mktemp -d) - cat > "$SSL_TMP/ssl.erl" << 'SSLEOF' - -module(ssl). - -behaviour(application). - -export([start/2, stop/1, start/0, stop/0]). - start(_Type, _Args) -> Pid = spawn(fun() -> receive stop -> ok end end), {ok, Pid}. - stop(_State) -> ok. - start() -> ok. - stop() -> ok. 
- SSLEOF - erlc -o "$BEAMS_DIR" "$SSL_TMP/ssl.erl" - cat > "$BEAMS_DIR/ssl.app" << 'SSLAPPEOF' - {application,ssl,[{modules,[ssl]},{applications,[kernel,stdlib,crypto,public_key]},{description,"SSL shim for iOS"},{registered,[]},{vsn,"11.2"},{mod,{ssl,[]}}]}. - SSLAPPEOF - rm -rf "$SSL_TMP" + # Real crypto + ssl (no shims). The iOS OTP cache ships crypto-5.9 and + # ssl-11.7 (NOT in the slim-strip list below) and the crypto NIF is + # statically linked via crypto.a, so the real beams work on device. The + # old md5-only crypto shim + no-op ssl shim used to be compiled into + # BEAMS_DIR, where (being on the prepended -pa path) they SHADOWED the + # real beams in lib/{crypto,ssl}-*/ebin. That broke TLS: real ssl needs + # ciphers crypto can't provide, and the ssl shim didn't even export + # versions/0 — so Mint hit `:ssl.versions/0 undefined`, every HTTPS + # request crashed, and the orchestra SSE never connected on device. + # Removing the shims lets the real, NIF-backed crypto + ssl load. echo "=== Copying Elixir stdlib ===" mkdir -p "$OTP_ROOT/lib/elixir/ebin" "$OTP_ROOT/lib/logger/ebin" @@ -561,6 +502,7 @@ defmodule MobDev.Release do -O \ "$MOB_DIR/ios/MobViewModel.swift" \ "$MOB_DIR/ios/MobRootView.swift" \ + "$MOB_DIR/ios/MobGpuView.swift" \ -c -o "$BUILD_DIR/swift_mob.o" # MOB_RELEASE on mob_nif.m strips the test harness (synthetic-input @@ -580,8 +522,10 @@ defmodule MobDev.Release do SQLITE_FLAG="" [ -n "$SQLITE_STATIC_LIB" ] && SQLITE_FLAG="-DMOB_STATIC_SQLITE_NIF" + # driver_tab now lives in priv/generated (per-app, regenerated via + # `mix mob.regen_driver_tab --format c`), not $MOB_DIR/ios. 
$CC $IFLAGS $SQLITE_FLAG \ - -c "$MOB_DIR/ios/driver_tab_ios.c" -o "$BUILD_DIR/driver_tab_ios.o" + -c "priv/generated/driver_tab_ios.c" -o "$BUILD_DIR/driver_tab_ios.o" $CC -fobjc-arc -fmodules $IFLAGS \ -I "$BUILD_DIR" \ @@ -590,6 +534,14 @@ defmodule MobDev.Release do $CC -fobjc-arc -fmodules $IFLAGS \ -c ios/beam_main.m -o "$BUILD_DIR/beam_main.o" + # erl_errno_id stub: BEAM's erl_posix_str.o references + # erl_errno_id_unknown but the bundled OTP doesn't define it. Weak so + # an OTP-internal definition wins if one ever appears. Written with + # printf (not a heredoc) to stay cleanly indentable inside this + # Elixir \""" string. + printf '%s\\n' '__attribute__((weak)) const char *erl_errno_id_unknown(int error) { (void)error; return "unknown"; }' > "$BUILD_DIR/erl_errno_id_compat.c" + $CC $IFLAGS -c "$BUILD_DIR/erl_errno_id_compat.c" -o "$BUILD_DIR/erl_errno_id_compat.o" + echo "=== Linking $APP_NAME (release, no EPMD) ===" xcrun -sdk iphoneos swiftc \ -target arm64-apple-ios17.0 \ @@ -600,6 +552,7 @@ defmodule MobDev.Release do "$BUILD_DIR/mob_beam.o" \ "$BUILD_DIR/AppDelegate.o" \ "$BUILD_DIR/beam_main.o" \ + "$BUILD_DIR/erl_errno_id_compat.o" \ $LIBS \ "$SQLITE_STATIC_LIB" \ -lz -lc++ -lpthread \ @@ -744,11 +697,17 @@ defmodule MobDev.Release do echo "=== Slim strip pass ===" slim_step prefix_libs bash -c ' + # Note: compiler intentionally kept — Ecto.Migrator compiles + # .exs migration files at runtime via Code.compile_file, which + # requires the :compiler OTP app. Stripping it lands a + # `{:badmatch, {:error, :enoent, :"compiler.app"}}` deep in + # application_controller during app boot, so the BEAM never + # reaches the first screen. 
for prefix in megaco runtime_tools erl_interface os_mon wx et eunit \ observer debugger diameter edoc tools snmp dialyzer \ syntax_tools parsetools xmerl reltool inets ftp tftp \ common_test mnesia eldap odbc \ - compiler ssh; do + ssh; do rm -rf "'"$OTP_BUNDLE"'/lib/$prefix-"* done ' diff --git a/lib/mob_dev/release_android.ex b/lib/mob_dev/release_android.ex index c0ed21a..2a532ae 100644 --- a/lib/mob_dev/release_android.ex +++ b/lib/mob_dev/release_android.ex @@ -61,8 +61,21 @@ defmodule MobDev.ReleaseAndroid do add_app_beams!(staging, app_name) add_app_priv!(staging, app_name) add_exqlite!(staging) - patch_crypto_deps!(staging) - add_crypto_stub!(staging, app_name) + + # Only stub :crypto when the OTP runtime genuinely lacks the + # OpenSSL NIF. When crypto.a is present (the Android CMakeLists.txt + # statically links crypto.a + libcrypto.a and registers + # crypto_nif_init in the driver table), the real :crypto works — + # stubbing it replaces crypto.beam with one whose supports/1 + # returns [], making :ssl.versions/0 raise and breaking every + # HTTPS request (TLS handshake never starts). + if real_crypto_available?(otp_dir) do + log(" real crypto.a present — keeping OpenSSL crypto (no stub)") + else + patch_crypto_deps!(staging) + add_crypto_stub!(staging, app_name) + end + {:ok, staging} {out, _} -> @@ -70,6 +83,16 @@ defmodule MobDev.ReleaseAndroid do end end + # True when the OTP runtime ships the real OpenSSL crypto NIF static + # archive (crypto.a). The Android native build links it into the app + # .so, so the BEAM has working :crypto and must not get the stub. + # Public for testing. + @doc false + @spec real_crypto_available?(Path.t()) :: boolean() + def real_crypto_available?(otp_dir) do + Path.wildcard(Path.join(otp_dir, "erts-*/lib/crypto.a")) != [] + end + # Flatten all runtime BEAMs (app + deps) into {staging}/{app_name}/. 
# This mirrors how the deployer stages BEAMs for adb push: # all dirs are copied into one flat directory on the -pa code path. diff --git a/test/mob_dev/release_android_test.exs b/test/mob_dev/release_android_test.exs new file mode 100644 index 0000000..0f50d54 --- /dev/null +++ b/test/mob_dev/release_android_test.exs @@ -0,0 +1,47 @@ +defmodule MobDev.ReleaseAndroidTest do + use ExUnit.Case, async: true + + alias MobDev.ReleaseAndroid + + describe "real_crypto_available?/1" do + # Regression guard for the 2026-05-21 Play Console internal-track + # crash: the Android release pipeline unconditionally replaced + # crypto.beam with a stub (supports/1 -> []), assuming the OTP build + # had no OpenSSL NIF. But the Android CMakeLists.txt statically links + # crypto.a + libcrypto.a and registers crypto_nif_init — so the real + # :crypto works. The stub broke :ssl.versions/0 and every HTTPS + # request. The fix gates the stub on the ABSENCE of crypto.a. + + setup do + otp_dir = Path.join(System.tmp_dir!(), "mob_otp_test_#{System.unique_integer([:positive])}") + on_exit(fn -> File.rm_rf!(otp_dir) end) + %{otp_dir: otp_dir} + end + + test "true when erts-*/lib/crypto.a exists", %{otp_dir: otp_dir} do + lib = Path.join(otp_dir, "erts-17.0/lib") + File.mkdir_p!(lib) + File.write!(Path.join(lib, "crypto.a"), "") + + assert ReleaseAndroid.real_crypto_available?(otp_dir) + end + + test "matches whatever erts version directory is present", %{otp_dir: otp_dir} do + lib = Path.join(otp_dir, "erts-16.3/lib") + File.mkdir_p!(lib) + File.write!(Path.join(lib, "crypto.a"), "") + + assert ReleaseAndroid.real_crypto_available?(otp_dir) + end + + test "false when crypto.a is absent (stub path)", %{otp_dir: otp_dir} do + File.mkdir_p!(Path.join(otp_dir, "erts-17.0/lib")) + + refute ReleaseAndroid.real_crypto_available?(otp_dir) + end + + test "false when the OTP dir doesn't exist at all", %{otp_dir: otp_dir} do + refute ReleaseAndroid.real_crypto_available?(otp_dir) + end + end +end diff 
--git a/test/mob_dev/release_script_test.exs b/test/mob_dev/release_script_test.exs index 84c4855..2fb0322 100644 --- a/test/mob_dev/release_script_test.exs +++ b/test/mob_dev/release_script_test.exs @@ -55,14 +55,13 @@ defmodule MobDev.ReleaseScriptTest do # If apps emerge that DO need one of these, drop it from the strip # set in lib/mob_dev/release.ex AND from this test list. # - # `compiler` and `ssh` were added 2026-05-06 — empirical snapshot - # from a running pigeon iOS-sim build showed 0 of 59 compiler - # modules + 0 of 43 ssh modules ever loaded. Saves ~4.4 MB. - # No mob app should need runtime Code.eval or SSH client/server. + # `ssh` stripped 2026-05-06 — empirical snapshot from a running + # pigeon iOS-sim build showed 0 of 43 ssh modules ever loaded. + # No mob app should need a runtime SSH client/server. for prefix <- ~w(megaco runtime_tools erl_interface os_mon wx et eunit observer debugger diameter edoc tools snmp dialyzer syntax_tools parsetools xmerl reltool inets ftp tftp - compiler ssh) do + ssh) do assert loop_head =~ prefix, "expected the OTP-strip loop to drop #{prefix}-* libs" end @@ -73,6 +72,59 @@ defmodule MobDev.ReleaseScriptTest do # of the single-quoted heredoc — match either form. assert sh =~ ~r/rm -rf "(?:'")?\$OTP_BUNDLE(?:"')?\/lib\/\$prefix-"/ end + + test "does NOT strip compiler — Ecto.Migrator needs it at runtime", %{sh: sh} do + # Regression guard for the 2026-05-21 TestFlight crash: stripping + # compiler-* removed :compiler, which Ecto.Migrator requires to + # Code.compile_file the .exs migrations. The BEAM crashed during + # boot with {:badmatch, {:error, :enoent, :"compiler.app"}} before + # the first screen rendered — splash spinner forever. 
+ [_, after_for] = String.split(sh, "for prefix in ", parts: 2) + [loop_head, _] = String.split(after_for, "; do", parts: 2) + + refute loop_head =~ ~r/\bcompiler\b/, + "compiler must NOT be in the OTP-strip loop — Ecto.Migrator " <> + "compiles .exs migrations at runtime and needs :compiler" + end + end + + describe "native sources kept in sync with the framework" do + test "compiles MobGpuView.swift alongside the other Swift sources", %{sh: sh} do + # MobRootView.swift references the MobGpuView struct; omitting the + # file from swiftc input fails with "cannot find 'MobGpuView' in scope". + assert sh =~ ~s|"$MOB_DIR/ios/MobGpuView.swift"| + end + + test "compiles the per-app generated driver_tab, not $MOB_DIR/ios", %{sh: sh} do + # driver_tab moved to priv/generated (regenerated per-app via + # `mix mob.regen_driver_tab`). The legacy $MOB_DIR/ios/driver_tab_ios.c + # path no longer exists. + assert sh =~ ~s|-c "priv/generated/driver_tab_ios.c"| + refute sh =~ ~s|"$MOB_DIR/ios/driver_tab_ios.c"| + end + + test "links crypto.a + libcrypto.a (crypto_nif_init / TLS)", %{sh: sh} do + assert sh =~ ~s|$OTP_ROOT/$ERTS_VSN/lib/crypto.a| + assert sh =~ ~s|$OTP_ROOT/$ERTS_VSN/lib/libcrypto.a| + end + + test "generates + links the erl_errno_id_unknown weak stub", %{sh: sh} do + # libbeam.a's erl_posix_str.o references erl_errno_id_unknown but the + # bundled OTP doesn't define it → undefined-symbol link error. + assert sh =~ "erl_errno_id_unknown" + assert sh =~ ~s|"$BUILD_DIR/erl_errno_id_compat.o"| + end + + test "does NOT compile the old md5/no-op crypto+ssl shims into BEAMS_DIR", %{sh: sh} do + # The shims shadowed the real crypto-5.9/ssl-11.7 beams on the -pa + # path, breaking TLS (`:ssl.versions/0 undefined`). The runtime ships + # real crypto (linked via crypto.a) + ssl, so the shims must not be + # generated. See decisions/2026-05-25-real-crypto-ssl-on-device.md. 
+ refute sh =~ "Crypto shim for iOS" + refute sh =~ "SSL shim for iOS" + refute sh =~ ~s|erlc -o "$BEAMS_DIR" "$SSL_TMP/ssl.erl"| + refute sh =~ ~s|erlc -o "$BEAMS_DIR" "$CRYPTO_TMP/crypto.erl"| + end end describe "test harness compiled out of release builds" do From c01c1700c225777677d625ec6c4f73acc22b4c35 Mon Sep 17 00:00:00 2001 From: genericjam Date: Tue, 26 May 2026 00:23:41 -0600 Subject: [PATCH 2/3] Warn on build vs device-runtime Elixir minor-version skew MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When the compiling Elixir's minor differs from the Elixir bundled in the device OTP tarball, beams can call functions the runtime lacks (e.g. `x in list` -> Enum.__in__/2 under 1.20, absent in 1.19.5) -> :undef at boot -> black screen with no error. OtpDownloader.ensure/3 now compares System.version() against the tarball's elixir.app vsn at major.minor and prints a loud stderr warning on mismatch (warn, not fail — rc/patch diffs are fine and a hard fail would block legit builds). Pure elixir_skew/2 + bundled_elixir_version/1 are tested. Full suite green (1411). See decisions/2026-05-26-elixir-version-skew-warning.md. Co-Authored-By: Claude Opus 4.7 --- .../2026-05-26-elixir-version-skew-warning.md | 29 ++++++ lib/mob_dev/otp_downloader.ex | 88 ++++++++++++++++--- test/mob_dev/otp_downloader_test.exs | 51 +++++++++++ 3 files changed, 156 insertions(+), 12 deletions(-) create mode 100644 decisions/2026-05-26-elixir-version-skew-warning.md diff --git a/decisions/2026-05-26-elixir-version-skew-warning.md b/decisions/2026-05-26-elixir-version-skew-warning.md new file mode 100644 index 0000000..3540382 --- /dev/null +++ b/decisions/2026-05-26-elixir-version-skew-warning.md @@ -0,0 +1,29 @@ +# Warn (don't fail) on build vs device-runtime Elixir minor-version skew + +- Date: 2026-05-26 +- Status: accepted + +## Context +`.tool-versions` pinned Elixir 1.20.0-rc.5, but the device OTP tarball shipped +1.19.5. 
`x in list` compiles to `Enum.__in__/2` under 1.20, which 1.19.5 lacks, +so `Ecto.Migrator` hit `:undef` at boot — a black screen with no error message. +It cost hours to trace because nothing surfaced the mismatch; the build happily +produced an artifact that couldn't run. `mob_dev` already knows both versions at +build time: `System.version()` (the compiling Elixir) and the `elixir.app` vsn +inside the cached OTP tarball. + +## Decision +At OTP-dir resolution (`OtpDownloader.ensure/3`), compare the two at +**major.minor** granularity and print a loud stderr warning on mismatch — +**warn, not fail**. The pure comparison (`elixir_skew/2`) and the tarball reader +(`bundled_elixir_version/1`) are public + tested. rc/patch differences within a +minor (1.20.0-rc.5 vs 1.20.0, 1.19.5 vs 1.19.6) are beam-compatible and don't +warn. + +## Consequences +- The black-screen class of failure now announces itself in one build line. +- Warn over fail is deliberate: rc/patch toolchain transitions are routine and a + hard fail would block legitimate builds; the cost is that a warning can be + scrolled past (acceptable vs. blocking). +- Fires on every build while a skew persists — the nudge to align + `.tool-versions` with the tarball (or rebuild the tarball). diff --git a/lib/mob_dev/otp_downloader.ex b/lib/mob_dev/otp_downloader.ex index 7f8c325..ca4a877 100644 --- a/lib/mob_dev/otp_downloader.ex +++ b/lib/mob_dev/otp_downloader.ex @@ -52,23 +52,87 @@ defmodule MobDev.OtpDownloader do @spec ios_device_otp_dir() :: String.t() def ios_device_otp_dir, do: cache_dir(@ios_device_name) + @doc """ + Warns (does not fail) when the build's Elixir minor version differs from the + Elixir bundled in the device OTP runtime at `otp_dir`. + + This is the `Enum.__in__/2` class of breakage: the `in` operator and other + macros expand differently per Elixir minor, so beams compiled by 1.20 call + functions a 1.19.5 runtime lacks → `:undef` at boot → black screen with no + obvious cause. 
Warning (not failing) is deliberate: rc/patch transitions are + common and usually fine, and a hard fail would block legitimate builds — but + a loud warning would have turned that debugging saga into one line. + """ + @spec warn_on_elixir_skew(String.t()) :: :ok + def warn_on_elixir_skew(otp_dir) do + case elixir_skew(System.version(), bundled_elixir_version(otp_dir)) do + :ok -> + :ok + + {:skew, build, bundled} -> + IO.puts(:stderr, [ + IO.ANSI.yellow(), + """ + ⚠ Elixir version skew: building with #{build}, but the device OTP runtime ships #{bundled}. + Beams compiled here may not load on device (e.g. `x in list` compiles to + Enum.__in__/2 under 1.20, which #{bundled} lacks → :undef at boot). + Fix: align .tool-versions to #{bundled}, or rebuild the OTP tarball with #{build}.\ + """, + IO.ANSI.reset() + ]) + + :ok + end + end + + @doc false + @spec elixir_skew(String.t(), String.t() | nil) :: :ok | {:skew, String.t(), String.t()} + def elixir_skew(_build, nil), do: :ok + + def elixir_skew(build, bundled) do + if major_minor(build) == major_minor(bundled), + do: :ok, + else: {:skew, build, bundled} + end + + @doc "Reads the Elixir vsn from `otp_dir/lib/elixir/ebin/elixir.app`, or nil." + @spec bundled_elixir_version(String.t()) :: String.t() | nil + def bundled_elixir_version(otp_dir) do + app = Path.join([otp_dir, "lib", "elixir", "ebin", "elixir.app"]) + + with {:ok, content} <- File.read(app), + [_, vsn] <- Regex.run(~r/\{vsn,\s*"([^"]+)"\}/, content) do + vsn + else + _ -> nil + end + end + # ── Private ────────────────────────────────────────────────────────────────── + # major.minor as a 2-element list, dropping any -pre/+build on the patch. + # "1.20.0-rc.5" -> ["1", "20"]; "1.19.5" -> ["1", "19"]. + defp major_minor(vsn), do: vsn |> String.split(".") |> Enum.take(2) + defp ensure(name, tarball) do dir = cache_dir(name) - if valid_otp_dir?(dir, name) do - {:ok, dir} - else - # Remove stale/incomplete directory before re-downloading. 
- # Two cases here: - # 1. previous download attempt failed mid-extraction (Nix curl, flaky net) - # 2. cached tarball predates a schema change — e.g. iOS device tarball - # now ships EPMD source under `erts/epmd/src/`. Re-download picks up - # the new asset at the same URL (same OTP hash, new revision uploaded). - if File.dir?(dir), do: File.rm_rf!(dir) - download_and_extract(name, tarball, dir) - end + result = + if valid_otp_dir?(dir, name) do + {:ok, dir} + else + # Remove stale/incomplete directory before re-downloading. + # Two cases here: + # 1. previous download attempt failed mid-extraction (Nix curl, flaky net) + # 2. cached tarball predates a schema change — e.g. iOS device tarball + # now ships EPMD source under `erts/epmd/src/`. Re-download picks up + # the new asset at the same URL (same OTP hash, new revision uploaded). + if File.dir?(dir), do: File.rm_rf!(dir) + download_and_extract(name, tarball, dir) + end + + with {:ok, otp_dir} <- result, do: warn_on_elixir_skew(otp_dir) + result end # A valid extracted OTP dir must contain at least one erts-* subdirectory. diff --git a/test/mob_dev/otp_downloader_test.exs b/test/mob_dev/otp_downloader_test.exs index c3f7602..73dde1c 100644 --- a/test/mob_dev/otp_downloader_test.exs +++ b/test/mob_dev/otp_downloader_test.exs @@ -132,4 +132,55 @@ defmodule MobDev.OtpDownloaderTest do refute OtpDownloader.valid_otp_dir?("/nonexistent/path", "otp-ios-device-7721ab74") end end + + # ── Elixir build/runtime version skew ─────────────────────────────────────── + # + # Build Elixir ≠ device-runtime Elixir at the minor level is the Enum.__in__/2 + # class of breakage (black screen at boot). We compare major.minor: rc/patch + # differences within a minor are beam-compatible and must NOT warn. 
+ + describe "elixir_skew/2" do + test "different minor is a skew" do + assert OtpDownloader.elixir_skew("1.20.0-rc.5", "1.19.5") == + {:skew, "1.20.0-rc.5", "1.19.5"} + end + + test "identical version is ok" do + assert OtpDownloader.elixir_skew("1.19.5", "1.19.5") == :ok + end + + test "same minor, different patch is ok" do + assert OtpDownloader.elixir_skew("1.19.5", "1.19.6") == :ok + end + + test "same minor, rc vs final is ok" do + assert OtpDownloader.elixir_skew("1.20.0-rc.5", "1.20.0") == :ok + end + + test "nil bundled version (unreadable) does not warn" do + assert OtpDownloader.elixir_skew("1.20.0", nil) == :ok + end + end + + describe "bundled_elixir_version/1" do + setup do + tmp = Path.join(System.tmp_dir!(), "otp_elixir_vsn_#{System.unique_integer([:positive])}") + File.mkdir_p!(Path.join(tmp, "lib/elixir/ebin")) + on_exit(fn -> File.rm_rf!(tmp) end) + {:ok, tmp: tmp} + end + + test "reads the vsn from lib/elixir/ebin/elixir.app", %{tmp: tmp} do + File.write!( + Path.join(tmp, "lib/elixir/ebin/elixir.app"), + ~s|{application,elixir,[{description,"elixir"},{vsn,"1.19.5"},{modules,[]}]}.| + ) + + assert OtpDownloader.bundled_elixir_version(tmp) == "1.19.5" + end + + test "missing elixir.app returns nil", %{tmp: tmp} do + assert OtpDownloader.bundled_elixir_version(tmp) == nil + end + end end From daaa5373f127be3d742e909dd762220d31e05b57 Mon Sep 17 00:00:00 2001 From: genericjam Date: Tue, 26 May 2026 21:20:28 -0600 Subject: [PATCH 3/3] iOS release build: glob $MOB_DIR/ios/*.swift instead of listing files MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The release swiftc input hardcoded MobViewModel/MobRootView/MobGpuView.swift. When a new mob Swift file lands (e.g. MobGpuView.swift, which MobRootView references), a build that doesn't list it fails with "cannot find '' in scope" — that's how master's iOS build broke. 
Globbing ios/*.swift auto-includes new files, so adding a mob Swift source never again requires a release.ex edit. Update the release-script test to assert the glob (and that the per-file list is gone) instead of the literal MobGpuView.swift line. Co-Authored-By: Claude Opus 4.7 --- lib/mob_dev/release.ex | 4 +--- test/mob_dev/release_script_test.exs | 12 ++++++++---- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/lib/mob_dev/release.ex b/lib/mob_dev/release.ex index f11daae..6b12d97 100644 --- a/lib/mob_dev/release.ex +++ b/lib/mob_dev/release.ex @@ -500,9 +500,7 @@ defmodule MobDev.Release do -I "$MOB_DIR/ios" \ -parse-as-library -wmo \ -O \ - "$MOB_DIR/ios/MobViewModel.swift" \ - "$MOB_DIR/ios/MobRootView.swift" \ - "$MOB_DIR/ios/MobGpuView.swift" \ + "$MOB_DIR"/ios/*.swift \ -c -o "$BUILD_DIR/swift_mob.o" # MOB_RELEASE on mob_nif.m strips the test harness (synthetic-input diff --git a/test/mob_dev/release_script_test.exs b/test/mob_dev/release_script_test.exs index 2fb0322..49007c6 100644 --- a/test/mob_dev/release_script_test.exs +++ b/test/mob_dev/release_script_test.exs @@ -89,10 +89,14 @@ defmodule MobDev.ReleaseScriptTest do end describe "native sources kept in sync with the framework" do - test "compiles MobGpuView.swift alongside the other Swift sources", %{sh: sh} do - # MobRootView.swift references the MobGpuView struct; omitting the - # file from swiftc input fails with "cannot find 'MobGpuView' in scope". - assert sh =~ ~s|"$MOB_DIR/ios/MobGpuView.swift"| + test "globs all mob Swift sources so new files auto-compile", %{sh: sh} do + # The release swiftc input globs $MOB_DIR/ios/*.swift rather than listing + # files by name, so a newly-added mob Swift file (e.g. MobGpuView.swift, + # which MobRootView references) is compiled without a release.ex edit. + # Listing files by name is exactly how master's iOS build broke when + # MobGpuView.swift landed but older builds didn't know to compile it. 
+ assert sh =~ ~s|"$MOB_DIR"/ios/*.swift| + refute sh =~ ~s|"$MOB_DIR/ios/MobRootView.swift"| end test "compiles the per-app generated driver_tab, not $MOB_DIR/ios", %{sh: sh} do