26 changes: 26 additions & 0 deletions decisions/2026-05-25-keep-compiler-in-slim-strip.md
@@ -0,0 +1,26 @@
# Keep the :compiler OTP app in the slimmed device runtime

- Date: 2026-05-25
- Status: accepted

## Context
The iOS release slim-strip drops unused OTP apps to shrink the bundle, and
`compiler-*` looks unused at runtime — nothing the app calls references it
directly. But Ecto.Migrator compiles `.exs` migration files at runtime via
`Code.compile_file/1`, which needs the `:compiler` OTP app. Any app that runs
migrations on boot (the common Mob + ecto_sqlite3 pattern) depends on it.

Stripping it surfaced as `{:badmatch, {:error, :enoent, :"compiler.app"}}` deep
in `:application_controller` during boot — the BEAM never reached
`Mob.Screen.start_root`, so the app hung on the splash with no obvious cause.
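
A boot-time guard could turn this kind of silent hang into a readable error. The sketch below is an assumption and not part of this PR: `BootCheck` and `require_otp_app/1` are hypothetical names. It relies only on `:code.lib_dir/1`, which returns `{:error, :bad_name}` when no matching app directory is on the code path, and it assumes a stripped `compiler-*` directory is no longer on that path.

```elixir
defmodule BootCheck do
  @moduledoc false

  # Hypothetical helper (not in the PR): reports whether an OTP app's lib
  # directory is visible to the code server, so a stripped app can be
  # reported before anything tries to start it.
  @spec require_otp_app(atom()) ::
          {:ok, String.t()} | {:error, {:missing_otp_app, atom()}}
  def require_otp_app(app) when is_atom(app) do
    case :code.lib_dir(app) do
      {:error, :bad_name} -> {:error, {:missing_otp_app, app}}
      path -> {:ok, List.to_string(path)}
    end
  end
end
```

Calling `BootCheck.require_otp_app(:compiler)` early in boot, before migrations run, would be one place to surface the problem.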

## Decision
Remove `compiler` from the slim-strip prefix list in `release.ex`; keep it in
the device runtime.

## Consequences
- The bundle grows by a few MB; in exchange, apps that run migrations at
runtime actually boot.
- Documented inline at the strip list so it isn't "re-optimized" away later.
- Apps that never compile code at runtime carry `:compiler` unnecessarily. The
cost is minor and not worth a per-app flag for the size saved.
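
To make the strip rule concrete, here is a minimal model of the prefix match. It is a sketch under assumptions: the real list lives in a bash loop in `release.ex` that runs `rm -rf lib/$prefix-*`, while `SlimStripSketch` and `strip?/1` are hypothetical names and the version suffixes in the usage note are made up.

```elixir
defmodule SlimStripSketch do
  @moduledoc false

  # Prefix list as it stands after this decision: `compiler` is gone,
  # everything else is unchanged.
  @strip_prefixes ~w(
    megaco runtime_tools erl_interface os_mon wx et eunit
    observer debugger diameter edoc tools snmp dialyzer
    syntax_tools parsetools xmerl reltool inets ftp tftp
    common_test mnesia eldap odbc ssh
  )

  # Mirrors the shell glob `$prefix-*`: a lib directory is stripped when its
  # name starts with "<prefix>-".
  @spec strip?(String.t()) :: boolean()
  def strip?(lib_dir_name) do
    Enum.any?(@strip_prefixes, &String.starts_with?(lib_dir_name, &1 <> "-"))
  end
end
```

With this model, `SlimStripSketch.strip?("ssh-5.2")` is `true`, while `"compiler-9.0"`, `"crypto-5.9"`, and `"ssl-11.7"` all stay in the bundle.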
37 changes: 37 additions & 0 deletions decisions/2026-05-25-real-crypto-ssl-on-device.md
@@ -0,0 +1,37 @@
# Ship real OpenSSL crypto + ssl on device, not md5/no-op shims

- Date: 2026-05-25
- Status: accepted

## Context
Early device OTP runtimes were built `--without-ssl`, so the release scripts
compiled stand-in `:crypto` and `:ssl` modules into the app beam dir: an
md5-only crypto (`supports/1 -> []`, fake `generate_key`/`sign`) and an ssl
that only exported `start`/`stop`. They existed purely so
`ensure_all_started/1` wouldn't fail for HTTP-only loopback Phoenix.

Those shims make real TLS impossible. On the Code-To-Cloud orchestra app the
phone fetches stems and streams SSE from `https://c0.boltbrain.ca`, so it needs
working TLS. With the shims, iOS hit `:ssl.versions/0 undefined` (the shim
lacks it) and Android's stubbed `crypto.supports/1 -> []` made `:ssl.versions/0`
raise — every HTTPS connect crashed. Meanwhile the current OTP tarballs *do*
ship real `crypto-5.9` + `ssl-11.7`, and the native builds already link the
crypto NIF archive and the OpenSSL static library (`crypto.a` + `libcrypto.a`)
and register the crypto NIF.
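
The shim symptom (`supports/1 -> []`) suggests a quick on-device probe. The following is a hedged diagnostic sketch, not code from this PR: `CryptoProbe`, `classify/1`, and `probe/0` are hypothetical names, and the only library call assumed is `:crypto.supports/1` with the `:ciphers` category.

```elixir
defmodule CryptoProbe do
  @moduledoc false

  # Hypothetical diagnostic (not in the PR). An empty or malformed cipher
  # list is what the md5-only shim produced; a non-empty list indicates the
  # NIF-backed :crypto is loaded.
  @spec classify(term()) :: :real | :shim
  def classify(ciphers) when is_list(ciphers) and ciphers != [], do: :real
  def classify(_other), do: :shim

  # Probes the running VM. :missing means the :crypto module could not be
  # loaded at all; any error raised while asking is treated as a shim.
  @spec probe() :: :real | :shim | :missing
  def probe do
    if Code.ensure_loaded?(:crypto) do
      try do
        classify(:crypto.supports(:ciphers))
      rescue
        _error -> :shim
      end
    else
      :missing
    end
  end
end
```

Logging `CryptoProbe.probe()` once at boot would have distinguished "shim shadowing the real beams" from "crypto not shipped at all".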

## Decision
Use the real beams. Android (`release_android.ex`) gates on
`real_crypto_available?/1` — only stub when the runtime genuinely has no
`crypto.a`; otherwise keep the OpenSSL crypto. iOS (`release.ex`) stops
compiling the shim crypto/ssl into `BEAMS_DIR` (they shadowed the real
`lib/{crypto,ssl}-*/ebin` on the prepended `-pa` path), and links
`crypto.a`/`libcrypto.a` so the NIF resolves.
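
The Android gate can be sketched in isolation. The check itself is the one this PR adds as `real_crypto_available?/1`; the wrapper module `CryptoGateSketch` and the `crypto_plan/1` helper are hypothetical and exist only to name the branch taken.

```elixir
defmodule CryptoGateSketch do
  @moduledoc false

  # Same check as the PR's real_crypto_available?/1: the runtime counts as
  # having real crypto when any erts-*/lib/crypto.a exists under otp_dir.
  @spec real_crypto_available?(Path.t()) :: boolean()
  def real_crypto_available?(otp_dir) do
    Path.wildcard(Path.join(otp_dir, "erts-*/lib/crypto.a")) != []
  end

  # Hypothetical helper: which staging branch the release would take.
  @spec crypto_plan(Path.t()) :: :keep_real | :stub
  def crypto_plan(otp_dir) do
    if real_crypto_available?(otp_dir), do: :keep_real, else: :stub
  end
end
```

Keying the gate on the static archive rather than on the `crypto-*` beams reflects that the beams alone are not enough: without `crypto.a` linked in, the NIF cannot load.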

## Consequences
- Real `verify_peer` TLS works on device; orchestra SSE + stem download connect.
- The shims are gone; `supports/1` returns real algorithms, `:ssl.versions/0`
works.
- Hard dependency: the device OTP tarball must ship `crypto.a` and the real
`crypto`/`ssl` beams. If a `--without-ssl` tarball ever reappears, Android
falls back to the stub automatically (`real_crypto_available?/1` returns
`false`); iOS would need the shim path restored.
- Apps that only need loopback HTTP are unaffected (the real beams still load).
29 changes: 29 additions & 0 deletions decisions/2026-05-26-elixir-version-skew-warning.md
@@ -0,0 +1,29 @@
# Warn (don't fail) on build vs device-runtime Elixir minor-version skew

- Date: 2026-05-26
- Status: accepted

## Context
`.tool-versions` pinned Elixir 1.20.0-rc.5, but the device OTP tarball shipped
1.19.5. `x in list` compiles to `Enum.__in__/2` under 1.20, which 1.19.5 lacks,
so `Ecto.Migrator` hit `:undef` at boot — a black screen with no error message.
It cost hours to trace because nothing surfaced the mismatch; the build happily
produced an artifact that couldn't run. `mob_dev` already knows both versions at
build time: `System.version()` (the compiling Elixir) and the `elixir.app` vsn
inside the cached OTP tarball.
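
Reading the bundled version needs nothing more than a regex over the `.app` file. The sketch below applies the same pattern the PR uses in `bundled_elixir_version/1`, but to a string instead of a file so it can run standalone; `ElixirAppVsn`, `parse/1`, and the sample `.app` content in the usage note are illustrative assumptions.

```elixir
defmodule ElixirAppVsn do
  @moduledoc false

  # Extracts the vsn string from the text of an elixir.app file, or returns
  # nil when no {vsn, "..."} entry is present.
  @spec parse(String.t()) :: String.t() | nil
  def parse(content) when is_binary(content) do
    case Regex.run(~r/\{vsn,\s*"([^"]+)"\}/, content) do
      [_full, vsn] -> vsn
      _no_match -> nil
    end
  end
end
```

For example, `ElixirAppVsn.parse(~s|{application,elixir,[{vsn,"1.19.5"}]}.|)` returns `"1.19.5"`, which is the value to compare against `System.version()`.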

## Decision
At OTP-dir resolution (`OtpDownloader.ensure/2`), compare the two at
**major.minor** granularity and print a loud stderr warning on mismatch —
**warn, not fail**. The pure comparison (`elixir_skew/2`) and the tarball reader
(`bundled_elixir_version/1`) are public + tested. rc/patch differences within a
minor (1.20.0-rc.5 vs 1.20.0, 1.19.5 vs 1.19.6) are beam-compatible and don't
warn.
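
A worked example of the major.minor rule, using the versions from the Context section. The comparison logic is the one this PR adds as `elixir_skew/2`; it is wrapped here in a hypothetical `SkewSketch` module so it runs on its own.

```elixir
defmodule SkewSketch do
  @moduledoc false

  # nil means the bundled version could not be read, so there is nothing to
  # compare and no warning is raised.
  @spec elixir_skew(String.t(), String.t() | nil) ::
          :ok | {:skew, String.t(), String.t()}
  def elixir_skew(_build, nil), do: :ok

  def elixir_skew(build, bundled) do
    if major_minor(build) == major_minor(bundled),
      do: :ok,
      else: {:skew, build, bundled}
  end

  # "1.20.0-rc.5" -> ["1", "20"]; "1.19.5" -> ["1", "19"].
  defp major_minor(vsn), do: vsn |> String.split(".") |> Enum.take(2)
end

SkewSketch.elixir_skew("1.20.0-rc.5", "1.19.5")
# => {:skew, "1.20.0-rc.5", "1.19.5"}   (minor differs: warn)
SkewSketch.elixir_skew("1.20.0-rc.5", "1.20.0")
# => :ok                                 (rc vs release within 1.20)
SkewSketch.elixir_skew("1.19.5", "1.19.6")
# => :ok                                 (patch-only difference)
```

Splitting on `.` and keeping two fields sidesteps pre-release parsing entirely, which is why rc tags never trigger the warning.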

## Consequences
- The black-screen class of failure now announces itself in one build line.
- Warn over fail is deliberate: rc/patch toolchain transitions are routine and a
hard fail would block legitimate builds; the cost is that a warning can be
scrolled past (acceptable vs. blocking).
- Fires on every build while a skew persists — the nudge to align
`.tool-versions` with the tarball (or rebuild the tarball).
88 changes: 76 additions & 12 deletions lib/mob_dev/otp_downloader.ex
@@ -52,23 +52,87 @@ defmodule MobDev.OtpDownloader do
@spec ios_device_otp_dir() :: String.t()
def ios_device_otp_dir, do: cache_dir(@ios_device_name)

@doc """
Warns (does not fail) when the build's Elixir minor version differs from the
Elixir bundled in the device OTP runtime at `otp_dir`.

This is the `Enum.__in__/2` class of breakage: the `in` operator and other
macros expand differently per Elixir minor, so beams compiled by 1.20 call
functions a 1.19.5 runtime lacks → `:undef` at boot → black screen with no
obvious cause. Warning (not failing) is deliberate: rc/patch transitions are
common and usually fine, and a hard fail would block legitimate builds — but
a loud warning would have turned that debugging saga into one line.
"""
@spec warn_on_elixir_skew(String.t()) :: :ok
def warn_on_elixir_skew(otp_dir) do
case elixir_skew(System.version(), bundled_elixir_version(otp_dir)) do
:ok ->
:ok

{:skew, build, bundled} ->
IO.puts(:stderr, [
IO.ANSI.yellow(),
"""
⚠ Elixir version skew: building with #{build}, but the device OTP runtime ships #{bundled}.
Beams compiled here may not load on device (e.g. `x in list` compiles to
Enum.__in__/2 under 1.20, which #{bundled} lacks → :undef at boot).
Fix: align .tool-versions to #{bundled}, or rebuild the OTP tarball with #{build}.\
""",
IO.ANSI.reset()
])

:ok
end
end

@doc false
@spec elixir_skew(String.t(), String.t() | nil) :: :ok | {:skew, String.t(), String.t()}
def elixir_skew(_build, nil), do: :ok

def elixir_skew(build, bundled) do
if major_minor(build) == major_minor(bundled),
do: :ok,
else: {:skew, build, bundled}
end

@doc "Reads the Elixir vsn from `otp_dir/lib/elixir/ebin/elixir.app`, or nil."
@spec bundled_elixir_version(String.t()) :: String.t() | nil
def bundled_elixir_version(otp_dir) do
app = Path.join([otp_dir, "lib", "elixir", "ebin", "elixir.app"])

with {:ok, content} <- File.read(app),
[_, vsn] <- Regex.run(~r/\{vsn,\s*"([^"]+)"\}/, content) do
vsn
else
_ -> nil
end
end

# ── Private ──────────────────────────────────────────────────────────────────

# major.minor as a 2-element list, dropping any -pre/+build on the patch.
# "1.20.0-rc.5" -> ["1", "20"]; "1.19.5" -> ["1", "19"].
defp major_minor(vsn), do: vsn |> String.split(".") |> Enum.take(2)

defp ensure(name, tarball) do
dir = cache_dir(name)

if valid_otp_dir?(dir, name) do
{:ok, dir}
else
# Remove stale/incomplete directory before re-downloading.
# Two cases here:
# 1. previous download attempt failed mid-extraction (Nix curl, flaky net)
# 2. cached tarball predates a schema change — e.g. iOS device tarball
# now ships EPMD source under `erts/epmd/src/`. Re-download picks up
# the new asset at the same URL (same OTP hash, new revision uploaded).
if File.dir?(dir), do: File.rm_rf!(dir)
download_and_extract(name, tarball, dir)
end
result =
if valid_otp_dir?(dir, name) do
{:ok, dir}
else
# Remove stale/incomplete directory before re-downloading.
# Two cases here:
# 1. previous download attempt failed mid-extraction (Nix curl, flaky net)
# 2. cached tarball predates a schema change — e.g. iOS device tarball
# now ships EPMD source under `erts/epmd/src/`. Re-download picks up
# the new asset at the same URL (same OTP hash, new revision uploaded).
if File.dir?(dir), do: File.rm_rf!(dir)
download_and_extract(name, tarball, dir)
end

with {:ok, otp_dir} <- result, do: warn_on_elixir_skew(otp_dir)
result
end

# A valid extracted OTP dir must contain at least one erts-* subdirectory.
107 changes: 32 additions & 75 deletions lib/mob_dev/release.ex
@@ -382,6 +382,8 @@ defmodule MobDev.Release do
$OTP_ROOT/$ERTS_VSN/lib/libepcre.a
$OTP_ROOT/$ERTS_VSN/lib/libryu.a
$OTP_ROOT/$ERTS_VSN/lib/asn1rt_nif.a
$OTP_ROOT/$ERTS_VSN/lib/crypto.a
$OTP_ROOT/$ERTS_VSN/lib/libcrypto.a
"

echo "=== Compiling Erlang/Elixir ==="
@@ -420,77 +422,16 @@ defmodule MobDev.Release do
rm -rf "$BUILD_DIR_TMP"
fi

# Crypto + SSL shims (same as build_device.sh — see build_device.sh comments)
CRYPTO_TMP=$(mktemp -d)
cat > "$CRYPTO_TMP/crypto.erl" << 'ERLEOF'
-module(crypto).
-behaviour(application).
-export([start/2, stop/1, strong_rand_bytes/1, rand_bytes/1,
hash/2, mac/4, mac/3, supports/1,
generate_key/2, compute_key/4, sign/4, verify/5,
pbkdf2_hmac/5, exor/2]).
start(_Type, _Args) -> {ok, self()}.
stop(_State) -> ok.
strong_rand_bytes(N) -> rand:bytes(N).
rand_bytes(N) -> rand:bytes(N).
hash(_Type, Data) -> erlang:md5(iolist_to_binary(Data)).
supports(_Type) -> [].
generate_key(_Alg, _Params) -> {<<>>, <<>>}.
compute_key(_Alg, _OtherKey, _MyKey, _Params) -> <<>>.
sign(_Alg, _DigestType, _Msg, _Key) -> <<>>.
verify(_Alg, _DigestType, _Msg, _Signature, _Key) -> true.
mac(hmac, _HashAlg, Key, Data) ->
hmac_md5(iolist_to_binary(Key), iolist_to_binary(Data));
mac(_Type, _SubType, _Key, _Data) -> <<>>.
mac(_Type, _Key, _Data) -> <<>>.
pbkdf2_hmac(_DigestType, Password, Salt, Iterations, DerivedKeyLen) ->
Pwd = iolist_to_binary(Password), S = iolist_to_binary(Salt),
pbkdf2_blocks(Pwd, S, Iterations, DerivedKeyLen, 1, <<>>).
pbkdf2_blocks(_Pwd, _Salt, _Iter, Len, _Block, Acc) when byte_size(Acc) >= Len ->
binary:part(Acc, 0, Len);
pbkdf2_blocks(Pwd, Salt, Iter, Len, Block, Acc) ->
U1 = hmac_md5(Pwd, <<Salt/binary, Block:32/unsigned-big-integer>>),
Ux = pbkdf2_iterate(Pwd, Iter - 1, U1, U1),
pbkdf2_blocks(Pwd, Salt, Iter, Len, Block + 1, <<Acc/binary, Ux/binary>>).
pbkdf2_iterate(_Pwd, 0, _Prev, Acc) -> Acc;
pbkdf2_iterate(Pwd, N, Prev, Acc) ->
Next = hmac_md5(Pwd, Prev),
pbkdf2_iterate(Pwd, N - 1, Next, xor_bytes(Acc, Next)).
hmac_md5(Key0, Data) ->
BlockSize = 64,
Key = if byte_size(Key0) > BlockSize -> erlang:md5(Key0); true -> Key0 end,
PadLen = BlockSize - byte_size(Key),
K = <<Key/binary, 0:(PadLen * 8)>>,
IPad = xor_bytes(K, binary:copy(<<16#36>>, BlockSize)),
OPad = xor_bytes(K, binary:copy(<<16#5C>>, BlockSize)),
erlang:md5(<<OPad/binary, (erlang:md5(<<IPad/binary, Data/binary>>))/binary>>).
exor(A, B) -> xor_bytes(iolist_to_binary(A), iolist_to_binary(B)).
xor_bytes(A, B) -> xor_bytes(A, B, []).
xor_bytes(<<X, Ra/binary>>, <<Y, Rb/binary>>, Acc) ->
xor_bytes(Ra, Rb, [X bxor Y | Acc]);
xor_bytes(<<>>, <<>>, Acc) -> list_to_binary(lists:reverse(Acc)).
ERLEOF
erlc -o "$BEAMS_DIR" "$CRYPTO_TMP/crypto.erl"
cat > "$BEAMS_DIR/crypto.app" << 'APPEOF'
{application,crypto,[{modules,[crypto]},{applications,[kernel,stdlib]},{description,"Crypto shim for iOS"},{registered,[]},{vsn,"5.6"},{mod,{crypto,[]}}]}.
APPEOF
rm -rf "$CRYPTO_TMP"

SSL_TMP=$(mktemp -d)
cat > "$SSL_TMP/ssl.erl" << 'SSLEOF'
-module(ssl).
-behaviour(application).
-export([start/2, stop/1, start/0, stop/0]).
start(_Type, _Args) -> Pid = spawn(fun() -> receive stop -> ok end end), {ok, Pid}.
stop(_State) -> ok.
start() -> ok.
stop() -> ok.
SSLEOF
erlc -o "$BEAMS_DIR" "$SSL_TMP/ssl.erl"
cat > "$BEAMS_DIR/ssl.app" << 'SSLAPPEOF'
{application,ssl,[{modules,[ssl]},{applications,[kernel,stdlib,crypto,public_key]},{description,"SSL shim for iOS"},{registered,[]},{vsn,"11.2"},{mod,{ssl,[]}}]}.
SSLAPPEOF
rm -rf "$SSL_TMP"
# Real crypto + ssl (no shims). The iOS OTP cache ships crypto-5.9 and
# ssl-11.7 (NOT in the slim-strip list below) and the crypto NIF is
# statically linked via crypto.a, so the real beams work on device. The
# old md5-only crypto shim + no-op ssl shim used to be compiled into
# BEAMS_DIR, where (being on the prepended -pa path) they SHADOWED the
# real beams in lib/{crypto,ssl}-*/ebin. That broke TLS: real ssl needs
# ciphers crypto can't provide, and the ssl shim didn't even export
# versions/0 — so Mint hit `:ssl.versions/0 undefined`, every HTTPS
# request crashed, and the orchestra SSE never connected on device.
# Removing the shims lets the real, NIF-backed crypto + ssl load.

echo "=== Copying Elixir stdlib ==="
mkdir -p "$OTP_ROOT/lib/elixir/ebin" "$OTP_ROOT/lib/logger/ebin"
@@ -559,8 +500,7 @@ defmodule MobDev.Release do
-I "$MOB_DIR/ios" \
-parse-as-library -wmo \
-O \
"$MOB_DIR/ios/MobViewModel.swift" \
"$MOB_DIR/ios/MobRootView.swift" \
"$MOB_DIR"/ios/*.swift \
-c -o "$BUILD_DIR/swift_mob.o"

# MOB_RELEASE on mob_nif.m strips the test harness (synthetic-input
@@ -580,8 +520,10 @@ defmodule MobDev.Release do

SQLITE_FLAG=""
[ -n "$SQLITE_STATIC_LIB" ] && SQLITE_FLAG="-DMOB_STATIC_SQLITE_NIF"
# driver_tab now lives in priv/generated (per-app, regenerated via
# `mix mob.regen_driver_tab --format c`), not $MOB_DIR/ios.
$CC $IFLAGS $SQLITE_FLAG \
-c "$MOB_DIR/ios/driver_tab_ios.c" -o "$BUILD_DIR/driver_tab_ios.o"
-c "priv/generated/driver_tab_ios.c" -o "$BUILD_DIR/driver_tab_ios.o"

$CC -fobjc-arc -fmodules $IFLAGS \
-I "$BUILD_DIR" \
@@ -590,6 +532,14 @@ defmodule MobDev.Release do
$CC -fobjc-arc -fmodules $IFLAGS \
-c ios/beam_main.m -o "$BUILD_DIR/beam_main.o"

# erl_errno_id stub: BEAM's erl_posix_str.o references
# erl_errno_id_unknown but the bundled OTP doesn't define it. Weak so
# an OTP-internal definition wins if one ever appears. Written with
# printf (not a heredoc) to stay cleanly indentable inside this
# Elixir \""" string.
printf '%s\\n' '__attribute__((weak)) const char *erl_errno_id_unknown(int error) { (void)error; return "unknown"; }' > "$BUILD_DIR/erl_errno_id_compat.c"
$CC $IFLAGS -c "$BUILD_DIR/erl_errno_id_compat.c" -o "$BUILD_DIR/erl_errno_id_compat.o"

echo "=== Linking $APP_NAME (release, no EPMD) ==="
xcrun -sdk iphoneos swiftc \
-target arm64-apple-ios17.0 \
@@ -600,6 +550,7 @@ defmodule MobDev.Release do
"$BUILD_DIR/mob_beam.o" \
"$BUILD_DIR/AppDelegate.o" \
"$BUILD_DIR/beam_main.o" \
"$BUILD_DIR/erl_errno_id_compat.o" \
$LIBS \
"$SQLITE_STATIC_LIB" \
-lz -lc++ -lpthread \
@@ -744,11 +695,17 @@ defmodule MobDev.Release do
echo "=== Slim strip pass ==="

slim_step prefix_libs bash -c '
# Note: compiler intentionally kept — Ecto.Migrator compiles
# .exs migration files at runtime via Code.compile_file, which
# requires the :compiler OTP app. Stripping it lands a
# `{:badmatch, {:error, :enoent, :"compiler.app"}}` deep in
# application_controller during app boot, so the BEAM never
# reaches the first screen.
for prefix in megaco runtime_tools erl_interface os_mon wx et eunit \
observer debugger diameter edoc tools snmp dialyzer \
syntax_tools parsetools xmerl reltool inets ftp tftp \
common_test mnesia eldap odbc \
compiler ssh; do
ssh; do
rm -rf "'"$OTP_BUNDLE"'/lib/$prefix-"*
done
'
27 changes: 25 additions & 2 deletions lib/mob_dev/release_android.ex
@@ -61,15 +61,38 @@ defmodule MobDev.ReleaseAndroid do
add_app_beams!(staging, app_name)
add_app_priv!(staging, app_name)
add_exqlite!(staging)
patch_crypto_deps!(staging)
add_crypto_stub!(staging, app_name)

# Only stub :crypto when the OTP runtime genuinely lacks the
# OpenSSL NIF. When crypto.a is present (the Android CMakeLists.txt
# statically links crypto.a + libcrypto.a and registers
# crypto_nif_init in the driver table), the real :crypto works —
# stubbing it replaces crypto.beam with one whose supports/1
# returns [], making :ssl.versions/0 raise and breaking every
# HTTPS request (TLS handshake never starts).
if real_crypto_available?(otp_dir) do
log(" real crypto.a present — keeping OpenSSL crypto (no stub)")
else
patch_crypto_deps!(staging)
add_crypto_stub!(staging, app_name)
end

{:ok, staging}

{out, _} ->
{:error, "Failed to copy OTP tree: #{out}"}
end
end

# True when the OTP runtime ships the real OpenSSL crypto NIF static
# archive (crypto.a). The Android native build links it into the app
# .so, so the BEAM has working :crypto and must not get the stub.
# Public for testing.
@doc false
@spec real_crypto_available?(Path.t()) :: boolean()
def real_crypto_available?(otp_dir) do
Path.wildcard(Path.join(otp_dir, "erts-*/lib/crypto.a")) != []
end

# Flatten all runtime BEAMs (app + deps) into {staging}/{app_name}/.
# This mirrors how the deployer stages BEAMs for adb push:
# all dirs are copied into one flat directory on the -pa code path.