Coldwire — a 100% on-device private alpha agent

Your watchlist, open positions and strategy never leave your machine. Coldwire reads your private trader notes and produces a structured signal report using local inference + RAG via @qvac/sdk. Zero cloud. $0 inference. Works offline.

Built for the Tether QVAC "Unleash Edge AI" hackathon. Apache-2.0.

Live showcase (static snapshot of a real on-device run): coldwire-edge-ai.vercel.app — this page only displays a report generated 100% on-device; no inference runs on it.

Why edge?

A crypto trader's edge is their private data — positions, sizing, strategy, the rules they actually trade by. Sending that to a cloud LLM leaks the one thing that matters. Coldwire keeps all of it on the device:

Private — positions/strategy are embedded and reasoned over locally; nothing is uploaded.
Offline — after a one-time model download, it runs with the network off (airplane mode).
$0 / no lock-in — no API bills, no keys, no vendor. You own the whole stack.

Hard guarantees (verifiable)

All AI inference runs through @qvac/sdk on this device. Embeddings (RAG) and text generation both execute on the local QVAC runtime.
No cloud inference, no telemetry, no remote calls at runtime. By default, the only network use is the SDK's one-time model download; the optional --delegate mode additionally uses a P2P DHT to reach a peer you trust. Enforced by a CI guard: npm run check:no-cloud.
Open source (Apache-2.0), reproducible from a clean clone, Node ≥ 22.17, TypeScript, ESM.
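
The check:no-cloud guard itself is not reproduced in this README. As a rough sketch of the idea only, assuming a hypothetical denylist of hosted-inference hostnames and a hypothetical findCloudEndpoints helper (neither is taken from the repo), a scan over source text could look like this:

```typescript
// Hypothetical denylist; the real guard's list and matching rules may differ.
const CLOUD_HOSTS: readonly string[] = [
  "api.openai.com",
  "api.anthropic.com",
  "generativelanguage.googleapis.com",
];

interface EndpointHit {
  file: string;
  line: number; // 1-based line number within the file
  host: string;
}

// Scan in-memory sources (path -> contents) for any denylisted hostname.
function findCloudEndpoints(sources: Record<string, string>): EndpointHit[] {
  const hits: EndpointHit[] = [];
  for (const [file, text] of Object.entries(sources)) {
    text.split(/\r?\n/).forEach((content, index) => {
      for (const host of CLOUD_HOSTS) {
        if (content.includes(host)) hits.push({ file, line: index + 1, host });
      }
    });
  }
  return hits;
}

// Made-up file contents, for illustration only.
const endpointHits = findCloudEndpoints({
  "src/ok.ts": 'const mode = "local";',
  "src/bad.ts": 'const a = 1;\nfetch("https://api.openai.com/v1/chat");',
});
console.log(endpointHits);
```

A CI job built on this pattern would exit non-zero whenever the returned list is non-empty.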

What it does (pipeline)

 data/sample/*.md  ──►  GTE-large embeddings  ──►  local RAG (HyperDB workspace)
 (watchlist,             (@qvac/sdk, on-device)        │
  positions,                                           ▼
  strategy)                              per-asset retrieve  +  exact note section
                                                       │
                                                       ▼
                              Llama 3.2 1B (JSON-schema-constrained completion)
                                                       │
                                          deterministic bias-grounding
                                                       ▼
                          structured Signal Report  +  on-device proof
                       (per asset: stance · conviction · thesis · risk · action)

Every embedding and every completion is a local @qvac/sdk call. RAG retrieval grounds each signal in your real notes; a small deterministic layer guarantees the agent never contradicts your own stated Bias: line.
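
To make the bias-grounding step concrete, here is a minimal sketch of one way such a layer could work. The stance and bias vocabularies, the allowed-stance mapping, and the groundStance name are all assumptions for illustration; the README does not specify the actual rules in src/core.

```typescript
// Hypothetical vocabularies; the project's real labels may differ.
type Stance = "accumulate" | "hold" | "reduce" | "avoid";
type Bias = "bullish" | "neutral" | "bearish" | "avoid";

// Assumed mapping from a stated bias to the stances that do not contradict it.
const ALLOWED_STANCES: Record<Bias, readonly Stance[]> = {
  bullish: ["accumulate", "hold"],
  neutral: ["accumulate", "hold", "reduce"],
  bearish: ["hold", "reduce", "avoid"],
  avoid: ["avoid"],
};

// Assumed stance to substitute when the model contradicts the bias.
const FALLBACK_STANCE: Record<Bias, Stance> = {
  bullish: "hold",
  neutral: "hold",
  bearish: "reduce",
  avoid: "avoid",
};

// Deterministic post-processing: keep the model's stance only if the bias allows it.
function groundStance(
  bias: Bias,
  modelStance: Stance,
): { stance: Stance; overridden: boolean } {
  if (ALLOWED_STANCES[bias].includes(modelStance)) {
    return { stance: modelStance, overridden: false };
  }
  return { stance: FALLBACK_STANCE[bias], overridden: true };
}

console.log(groundStance("avoid", "accumulate")); // stance is forced to "avoid"
console.log(groundStance("bullish", "hold")); // stance is left unchanged
```

Because the mapping is a plain lookup, the same bias and model output always yield the same stance, independent of sampling in the LLM.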

Reproduce from zero

Requirements: Node ≥ 22.17, npm ≥ 10.9. (Tested on Node v24.14.1.)

git clone https://github.com/tang-vu/coldwire.git && cd coldwire
npm install

# 1) Smoke test — confirms the toolchain + on-device model load + streaming
npm run smoke

# 2) Run the agent on the synthetic sample data (downloads models on first run)
npm run coldwire

# 3) Or use the web UI (zero external deps) — open http://127.0.0.1:8787
npm run serve

First run downloads two models (~1.4 GB total) into ~/.qvac/models and checksum-validates them; subsequent runs are fully offline and take ~60s.

Useful flags:

npm run coldwire -- --data data/sample   # point at your own private docs dir
npm run coldwire -- --json               # raw JSON SignalReport
npm run check:no-cloud                    # prove there are no cloud endpoints
npm run typecheck                         # tsc --noEmit
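
If you consume the --json output from another script, it is worth validating it before use. The exact SignalReport schema is not shown in this README, so the sketch below assumes a hypothetical shape: a top-level signals array whose items carry the per-asset fields named in the pipeline (stance, conviction, thesis, risk, action) plus an asset field. Treat every field name and type here as an assumption.

```typescript
// Hypothetical per-asset shape; field names and types are assumptions.
interface AssetSignal {
  asset: string;
  stance: string;
  conviction: number;
  thesis: string;
  risk: string;
  action: string;
}

// Runtime type guard for one signal object.
function isAssetSignal(value: unknown): value is AssetSignal {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["asset"] === "string" &&
    typeof v["stance"] === "string" &&
    typeof v["conviction"] === "number" &&
    typeof v["thesis"] === "string" &&
    typeof v["risk"] === "string" &&
    typeof v["action"] === "string"
  );
}

// Parse a JSON string and keep only the well-formed signals.
function parseSignals(json: string): AssetSignal[] {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("report is not a JSON object");
  }
  const list = (parsed as Record<string, unknown>)["signals"];
  if (!Array.isArray(list)) throw new Error("report has no signals array");
  return list.filter(isAssetSignal);
}

// Made-up report text, for illustration only.
const sampleReportJson = JSON.stringify({
  signals: [
    { asset: "DOGE", stance: "avoid", conviction: 1, thesis: "t", risk: "r", action: "a" },
    { asset: "BROKEN" },
  ],
});
console.log(parseSignals(sampleReportJson).map((s) => s.asset));
```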

Models (how they're fetched)

| Role | Constant (`@qvac/sdk`) | File | Size |
| --- | --- | --- | --- |
| Reasoning LLM | `LLAMA_3_2_1B_INST_Q4_0` | `Llama-3.2-1B-Instruct-Q4_0.gguf` | ~773 MB |
| Embeddings (RAG) | `GTE_LARGE_FP16` (1024-d) | GTE-large fp16 GGUF | ~670 MB |

Models are downloaded by the SDK from the QVAC model registry to ~/.qvac/models on first use and sha256 checksum-validated (you'll see Checksum validated in the logs). No model weights are committed to this repo.
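
The SDK performs this validation for you. If you want to check a downloaded file independently, a minimal sketch using Node's built-in crypto module is shown below. The sha256Hex and verifyChecksum helpers are hypothetical names, and the expected digest for a real model would have to come from the registry, which this README does not list.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of a byte buffer.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Compare the computed digest with an expected hex string (case-insensitive).
function verifyChecksum(data: Uint8Array, expectedHex: string): boolean {
  return sha256Hex(data) === expectedHex.trim().toLowerCase();
}

// Demo with the standard SHA-256 test vector for the ASCII string "abc".
const abcBytes = new TextEncoder().encode("abc");
const abcDigest = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";
console.log(verifyChecksum(abcBytes, abcDigest));
```

For multi-hundred-megabyte model files, feeding a read stream into the hash in chunks avoids loading the whole file into memory.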

On-device proof (how to verify it's local)

Coldwire makes "it ran locally" auditable:

Per-call metrics — each run prints device=gpu|cpu and tokens/sec for every inference call (from the SDK profiler). These figures are printed locally, not transmitted. A cloud call has no local backend device to report.
Profiler export — a full QVAC profiler JSON is written to proof/coldwire-proof-<timestamp>.json every run.
Airplane-mode test — after the first model download, disable your network and run npm run coldwire again. It still generates signals. (This is the strongest proof; see SUBMISSION_CHECKLIST.md.)

Example proof block (real output on the test machine):

ON-DEVICE PROOF
host:        DESKTOP-1A6OPC9
llm:         LLAMA_3_2_1B_INST_Q4_0
embeddings:  GTE_LARGE_FP16
  • signal:SOL    device=gpu  40.5 tok/s  117 gen
  • portfolio     device=gpu  48.4 tok/s  92 gen
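
If you want to check a proof block in a script, the per-call lines are regular enough to parse. The sketch below assumes the line layout shown in the example above (label, device=, tok/s, gen) and uses a hypothetical parseProofLines helper; the CLI's formatting may change, so treat the pattern as illustrative.

```typescript
interface ProofCall {
  label: string;
  device: "gpu" | "cpu";
  tokPerSec: number;
  generated: number;
}

// Matches lines such as: "  • signal:SOL    device=gpu  40.5 tok/s  117 gen"
const PROOF_LINE =
  /^\s*•\s+(\S+)\s+device=(gpu|cpu)\s+(\d+(?:\.\d+)?)\s+tok\/s\s+(\d+)\s+gen\s*$/;

// Extract every per-call line from a proof block; other lines are ignored.
function parseProofLines(block: string): ProofCall[] {
  const calls: ProofCall[] = [];
  for (const line of block.split(/\r?\n/)) {
    const m = PROOF_LINE.exec(line);
    if (!m) continue;
    calls.push({
      label: m[1],
      device: m[2] as "gpu" | "cpu",
      tokPerSec: Number(m[3]),
      generated: Number(m[4]),
    });
  }
  return calls;
}

const proofBlock = [
  "ON-DEVICE PROOF",
  "llm:         LLAMA_3_2_1B_INST_Q4_0",
  "  • signal:SOL    device=gpu  40.5 tok/s  117 gen",
  "  • portfolio     device=gpu  48.4 tok/s  92 gen",
].join("\n");
const proofCalls = parseProofLines(proofBlock);
console.log(proofCalls);
```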

Remote network use (full list)

The app makes no cloud inference calls. The only network activity is:

Model download (one-time): @qvac/sdk fetches the model weights from the QVAC model registry on first run into ~/.qvac/models, and sha256-validates them.
P2P DHT (optional): only when you pass --delegate, the Holepunch/Hyperswarm DHT is used to reach a trusted peer. Off by default.

No third-party LLM/embedding APIs are used anywhere. Enforced by npm run check:no-cloud.

Audit log

Every run writes a structured audit log to proof/coldwire-proof-<timestamp>.json, capturing:

Model lifecycle — load/unload events per model, with timings (ms).
Per inference call — prompt preview + prompt chars/tokens, generated tokens, TTFT (ms), tokens/sec, and the backend device (cpu/gpu).
The full QVAC profiler export.

A sample from one real demo run is committed at docs/sample-on-device-run.json.
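
For a quick sanity check of an audit log, you can summarize its per-call records. The record shape below is a guess based on the fields listed above, not the actual JSON schema, so every field name is an assumption; check docs/sample-on-device-run.json for the real layout. The TTFT values in the demo are placeholders.

```typescript
// Hypothetical per-call record; field names are assumptions, not the real log schema.
interface CallRecord {
  label: string;
  device: "cpu" | "gpu";
  generatedTokens: number;
  ttftMs: number;
  tokensPerSec: number;
}

interface RunSummary {
  calls: number;
  totalGenerated: number;
  slowestTtftMs: number;
  devices: string[];
}

// Aggregate the per-call records of one run.
function summarizeCalls(records: CallRecord[]): RunSummary {
  let totalGenerated = 0;
  let slowestTtftMs = 0;
  const devices = new Set<string>();
  for (const r of records) {
    totalGenerated += r.generatedTokens;
    slowestTtftMs = Math.max(slowestTtftMs, r.ttftMs);
    devices.add(r.device);
  }
  return {
    calls: records.length,
    totalGenerated,
    slowestTtftMs,
    devices: Array.from(devices).sort(),
  };
}

// tokensPerSec and generatedTokens mirror the example proof block; ttftMs is a placeholder.
const runSummary = summarizeCalls([
  { label: "signal:SOL", device: "gpu", generatedTokens: 117, ttftMs: 300, tokensPerSec: 40.5 },
  { label: "portfolio", device: "gpu", generatedTokens: 92, ttftMs: 250, tokensPerSec: 48.4 },
]);
console.log(runSummary);
```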

Sample-data walkthrough

data/sample/ ships synthetic (fake) trader data so the repo runs with no secrets:

watchlist.md — assets with a ## TICKER — Name heading + a Bias: line.
positions.md — open positions, sizes, weights, stops.
strategy.md — the trader's rules + conviction framework.
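
The watchlist convention is simple enough to parse line by line. The sketch below is not the project's parser; it assumes each asset starts with a ## TICKER — Name heading and that the first following Bias: line belongs to it. The parseWatchlist name and the sample text are made up for illustration.

```typescript
interface WatchlistEntry {
  ticker: string;
  name: string;
  bias: string | null; // null when no Bias: line follows the heading
}

// "## SOL — Solana" (an em dash or a plain hyphen is accepted as the separator)
const HEADING_LINE = /^##\s+([A-Z0-9]+)\s+[—-]\s+(.+?)\s*$/;
const BIAS_LINE = /^\s*Bias:\s*(.+?)\s*$/i;

function parseWatchlist(markdown: string): WatchlistEntry[] {
  const entries: WatchlistEntry[] = [];
  let current: WatchlistEntry | null = null;
  for (const line of markdown.split(/\r?\n/)) {
    const heading = HEADING_LINE.exec(line);
    if (heading) {
      current = { ticker: heading[1], name: heading[2], bias: null };
      entries.push(current);
      continue;
    }
    const bias = BIAS_LINE.exec(line);
    // Only the first Bias: line under a heading is used.
    if (bias && current && current.bias === null) current.bias = bias[1];
  }
  return entries;
}

const watchlistText = [
  "## SOL — Solana",
  "Bias: bullish on dips",
  "",
  "## DOGE — Dogecoin",
  "Some free-form notes.",
  "Bias: avoid",
].join("\n");
const watchlistEntries = parseWatchlist(watchlistText);
console.log(watchlistEntries);
```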

On this data Coldwire correctly surfaces real rule breaches, e.g. "SOL is 36% of NAV, above my 25% single-alt cap" and "total alt exposure 53.7% > 50% cap", and emits stances consistent with each asset's stated bias (e.g. DOGE → avoid).
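
The breaches quoted above come down to simple arithmetic over position weights. In Coldwire the LLM reasons over retrieved notes rather than running a fixed rule engine, so the following is only a deterministic cross-check you could run yourself. The findBreaches helper, the isAlt flag, and every weight except SOL's 36% are illustrative placeholders chosen so the totals match the quoted figures.

```typescript
interface Position {
  ticker: string;
  weightPct: number; // share of NAV, in percent
  isAlt: boolean; // assumed flag: counts toward the alt caps
}

interface Caps {
  singleAltPct: number;
  totalAltPct: number;
}

// Return a human-readable message for each cap that is exceeded.
function findBreaches(positions: Position[], caps: Caps): string[] {
  const breaches: string[] = [];
  let altTotal = 0;
  for (const p of positions) {
    if (!p.isAlt) continue;
    altTotal += p.weightPct;
    if (p.weightPct > caps.singleAltPct) {
      breaches.push(
        `${p.ticker} is ${p.weightPct}% of NAV, above the ${caps.singleAltPct}% single-alt cap`,
      );
    }
  }
  if (altTotal > caps.totalAltPct) {
    breaches.push(
      `total alt exposure ${altTotal.toFixed(1)}% > ${caps.totalAltPct}% cap`,
    );
  }
  return breaches;
}

// Placeholder book: only the SOL weight is taken from the README.
const capBreaches = findBreaches(
  [
    { ticker: "BTC", weightPct: 46.3, isAlt: false },
    { ticker: "SOL", weightPct: 36, isAlt: true },
    { ticker: "ALT_B", weightPct: 17.7, isAlt: true },
  ],
  { singleAltPct: 25, totalAltPct: 50 },
);
console.log(capBreaches);
```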

To use your own data: drop .md/.txt files in a directory (keep the ## TICKER — Name + Bias: convention in your watchlist) and run npm run coldwire -- --data path/to/your/docs. Put real notes under data/private/ — it's git-ignored.

Optional: P2P delegated inference (feature-flagged)

A node can offload the heavy LLM completion to a trusted peer (your own second device) over Holepunch's encrypted P2P transport — no server, no cloud. Embeddings + RAG always stay local, so your private vector context never leaves the machine.

# Device B (provider) — prints a public key:
npm run provider

# Device A (consumer) — offload the LLM to that peer:
npm run coldwire -- --delegate <provider-public-key>

Core-safety: delegation always uses fallbackToLocal: true, so if the peer is unreachable Coldwire transparently runs the LLM locally. P2P can never break the core pipeline. It is entirely off unless you pass --delegate.
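
The fallback behaviour follows a common try-remote-then-local pattern. The sketch below illustrates that pattern in isolation; withLocalFallback and both demo functions are hypothetical and do not use or mirror the @qvac/sdk API, where the behaviour is selected with the fallbackToLocal option mentioned above.

```typescript
// Try the remote peer first; on any failure, run the same work locally.
async function withLocalFallback<T>(
  remote: () => Promise<T>,
  local: () => Promise<T>,
): Promise<{ value: T; ranOn: "peer" | "local" }> {
  try {
    return { value: await remote(), ranOn: "peer" };
  } catch {
    // Peer unreachable or errored: the core pipeline continues on this device.
    return { value: await local(), ranOn: "local" };
  }
}

// Stand-ins for a delegated completion and a local completion.
const unreachablePeer = async (): Promise<string> => {
  throw new Error("peer unreachable");
};
const localCompletion = async (): Promise<string> => "completion from local model";

const fallbackResult = withLocalFallback(unreachablePeer, localCompletion);
fallbackResult.then((r) => console.log(r.ranOn, "->", r.value));
```

A production version would usually add a timeout around the remote call so that a silent peer cannot stall the run.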

Project layout

src/core/    parsing · RAG · models · prompts · bias-grounding · agent · proof
src/cli/     coldwire-cli.ts (+ formatted/JSON rendering)
src/server/  zero-dep HTTP web UI with SSE live progress
src/p2p/     delegated-inference helpers + provider node (Phase 2)
scripts/     smoke-test.ts · verify-no-cloud-endpoints.ts
data/sample/ synthetic watchlist / positions / strategy

License

Apache-2.0 — see LICENSE.
