feature(ssh-tunnel): backend service and client API routes (Part 3 / Steps 1-3 + 4a) by CodeLieutenant · Pull Request #1 · CodeLieutenant/argus

CodeLieutenant · 2026-04-15T11:01:49Z

Context

This PR implements Part 3 of the SSH Tunnel design (docs/plans/ssh-tunnel-design.md) — the server-side foundation that must be in place before any client-side work (Steps 5/6) or proxy-host provisioning (Steps 4c/7b) can be tested end-to-end.

The goal of this stage: a client can POST a public key to Argus and receive proxy tunnel connection details in return; the proxy host can GET all valid public keys at any time via AuthorizedKeysCommand.

What's in this PR

Step 1 + 2 — DB models (`argus/backend/models/ssh_key.py`)

Two new CQLEngine models are added to USED_MODELS. Both are created automatically by sync_models and at test startup, so no manual CQL migration is needed:

SSHTunnelKey
Stores a client-registered public key scoped to a (user_id, tunnel_id) pair.

Inserted with a ScyllaDB TTL (default 24 h, overridable via ttl_seconds). Expired rows are auto-deleted — no cleanup job needed.
expires_at is stored as an informational timestamp so clients know when to re-register.
fingerprint is computed server-side (SHA256 of raw key bytes, SHA256:<base64> format) — the private key never touches the server.
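
For reviewers unfamiliar with the fingerprint format, here is a minimal sketch of how a `SHA256:<base64>` fingerprint can be derived from a public key line using only the standard library. The function name `openssh_sha256_fingerprint` is hypothetical and is not taken from `tunnel_service.py` (the PR mentions the cryptography library for this step). The sketch assumes that "raw key bytes" means the base64-decoded key blob and that padding is stripped the way `ssh-keygen -l` prints it.

```python
import base64
import hashlib


def openssh_sha256_fingerprint(public_key_line: str) -> str:
    """Return a SHA256:<base64> fingerprint for an OpenSSH public key line.

    Hypothetical helper for illustration; not the PR's actual implementation.
    """
    parts = public_key_line.strip().split()
    if len(parts) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")
    # The second field is the base64-encoded key blob; validate=True rejects
    # characters outside the base64 alphabet (binascii.Error is a ValueError).
    blob = base64.b64decode(parts[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # Assumption: unpadded base64, matching the way OpenSSH prints fingerprints.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")
```

Only the public key line is needed as input, which is consistent with the private key never leaving the client.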

ProxyTunnelConfig
Stores the connection details of an SSH proxy tunnel server (host, port, proxy_user, target_host, target_port, host_key_fingerprint, service_user_id, is_active).
Only one config has is_active=True at a time.
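
The single-active-config rule can be illustrated with an in-memory sketch. `ProxyConfig` and `activate_new_config` are hypothetical stand-ins; the real model is a CQLEngine class and the real logic lives in `save_proxy_tunnel_config`, which may differ in detail.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class ProxyConfig:
    """Hypothetical in-memory stand-in for a ProxyTunnelConfig row."""
    host: str
    port: int
    is_active: bool = False
    id: UUID = field(default_factory=uuid4)


def activate_new_config(existing: list[ProxyConfig], new: ProxyConfig) -> list[ProxyConfig]:
    """Deactivate every existing config, then append `new` as the only active one."""
    for config in existing:
        config.is_active = False
    new.is_active = True
    return existing + [new]
```

Deactivating before activating means there is never a moment with two active configs in this sketch, though there is a brief window with none.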

Step 3 — Backend service (`argus/backend/service/tunnel_service.py`)

TunnelService with the following methods:

| Method | Purpose |
| --- | --- |
| `register_tunnel(user, public_key, tunnel_id?, ttl_seconds?)` | Validate + fingerprint key, store `SSHTunnelKey` with TTL, return proxy connection dict + `expires_at` |
| `get_authorized_keys(tunnel_id?)` | Return all non-expired keys as OpenSSH `authorized_keys` text; called by the proxy host via `AuthorizedKeysCommand` |
| `list_keys(tunnel_id?)` | Key inventory (used by admin panel in a later PR) |
| `delete_key(key_id)` | Immediate revocation |
| `get_proxy_tunnel_config(tunnel_id?)` | Read active or specific config |
| `save_proxy_tunnel_config(payload)` | Create new config, deactivate the old active one, create a dedicated service user (`proxy-tunnel-<host>`) with a fresh API token for the proxy host |

TunnelServiceException covers all expected business errors (no active config, invalid key, missing fields, not found).

Step 4a — Client API routes (`argus/backend/controller/ssh_api.py`)

New ssh_api Blueprint registered as a sub-blueprint of client_api → final URLs under /api/v1/client/ssh/:

POST /api/v1/client/ssh/tunnel    @api_login_required

Register a public key, receive proxy connection details.
Request body: { "public_key": "ssh-ed25519 ...", "ttl_seconds": 86400, "tunnel_id": "<uuid>" }
Response: { "status": "ok", "response": { "key_id", "tunnel_id", "proxy_host", "proxy_port", "proxy_user", "target_host", "target_port", "host_key_fingerprint", "expires_at" } }
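
As a rough illustration of the request handling described above, the sketch below validates a request body and derives `expires_at` from the TTL. `parse_tunnel_request` and `DEFAULT_TTL_SECONDS` are hypothetical names, and raising `ValueError` stands in for whatever the controller actually does to produce a 400.

```python
import uuid
from datetime import datetime, timedelta, timezone

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # the PR states a 24 h default


def parse_tunnel_request(body: dict, now: datetime) -> dict:
    """Validate a POST /ssh/tunnel body (hypothetical helper, not the PR's code)."""
    public_key = body.get("public_key")
    if not isinstance(public_key, str) or not public_key.strip():
        raise ValueError("public_key is required")

    ttl_seconds = body.get("ttl_seconds", DEFAULT_TTL_SECONDS)
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(ttl_seconds, bool) or not isinstance(ttl_seconds, int) or ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be a positive integer")

    tunnel_id = body.get("tunnel_id")
    if tunnel_id is not None:
        try:
            tunnel_id = uuid.UUID(tunnel_id)
        except (ValueError, AttributeError, TypeError) as exc:
            raise ValueError("tunnel_id must be a valid UUID") from exc

    return {
        "public_key": public_key.strip(),
        "ttl_seconds": ttl_seconds,
        "tunnel_id": tunnel_id,
        # Informational timestamp telling the client when to re-register.
        "expires_at": (now + timedelta(seconds=ttl_seconds)).isoformat(),
    }
```

Passing `now` in explicitly keeps the helper deterministic, which makes the TTL arithmetic easy to assert on in tests.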

GET /api/v1/client/ssh/keys    @api_login_required

Returns all non-expired public keys as text/plain (authorized_keys format, one key per line).
Optional query param: ?tunnel_id=<uuid> to scope to a specific proxy host.
Called by the proxy host's AuthorizedKeysCommand via argus-cli ssh-keys (Step 7b).
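
The response body can be pictured with a small sketch that renders stored keys as `authorized_keys` text. `StoredKey` and `render_authorized_keys` are hypothetical names. In the PR, expiry is handled by the ScyllaDB TTL, so the explicit `expires_at` check here is only an illustrative assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional
from uuid import UUID, uuid4


@dataclass
class StoredKey:
    """Hypothetical stand-in for an SSHTunnelKey row."""
    public_key: str
    tunnel_id: UUID
    expires_at: datetime


def render_authorized_keys(
    keys: Iterable[StoredKey],
    tunnel_id: Optional[UUID] = None,
    now: Optional[datetime] = None,
) -> str:
    """Return non-expired keys, one per line, optionally scoped to one tunnel."""
    now = now or datetime.now(timezone.utc)
    lines = [
        key.public_key.strip()
        for key in keys
        if key.expires_at > now and (tunnel_id is None or key.tunnel_id == tunnel_id)
    ]
    # An empty result renders as an empty body, so sshd sees no authorized keys.
    return "\n".join(lines) + ("\n" if lines else "")
```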

Tests (`argus/backend/tests/tunnel/`)

All tests are @pytest.mark.docker_required and use the shared ScyllaDB Docker fixture from conftest.py.

test_tunnel_service.py — 14 unit tests:

register_tunnel: happy path, custom TTL, no active config raises, invalid key, missing key, explicit tunnel_id
get_authorized_keys: OpenSSH format validation, tunnel scoping
delete_key: row removed, nonexistent raises
save_proxy_tunnel_config: service user created, old config deactivated, missing fields raises
get_proxy_tunnel_config: returns active, by id, None when none active

test_ssh_api.py — 10 integration tests via Flask test client:

POST /ssh/tunnel: success, ttl_seconds, explicit tunnel_id, missing key, invalid key, malformed UUID (400), unauthenticated (403)
GET /ssh/keys: success + key in output, tunnel scoping, malformed UUID (400), unauthenticated (403), empty when no keys

No new dependencies

cryptography is already a transitive dependency via PyJWT[crypto].

How to test

# Run all tunnel tests (Docker required)
uv run pytest argus/backend/tests/tunnel/ -m docker_required -v

To exercise the API manually after the server is running:

# Register a key as a client:
curl -X POST http://localhost:5000/api/v1/client/ssh/tunnel \
  -H "Authorization: token <your_token>" \
  -H "Content-Type: application/json" \
  -d '{"public_key": "ssh-ed25519 AAAA..."}'

# Fetch authorized keys (as the proxy host would via AuthorizedKeysCommand):
curl http://localhost:5000/api/v1/client/ssh/keys \
  -H "Authorization: token <proxy_service_user_token>"
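
For scripted checks, the registration call can also be built in Python with the standard library. This is a sketch mirroring the first curl command; `build_register_request` is a hypothetical helper, and the base URL and token are placeholders.

```python
import json
import urllib.request


def build_register_request(base_url: str, token: str, public_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST /api/v1/client/ssh/tunnel request."""
    payload = json.dumps({"public_key": public_key}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/client/ssh/tunnel",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )
```

To send it against a running server, pass the result to `urllib.request.urlopen` and decode the JSON response.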

Note: a ProxyTunnelConfig row with is_active=True must exist before POST /ssh/tunnel will work. Until Step 4b (admin API) lands, such a row can be inserted directly from a Python shell.

What is NOT in this PR (follow-up)

| Step | Description |
| --- | --- |
| 4b | Admin API endpoints for proxy tunnel config + key management |
| 4c | Proxy host provisioning Jinja template |
| 4d | Admin Panel UI: `ProxyTunnelManager.svelte` |
| 5/6 | `argus-client` tunnel module + `base.py` integration |
| 7/7b | CLI `--use-tunnel` flag + `argus-cli ssh-keys` command |


Add design document for routing Argus client traffic through an SSH tunnel via a proxy host, avoiding Cloudflare HTTPS costs. Covers client-generated ephemeral keys with ScyllaDB TTL, real-time key lookup via sshd AuthorizedKeysCommand, and automated proxy host provisioning from the admin panel.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ents/nemeses commands

- Return an actionable 'unauthorized' error when the server responds with a non-JSON Content-Type (e.g. an HTML Cloudflare login page) instead of a confusing decode error, allowing LLMs and humans to self-correct
- Make --type optional on 'argus run get': the plugin type is now resolved automatically via GET /run/<run_id>/type and cached with a 24-hour TTL (run type is immutable), so subsequent calls skip the network round-trip
- Add 'argus run details' command: returns only the basic information shown on the Argus Details tab (no logs, screenshots, events, nemeses, resources or histograms); heavy data is accessible via dedicated subcommands
- Add 'argus run events' command: fetches CRITICAL and ERROR events directly from GET /client/sct/<run_id>/events/<severity>/get with --before, --after, and --limit flags; time flags accept Unix timestamps, RFC3339, or YYYY-MM-DD
- Add 'argus run nemeses' command: fetches nemesis records from the new GET /client/sct/<run_id>/nemesis/get endpoint with --before/--after filters
- Backend: add GET /client/sct/<run_id>/nemesis/get endpoint and SCTService.get_nemeses() with before/after timestamp filtering
- Backend: add 'after' query parameter support to both events endpoints (GET /events/get and GET /events/<severity>/get) and propagate it through SCTService.get_events() and SCTTestRun.get_events_limited()

…tive flag

After CF login, immediately exchange the session for a PAT via GET /api/v1/user/token and store it as the primary credential, discarding the session. This makes auth more robust in CI and across CF token expiry.

- ArgusService.Login: PAT-first fast-path; always exchanges session → PAT
- ArgusService.fetchPAT: new helper calling the user-token endpoint
- ErrFetchingToken / ErrStoringPAT: new sentinel errors
- UserToken route and UserTokenResponse model added
- root.go: extract buildAPIClientRaw; add --non-interactive persistent flag stored on context via cmdctx.WithNonInteractive
- RunWithAuthRetry: wraps command RunE; on unauthorized error either returns ErrUnauthorized (--non-interactive) or re-auths and retries once
- Auth tests updated to verify PAT is stored and session deleted post-login

…e; RunDetails view

- run get: SCTTestRun now renders as a structured two-section Details table (Run Details + System Information + Summary) mirroring the Argus web UI Details tab; events and nemesis rows show counts only, keeping AI context windows small. All other run types keep the generic KV table.
- models: add SCTEventsResponse (Tabular, caching-aware), NemesisResponse (extracted from run object, no extra API call), and RunDetails wrapper. Retain SCTRunDetails/GenericRunDetails/DriverRunDetails/SirenadaRunDetails and NemesisRecord for the run details and nemeses subcommands.
- cache/keys: rename TTLEvents→TTLSCTEvents (5 min, matching run TTL); add TTLNemesis alias; rename EventsKey→SCTEventsKey (takes severity + before-cursor so each pagination page is cached independently under sct-events/{runID}/{severity}/{cursor}); add NemesisKey alongside the existing NemesesKey so both endpoint-based and run-object-derived paths have their own cache namespace.
- api/routes: fix SCT event route prefix from /client/sct/ to /sct/ (matches actual Flask blueprint mount point); add SCTEventsCountBySeverity and SCTNemesisGet; keep SCTEventsBySeverity name used by run_nemeses_events.

- events/nemeses commands now follow a 4-step cache-first strategy: full-dataset cache hit → filter locally (no network); exact filtered cache hit → return directly; cache miss → fetch from API; store under full key (unfiltered) or filtered key (filtered)
- cache/keys: SCTEventsKey now encodes both before+after cursors; add SCTEventsFullKey, NemesesFilteredKey, and update NemesesKey to store under a 'full' sentinel sub-path
- fix all 19 linter errors (errcheck, staticcheck, unused): propagate fmt.Fprint* errors in cache.go; extract extractTarFile helper in logs.go to capture close errors; replace bare defer resp.Body.Close() with draining defers in api, auth, and cloudflared; fix Stats() nil-before-deref in cache.go; remove unused isCacheMiss and getCFToken; fix QF1011 in testid_test.go

- Suppress CF JWT from terminal output; only browser-URL lines are printed
- Add ErrPATNotFound sentinel and DeletePAT(); guard against empty PAT in LoadPAT()
- Add session fast-path in Login() to recover from failed PAT exchanges
- Pass CF token alongside session cookie in fetchPAT() for CF Access passthrough
- Add jwt.IsOlderThan() with iat-based age check; enforce 12h max CF token age
- Add ErrUnauthorized sentinel to api.DoJSON; use errors.Is in isUnauthorizedErr()
- Fix Rows() slice aliasing bug in RunDetails (out := rows[:0] → make)
- Unify SCT run details path: use RunDetails{Run: full}, delete SCTRunDetails
- Remove unused NemesisKey, TTLNemeses, TTLNemesis from cache/keys.go
- Rewrite logging: JSON file + opt-in text console with independent level filters
- Add -v/-vv/-vvv count flag wiring WithConsoleWriter to cmd.ErrOrStderr()
- Print success message to cmd.OutOrStdout() after argus auth completes
- Update logging tests: JSON file assertions, new console writer coverage

Explicitly discard return values from fmt.Fprintln/Fprintf calls and wrap deferred resp.Body.Close() in anonymous funcs to satisfy errcheck. Replace redundant runtime type assertion with idiomatic compile-time interface check to fix staticcheck S1040.

… silence usage on runtime errors

- Fix log download route: /testrun/tests/... -> /api/v1/tests/...
- Fix SCT events routes: add missing /client prefix to match actual Flask blueprint mount path
- Fix run nemeses: remove reference to non-existent GET endpoint; extract NemesisData from full run response instead
- Remove dead NemesisRecord type and SCTNemesisGet route constant; add NewNemesisResponse constructor
- run get now uses generic KVTabular full dump; run details keeps the curated sectioned view
- RunDetails.MarshalJSON emits only the fields shown in the text table (runDetailsJSON) for consistent JSON/text output
- Suppress usage output on runtime errors (API failures) by setting cmd.SilenceUsage=true at the start of every RunE; flag-parse and required-flag errors still show usage as before

…m run_details switch

Move the per-run-type details projection logic out of the inline switch in run_details.go into a RunTypeDetailsHandlers registry and a DispatchDetails helper in testrun.go, keeping it alongside the existing RunTypeHandlers. Adding support for a new plugin now only requires a single map entry in each registry. The run details command body shrinks from 47 lines to 3.


… handlers

All run, log, nemesis, event, comment and discussion commands now emit scoped zerolog entries at the appropriate level:

- Debug: entry-point with input flags/IDs, cache hit/miss, route, counts
- Info: one success summary per command with outcome fields
- Warn: non-fatal cache write failures that don't abort the operation
- Error: every error-return site, with full context fields

Also fix a latent bug in auth_token.go where log.Err(nil) was a zerolog no-op; replaced with log.Error() so the message is actually emitted.

…t 401/403 in DoJSON

Add SkipAuthRetryAnnotation to cache clear, cache info, and auth-token commands so they are excluded from the transparent re-authentication wrapper. Teach DoJSON to recognise HTTP 401 and 403 responses as ErrUnauthorized before attempting JSON decoding, enabling the auth retry logic to trigger on explicit server rejections.

Replace the single CF token keychain entry with three headless CF Access entries: cf-access-client-id, cf-access-client-secret, and cf-access-argus-token. Add StoreCFAccess, LoadCFAccess, HasCFAccess, and DeleteCFAccess functions so the CLI can persist and retrieve service-token credentials that bypass the cloudflared browser-based login flow. Includes tests for round-trip, partial storage, deletion, and HasCFAccess.

Add 'argus auth headless' which interactively prompts for three secrets (CF Access Client ID, CF Access Client Secret, Argus API Token) with masked input via golang.org/x/term and stores them in the OS keychain. This enables authentication without cloudflared or a browser. Add 'argus auth logout' which removes all stored credentials from the keychain: PAT, session cookie, and headless CF Access credentials. Both commands are registered as subcommands of 'argus auth' and carry SkipAuthRetryAnnotation.

Add Set, Get, GetAll, Keys, and IsValidKey functions to the config package so individual settings can be read and written to the config file on disk without disturbing other keys. Add 'argus config list', 'argus config get <key>', and 'argus config set <key> <value>' commands with shell completion for recognised keys (url, use_cloudflare). The auth headless command now automatically sets use_cloudflare=false in the config file after storing headless credentials.

…keychain probing

Replace all keychain.HasCFAccess() / keychain.LoadCFAccess() decision points with cfg.UseCf so the auth mode is driven by the use_cloudflare config flag rather than the presence of keychain entries.

When use_cloudflare=false (headless mode):

- buildAPIClientRaw loads CF Access headers + Argus token from keychain
- runWithAuthRetry reports an actionable error instead of launching cloudflared
- auth command directs the user to 'auth headless' or 'config set'

When use_cloudflare=true (cloudflared mode):

- buildAPIClientRaw loads PAT or session from keychain
- runWithAuthRetry triggers the full cloudflared browser login
- auth command verifies existing credentials before re-authenticating

Remove short-circuit early-returns from ArgusService.Login() so callers verify their credentials first and only call Login() when re-auth is needed. Update the corresponding test.

This lets users keep both sets of credentials in the keychain and switch between modes with 'argus config set use_cloudflare true/false'.


…(Part 3, Steps 1-3 + 4a)

Implements the server-side foundation required for the SSH tunnel feature described in docs/plans/ssh-tunnel-design.md. This is the first stage that must land before any client-side or proxy-host work can be tested end-to-end.

## What is included

### DB models (Step 1 + Step 2)

- argus/backend/models/ssh_key.py
  - SSHTunnelKey: stores a client-registered ed25519 public key scoped to a specific (user, tunnel) pair. Rows carry a ScyllaDB TTL (default 24 h) so expired keys are auto-deleted with no manual cleanup. expires_at is stored as an informational timestamp so clients know when to re-register.
  - ProxyTunnelConfig: stores the connection details of a proxy tunnel server (host, port, proxy_user, target_host, target_port, host_key_fingerprint, service_user_id, is_active). Only one config has is_active=True at a time.
- argus/backend/models/web.py: both models added to USED_MODELS so they are created automatically by sync_models / at test startup.

### Backend service (Step 3)

- argus/backend/service/tunnel_service.py, TunnelService class:
  - register_tunnel(user, public_key, tunnel_id?, ttl_seconds?): validates and fingerprints the submitted public key (SHA256 derived server-side via the cryptography library; private key never touches the server), stores an SSHTunnelKey row with ScyllaDB TTL, returns the full proxy connection dict including expires_at (UTC ISO-8601).
  - get_authorized_keys(tunnel_id?): returns all non-expired public keys as a newline-separated OpenSSH authorized_keys string. Called by the proxy host via AuthorizedKeysCommand → argus-cli ssh-keys.
  - list_keys(tunnel_id?) / delete_key(key_id): key inventory and revocation.
  - get_proxy_tunnel_config(tunnel_id?) / save_proxy_tunnel_config(payload): config CRUD. save_ deactivates the previous active config and creates a dedicated Argus service user (proxy-tunnel-<host>) with a fresh API token that is returned once to the admin for proxy-host provisioning.
- TunnelServiceException for all expected business errors.

### Client API routes (Step 4a)

- argus/backend/controller/ssh_api.py, Blueprint registered at /ssh:
  - POST /ssh/tunnel @api_login_required: register a public key, get proxy config back. Accepts optional ttl_seconds and tunnel_id.
  - GET /ssh/keys @api_login_required: return authorized_keys text. Accepts optional ?tunnel_id= query param.
- argus/backend/controller/client_api.py: ssh_api blueprint registered as a sub-blueprint → final URLs are /api/v1/client/ssh/tunnel and /api/v1/client/ssh/keys.

### Tests

- argus/backend/tests/tunnel/test_tunnel_service.py: 14 docker_required tests covering register_tunnel (happy path, custom TTL, no active config, invalid key, explicit tunnel_id), get_authorized_keys (format + tunnel scoping), delete_key, save_proxy_tunnel_config (service user creation, old config deactivation, missing fields), get_proxy_tunnel_config.
- argus/backend/tests/tunnel/test_ssh_api.py: 10 docker_required integration tests via the Flask test client covering both routes (success, ttl, explicit tunnel_id, missing/invalid key, malformed UUID, unauthenticated access, tunnel scoping, empty response).

## No new dependencies

cryptography is already a transitive dependency via PyJWT[crypto].

## What is NOT included (follow-up PRs)

- Step 4b: admin API endpoints (proxy tunnel config + key list/delete)
- Step 4c: proxy host provisioning Jinja template
- Step 4d: Admin Panel UI (ProxyTunnelManager.svelte)
- Step 5/6: argus-client tunnel module and base.py integration
- Step 7b: argus-cli ssh-keys command

## How to run the tests

uv run pytest argus/backend/tests/tunnel/ -m docker_required -v

CodeLieutenant and others added 21 commits April 7, 2026 13:24

feature(cli): Add GitHub Actions workflow for linting and testing

4afd123

Signed-off-by: Dusan Malusev <dusan@dusanmalusev.dev>

fix: remove claude

29ee982

Removing due to no subscription anymore.

refactor(cli): swap run get and run details command implementations

2220791

run get now shows the lightweight details summary via DispatchDetails, while run details fetches the full run object via RunTypeHandlers with KVTabular output and caching.

fix(cli): attach cached Cloudflare Access JWT to API requests in cloudflared mode

9f59de6

feat(cli): deduplicate events and aggregate duplicate timestamps into repeated_at

ed7f993

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions Bot added the ai-assisted AI-assisted contribution label Apr 15, 2026

CodeLieutenant closed this Apr 15, 2026

CodeLieutenant wants to merge 21 commits into master from feature/ssh-tunnel-backend-part3
