noc-agent is the operator-facing investigation service for Hyrule Networks
(AS215932). It accepts monitoring events, runs structured incident analysis,
records human-review proposals, and keeps a fallback local control plane available
even when chat tooling is unreachable.
- FastAPI receives Alertmanager and Icinga webhooks.
- A LangGraph investigation runtime normalizes the alert, correlates repeated incidents, routes to a specialist profile, validates confidence, checks golden-state drift, and produces a reviewable proposal.
- Redis stores graph checkpoint state plus short-lived incident memory.
- Discord can act as the interactive operator console.
- A loopback-only local control API plus nocctl gives operators an SSH/VPN fallback for review and decision recording.
- Hyrule MCP provides live diagnostic telemetry; NOC Agent consumes it through the configured daemon URL or the legacy stdio path.
The current tranche is intentionally diagnostic-only: an approval records the decision and resumes operator state, but it does not execute infrastructure changes.
Existing interfaces preserved:
- POST /webhook/alertmanager
- POST /webhook/icinga
- POST /task
- POST /mail/poll
- GET /health
- GET /health/mcp
- GET /health/config
- GET /health/model
- GET /health/mail
- GET /metrics
New control-plane interfaces:
- GET /control/incidents/pending
- GET /control/incidents/{incident_id}
- POST /control/incidents/{incident_id}/decision
- POST /approval/resume
The /control/... endpoints require the X-NOC-Control-Token header. The signed resume
endpoint (POST /approval/resume) requires an HMAC signature using NOC_APPROVAL_SIGNING_SECRET.
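As a rough illustration of the signed resume flow, the sketch below signs and verifies a request body with HMAC. The digest algorithm (SHA-256), the hex encoding, and signing the raw body are assumptions made for this example; the README does not specify them, and the function names are hypothetical.

```python
import hashlib
import hmac


def sign_resume_body(secret: str, body: bytes) -> str:
    """Return a hex HMAC digest of the raw request body.

    SHA-256 and hex encoding are assumptions, not confirmed by the service.
    """
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_resume_signature(secret: str, body: bytes, signature: str) -> bool:
    """Compare the presented signature with the expected one in constant time."""
    expected = sign_resume_body(secret, body)
    return hmac.compare_digest(expected, signature)
```

Using hmac.compare_digest rather than == avoids leaking how many leading characters of the signature matched.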
nocctl is the local fallback interface:
nocctl pending
nocctl show <incident-id>
nocctl decide <incident-id> approved --operator svag --comment "reviewed"
In production this is intended to run on noc over existing SSH/VPN access,
with NOC_CONTROL_URL=http://127.0.0.1:8000.
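For scripting against the same loopback API that nocctl uses, a request could be built roughly as follows. This is a minimal sketch: build_control_request is a hypothetical helper, and the JSON payload shape for the decision endpoint is an assumption, since the README does not document the request body.

```python
import json
import urllib.request


def build_control_request(path, token, base_url="http://127.0.0.1:8000", payload=None):
    """Build (but do not send) a request to the local control API.

    A payload turns the request into a JSON POST; otherwise it is a GET.
    """
    url = base_url.rstrip("/") + path
    headers = {"X-NOC-Control-Token": token}
    data = None
    if payload is not None:
        # Payload field names are illustrative only.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if payload is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

The returned object can be passed to urllib.request.urlopen on the noc host, with the token read from NOC_CONTROL_TOKEN and the base URL from NOC_CONTROL_URL.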
When DISCORD_BOT_TOKEN is present, the service starts a discord.py bot that
supports:
- slash-command investigations
- pending/status lookups
- approve/reject/acknowledge decisions
- mention-driven investigations
Guild, channel, and role allowlists are configured with:
- DISCORD_ALLOWED_GUILD_IDS
- DISCORD_ALLOWED_CHANNEL_IDS
- DISCORD_ALLOWED_ROLE_IDS
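One plausible way to turn these variables into checks is sketched below. It assumes the values are comma-separated numeric IDs and that an empty role allowlist denies everyone; neither detail is stated in this README, and both helper names are hypothetical.

```python
def parse_id_allowlist(raw):
    """Parse a comma-separated string of numeric IDs into a frozenset of ints.

    The comma-separated format is an assumption for illustration.
    """
    if not raw:
        return frozenset()
    return frozenset(int(part) for part in raw.split(",") if part.strip())


def has_allowed_role(member_role_ids, allowed_roles):
    """Return True when the member holds at least one allowlisted role.

    An empty allowlist denies everyone here (a conservative assumption).
    """
    return any(role_id in allowed_roles for role_id in member_role_ids)
```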
The supervisor prompt is assembled from:
- app/prompts/supervisor_context.md
- app/prompts/golden_state_manifest.json
The manifest is the machine-readable intended-state anchor. Live MCP telemetry is compared against it during investigation so proposals can call out drift instead of inventing a configuration story.
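The comparison against the manifest can be pictured as a recursive dictionary diff. The sketch below is not the service's implementation; find_drift is a hypothetical helper, and the manifest keys in the usage example are invented for illustration.

```python
def find_drift(golden, live, prefix=""):
    """Return (path, expected, observed) for each golden leaf that differs in live.

    Keys present only in live are ignored; keys missing from live are
    reported with an observed value of None.
    """
    drift = []
    for key, expected in golden.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        observed = live.get(key)
        if isinstance(expected, dict) and isinstance(observed, dict):
            # Descend into nested sections and keep the dotted path.
            drift.extend(find_drift(expected, observed, path))
        elif observed != expected:
            drift.append((path, expected, observed))
    return drift
```

For example, with a golden state of {"bgp": {"asn": 215932, "peers": 2}} and live telemetry reporting one peer, the result is a single entry for bgp.peers, which a proposal can cite directly.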
- NOC_REDIS_URL
- HYRULE_MCP_URL
- NOC_CONTROL_TOKEN
- NOC_APPROVAL_SIGNING_SECRET
- DISCORD_BOT_TOKEN
- DISCORD_ALLOWED_GUILD_IDS
- DISCORD_ALLOWED_CHANNEL_IDS
- DISCORD_ALLOWED_ROLE_IDS
- OPENROUTER_API_KEY for the default model backend
- Optional OPENROUTER_MANAGEMENT_API_KEY for account-wide credit monitoring
- Optional OPENROUTER_APP_TITLE and OPENROUTER_APP_URL for OpenRouter attribution
The legacy HYRULE_MCP_CMD path remains accepted for compatibility.
Model selection is configurable via TOML. Lookup order is:
- NOC_AGENT_CONFIG
- /etc/noc-agent/config.toml
- config/noc-agent.toml
- built-in defaults
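The lookup order can be sketched as a first-match search. This example assumes NOC_AGENT_CONFIG is an environment variable holding a file path and that a path pointing at a missing file falls through to the next candidate; the real service may instead treat that as an error. resolve_config_path is a hypothetical name.

```python
import os

# Fixed candidates checked after the NOC_AGENT_CONFIG override.
DEFAULT_CANDIDATES = ("/etc/noc-agent/config.toml", "config/noc-agent.toml")


def resolve_config_path(environ=None, exists=os.path.exists):
    """Return the first existing config path, or None for built-in defaults."""
    environ = os.environ if environ is None else environ
    candidates = []
    explicit = environ.get("NOC_AGENT_CONFIG")
    if explicit:
        candidates.append(explicit)
    candidates.extend(DEFAULT_CANDIDATES)
    for candidate in candidates:
        if exists(candidate):
            return candidate
    return None  # caller falls back to built-in defaults
```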
The default chain is OpenRouter DeepSeek V4 Pro with Claude Sonnet 4.6 as a fallback:
[model]
primary = "openrouter:deepseek/deepseek-v4-pro"
fallbacks = ["openrouter:anthropic/claude-sonnet-4.6"]
Any OpenRouter model can be selected with openrouter:<model-slug>. Secrets stay in environment variables, not in the config file. AGENT_MODEL and AGENT_FALLBACK_MODELS still override the config file for emergency changes.
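The precedence between the config file and the emergency overrides might look like the sketch below. It takes an already-parsed config dictionary, and it assumes AGENT_FALLBACK_MODELS is a comma-separated list; that format is not stated in this README, and resolve_model_chain is a hypothetical name.

```python
# Built-in defaults, matching the default chain described above.
DEFAULT_MODEL = {
    "primary": "openrouter:deepseek/deepseek-v4-pro",
    "fallbacks": ["openrouter:anthropic/claude-sonnet-4.6"],
}


def resolve_model_chain(config, environ):
    """Return the ordered model chain: primary first, then fallbacks.

    Environment overrides win over the [model] table, which wins over defaults.
    """
    model = {**DEFAULT_MODEL, **config.get("model", {})}
    primary = environ.get("AGENT_MODEL") or model["primary"]
    raw = environ.get("AGENT_FALLBACK_MODELS")
    if raw is not None:
        # Assumed format: comma-separated slugs; an empty string clears fallbacks.
        fallbacks = [slug.strip() for slug in raw.split(",") if slug.strip()]
    else:
        fallbacks = list(model["fallbacks"])
    return [primary, *fallbacks]
```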
Google/Gemini remains supported for future use:
[model]
primary = "google-gla:gemini-3.1-pro"
fallbacks = []
For the new default backend, migrate from GEMINI_API_KEY to OPENROUTER_API_KEY. /health/model reports active models plus OpenRouter key-limit and usage status. /metrics exports OpenRouter credit/usage gauges.
See TESTING.md.
Part of Hyrule Networks (AS215932).