feat: auto retry with exponential backoff for exec/read/write #75

Merged
blaspat merged 1 commit into main from feat/auto-retry on Jun 23, 2026
Conversation

blaspat (Owner) commented on Jun 23, 2026

Problem: When a node disconnects briefly (network blip, restart), calls to node_exec/node_read/node_write fail immediately, and the agent has to retry manually.

Solution: Automatic retry with exponential backoff:

- 3 retries by default (configurable via max_retries in config)
- Exponential backoff: 2s, 4s, 8s... (capped at 30s)
- Retries on: connection refused, 5xx, "node not connected" responses
- Configurable via HERMES_NODES_MAX_RETRIES and HERMES_NODES_RETRY_BACKOFF_SECONDS env vars

Files changed:

- config.py: added max_retries and retry_backoff_seconds to NodeServerConfig
- tools.py: added _request_with_retry helper with exponential backoff; all three tools now use it
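For readers who want to see how the two settings and the 30s cap interact, here is a minimal sketch. It is not the code from config.py: `RetrySettings`, `from_env`, and `backoff_delays` are hypothetical names used only for illustration, and the way NodeServerConfig actually reads the env vars is assumed. Only the defaults (3 retries, 2.0s base), the env var names, and the 30s cap come from the description above.

```python
import os
from dataclasses import dataclass

# Values stated in the PR description.
DEFAULT_MAX_RETRIES = 3
DEFAULT_BACKOFF_SECONDS = 2.0
MAX_BACKOFF_SECONDS = 30.0


@dataclass
class RetrySettings:
    """Hypothetical stand-in for the retry fields added to NodeServerConfig."""

    max_retries: int = DEFAULT_MAX_RETRIES
    retry_backoff_seconds: float = DEFAULT_BACKOFF_SECONDS

    @classmethod
    def from_env(cls, env=None):
        # Assumption: env vars override the defaults when they are set.
        env = os.environ if env is None else env
        return cls(
            max_retries=int(
                env.get("HERMES_NODES_MAX_RETRIES", DEFAULT_MAX_RETRIES)
            ),
            retry_backoff_seconds=float(
                env.get("HERMES_NODES_RETRY_BACKOFF_SECONDS", DEFAULT_BACKOFF_SECONDS)
            ),
        )


def backoff_delays(settings):
    """Delay before each retry: backoff * 2^attempt, capped at 30s."""
    return [
        min(settings.retry_backoff_seconds * 2 ** attempt, MAX_BACKOFF_SECONDS)
        for attempt in range(settings.max_retries)
    ]


if __name__ == "__main__":
    print(backoff_delays(RetrySettings()))
    print(backoff_delays(RetrySettings.from_env({"HERMES_NODES_MAX_RETRIES": "6"})))
```

With the defaults the schedule is 2s, 4s, 8s. Raising max_retries to 6 shows the cap taking effect: the fifth and sixth delays would be 32s and 64s uncapped, so both become 30s.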

- Added max_retries (default: 3) and retry_backoff_seconds (default: 2.0)
  to NodeServerConfig for configurable retry behavior
- Added _request_with_retry helper with exponential backoff
  (backoff * 2^attempt, capped at 30s)
- Retries on connection errors, 5xx, and 'node not connected' responses
- Updated _node_exec_impl, _node_read_impl, _node_write_impl to use retry

Signed-off-by: Blasius Patrick <blasius.patrick@gmail.com>
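The retry behavior described in the commit message can be sketched roughly as follows. This is an illustrative approximation, not the merged _request_with_retry: the `send` callable, its `(status, body)` return shape, `is_retryable`, and the injectable `sleep` parameter are all assumptions made so the example runs without a real node server or HTTP client.

```python
import time

MAX_BACKOFF_SECONDS = 30.0  # cap stated in the commit message


def is_retryable(status, body):
    """Retry on 5xx responses and on 'node not connected' responses."""
    return 500 <= status < 600 or "node not connected" in body.lower()


def request_with_retry(send, max_retries=3, backoff=2.0, sleep=time.sleep):
    """Call send() until it succeeds or max_retries retries are used up.

    send is a hypothetical zero-argument callable that returns
    (status, body) or raises ConnectionError (for example,
    ConnectionRefusedError) when the node is unreachable.
    """
    attempt = 0
    while True:
        try:
            status, body = send()
        except ConnectionError:
            if attempt >= max_retries:
                raise  # out of retries: surface the connection error
        else:
            if not is_retryable(status, body) or attempt >= max_retries:
                return status, body
        # Exponential backoff: backoff * 2^attempt, capped at 30s.
        sleep(min(backoff * 2 ** attempt, MAX_BACKOFF_SECONDS))
        attempt += 1


if __name__ == "__main__":
    # Simulate a brief disconnect followed by recovery.
    outcomes = iter([ConnectionRefusedError(), (503, "unavailable"), (200, "ok")])

    def fake_send():
        outcome = next(outcomes)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    print(request_with_retry(fake_send, sleep=lambda seconds: None))
```

Passing `sleep` as a parameter is a choice made for this sketch so the loop can be exercised without waiting. When retries run out, it returns the last retryable response or re-raises the last connection error; how the merged helper reports exhaustion is not stated in the PR.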
blaspat merged commit 8e205d9 into main on Jun 23, 2026
1 check passed
blaspat deleted the feat/auto-retry branch on June 23, 2026 at 11:54