fix(deploy): point harness probe + existing-persona lookup at real cloud routes #119

Merged: khaliqgant merged 1 commit into main from fix-cli/harness-probe-cloud-agents on May 13, 2026

khaliqgant (Member) commented May 13, 2026

Summary

Two phantom-endpoint bugs were blocking every agentworkforce deploy against production cloud. Both fall under the deploy-v1 "M3" rollout — workforce was wired to API routes that were never built on cloud, and the real routes have different paths/shapes.

| Bug | Old workforce call | What it actually hits | Replacement |
| --- | --- | --- | --- |
| Harness check | GET /users/me/provider_credentials?model_provider=<provider> | 404 (route never built) | GET /api/v1/cloud-agents, match by harness + status === 'connected' |
| Existing-persona check | GET /workspaces/{ws}/agents?persona_slug=<slug> | 403 (dashboard proxy that needs a session cookie) | GET /workspaces/{ws}/deployments, filter client-side |

Why

Harness probe (1st commit)

User had Anthropic connected via agent-relay cloud connect anthropic (status:"connected", credentialStoredAt:"2026-05-13T14:17:40.048Z" in cloud_agents). But every deploy with --no-prompt failed with "credentials are not connected" because the probe URL 404'd → isHarnessOauthConnected returned false → --no-prompt short-circuited to the actionable-but-misleading error.
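
The failure chain above can be sketched as two small decisions. This is an illustration only, not the code in packages/deploy: the names `connectedFromProbe` and `nextHarnessStep` and the step labels are hypothetical, and treating every non-2xx status as "not connected" is an assumption made for brevity.

```typescript
// Hypothetical step labels for what the CLI does after the harness probe.
type HarnessStep = 'proceed' | 'fail-not-connected' | 'interactive-connect';

// Assumption: any non-2xx probe response collapses to "not connected".
// A 404 from a route that was never built is therefore indistinguishable
// from a user who really has no credential stored.
function connectedFromProbe(status: number, bodySaysConnected: boolean): boolean {
  if (status < 200 || status >= 300) return false;
  return bodySaysConnected;
}

// With --no-prompt there is no interactive fallback, so "not connected"
// becomes a hard error instead of a connect prompt.
function nextHarnessStep(connected: boolean, noPrompt: boolean): HarnessStep {
  if (connected) return 'proceed';
  return noPrompt ? 'fail-not-connected' : 'interactive-connect';
}

// Old probe URL: always 404, so the credential state never matters.
const oldProbeStep = nextHarnessStep(connectedFromProbe(404, true), true);
// Real route: 200 plus a connected entry lets the deploy continue.
const newProbeStep = nextHarnessStep(connectedFromProbe(200, true), true);
```

Under these assumptions `oldProbeStep` ends in the hard-error branch even though the credential exists, which is the misleading "credentials are not connected" message described above.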

Existing-persona check (2nd commit)

agents.personaId on the cloud schema is a UUID FK to personas.id. The CLI only knows the persona's slug (persona.id in the JSON). Two interacting problems:

  • /workspaces/{ws}/agents is a dashboard proxy that requires session auth → 403 for cli:auth Bearer.
  • The real listing endpoint /workspaces/{ws}/deployments (cloud#580) accepts cli:auth but its ?personaId= filter expects a UUID. Sending a slug → drizzle UUID-cast → 500.

The fix is to fetch the workspace's full deployments list (bounded — dozens of rows in practice) and match client-side against deployedName (which cloud derives from persona.slug || persona.name || persona.id), with personaSlug and personaId as fallbacks.
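
A minimal sketch of that client-side match, assuming each deployment row exposes optional `deployedName`, `personaSlug`, and `personaId` string fields. The `PersonaSpec` and `DeploymentRow` interfaces and both function names are hypothetical; only the field names and the `slug || name || id` derivation come from the description above.

```typescript
// Hypothetical shapes; only the field names are taken from the PR text.
interface PersonaSpec { slug?: string; name?: string; id: string }
interface DeploymentRow { deployedName?: string; personaSlug?: string; personaId?: string }

// Mirrors the derivation cloud is described as using for deployedName.
function deriveDeployedName(p: PersonaSpec): string {
  return p.slug || p.name || p.id;
}

// deployedName is the primary key for matching; personaSlug and personaId
// are fallbacks. A UUID in personaId simply never equals a slug.
function rowMatchesSlug(row: DeploymentRow, slug: string): boolean {
  return row.deployedName === slug || row.personaSlug === slug || row.personaId === slug;
}

// Illustrative data, not real rows from any workspace.
const rows: DeploymentRow[] = [
  { deployedName: deriveDeployedName({ id: 'other-persona' }) },
  { deployedName: deriveDeployedName({ slug: 'notion-essay-pr', id: 'notion-essay-pr' }) },
];
const hit = rows.find((r) => rowMatchesSlug(r, 'notion-essay-pr'));
```

Because the slug is compared as a plain string on the client, nothing is cast to a UUID, which avoids the 500 from the server-side `?personaId=` filter.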

Changes

Harness probe (1st commit):

  • Replace ProviderCredentialsResponse+providerCredentialsReady with CloudAgentsListResponse+hasConnectedHarness(body, expectedHarness).
  • isHarnessOauthConnected now hits /api/v1/cloud-agents and matches by harness (case-insensitive) + status === 'connected'.
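
For illustration, a predicate with the matching rule described in these bullets might look like the sketch below. It is not the code from this PR: the top-level `agents` key on the response is an assumption (the description does not show the payload), and the function carries a `Sketch` suffix to keep it distinct from the real `hasConnectedHarness`.

```typescript
// Assumed payload shape; only `harness` and `status` are named in the PR.
interface CloudAgentEntry { harness?: string; status?: string }
interface CloudAgentsListResponse { agents?: CloudAgentEntry[] } // `agents` key is an assumption

// True when at least one entry matches the harness (case-insensitive)
// and has status exactly 'connected'.
function hasConnectedHarnessSketch(body: CloudAgentsListResponse, expectedHarness: string): boolean {
  const wanted = expectedHarness.toLowerCase();
  return (body.agents ?? []).some(
    (entry) =>
      typeof entry.harness === 'string' &&
      entry.harness.toLowerCase() === wanted &&
      entry.status === 'connected',
  );
}

// Illustrative payload: one connected entry and one entry in another state.
const sampleBody: CloudAgentsListResponse = {
  agents: [
    { harness: 'Claude', status: 'connected' },
    { harness: 'codex', status: 'pending' },
  ],
};
const claudeConnected = hasConnectedHarnessSketch(sampleBody, 'claude');
const codexConnected = hasConnectedHarnessSketch(sampleBody, 'codex');
```

An empty or missing list yields `false`, which matches the tested case where an empty list makes the `--no-prompt` error fire.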

Existing-persona check (2nd commit):

  • findExistingAgent now calls /deployments (no server-side filter — see above).
  • parseAgentLike accepts both {agentId, deployedName, personaId, status} (new) and {id, status} (legacy preview).
  • When expectedPersonaId is supplied: match against any persona-identifying field on the row; rows with no persona info pass through (legacy back-compat).
  • Treat status === 'destroyed' as "not present" so re-deploys don't trip on-exists against tombstones.
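
The parsing and filtering rules in this list can be sketched roughly as follows. The `AgentLike` interface and both function names are hypothetical, the real `parseAgentLike` may differ, and the sketch assumes all persona-identifying fields are plain strings.

```typescript
// Hypothetical normalized row; field names come from the bullets above.
interface AgentLike {
  id: string;
  status?: string;
  deployedName?: string;
  personaSlug?: string;
  personaId?: string;
}

const asString = (v: unknown): string | undefined => (typeof v === 'string' ? v : undefined);

// Accepts the new shape ({agentId, ...}) and the legacy preview shape ({id, status}).
function parseAgentLikeSketch(raw: unknown): AgentLike | undefined {
  if (typeof raw !== 'object' || raw === null) return undefined;
  const r = raw as Record<string, unknown>;
  const id = asString(r.agentId) ?? asString(r.id);
  if (id === undefined) return undefined;
  return {
    id,
    status: asString(r.status),
    deployedName: asString(r.deployedName),
    personaSlug: asString(r.personaSlug),
    personaId: asString(r.personaId),
  };
}

// Tombstones never count; rows without persona info pass through (legacy back-compat).
function isExistingAgent(agent: AgentLike, expectedPersonaId?: string): boolean {
  if (agent.status === 'destroyed') return false;
  if (expectedPersonaId === undefined) return true;
  const known = [agent.deployedName, agent.personaSlug, agent.personaId].filter(
    (v): v is string => v !== undefined,
  );
  return known.length === 0 || known.some((v) => v === expectedPersonaId);
}
```

The pass-through branch is what keeps a legacy `{id, status}` row visible to the on-exists check even though it carries no persona fields to compare.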

Tests

88 deploy tests pass. Combined coverage:

  • Harness probe: empty list → no-prompt error fires; matching connected entry → probe returns true, no connect-provider call; wrong harness or wrong status → false positives blocked; polling transitions correctly.
  • Existing-persona: new shape ({agents:[{agentId,...}]}) parsed correctly; tombstones + wrong-persona rows skipped; legacy {agent:{id}} still works; listing GET fires before deploy POST.

Many existing tests updated to disambiguate init.method === 'GET' (listing) from init.method === 'POST' (deploy) since both now share the same URL.
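
A rough sketch of that disambiguation, using a synchronous stand-in for a fetch mock. Everything here is hypothetical (the `MockInit`/`MockResponse` types, the factory name, the example URL, and the 201 body for the deploy POST); the real tests in cloud.test.ts are not shown in this PR page.

```typescript
// Simplified stand-ins for fetch's init and response; not the real fetch types.
interface MockInit { method?: string }
interface MockResponse { status: number; body: unknown }

// Records every call and branches on the HTTP method, because the listing
// and the deploy now share the same /deployments URL.
function makeDeploymentsMock(calls: string[]): (url: string, init?: MockInit) => MockResponse {
  return (url, init = {}) => {
    const method = init.method ?? 'GET';
    calls.push(`${method} ${url}`);
    if (!url.endsWith('/deployments')) return { status: 404, body: {} };
    if (method === 'GET') return { status: 200, body: { agents: [] } }; // listing: nothing deployed yet
    if (method === 'POST') return { status: 201, body: { agentId: 'agent-1' } }; // assumed deploy response
    return { status: 405, body: {} };
  };
}

// Usage: the listing GET should be recorded before the deploy POST.
const recordedCalls: string[] = [];
const mockFetch = makeDeploymentsMock(recordedCalls);
const deploymentsUrl = 'https://cloud.example/workspaces/ws-1/deployments'; // placeholder URL
const listing = mockFetch(deploymentsUrl);
const deployed = mockFetch(deploymentsUrl, { method: 'POST' });
```

Asserting on the order of `recordedCalls` is one way to check that the listing GET fires before the deploy POST.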

Smoke

Built locally and ran against https://agentrelay.com/cloud with the actual user workspace 50587328-...:

$ agentworkforce deploy ./.agentworkforce/notion-essay-pr/persona.json --mode cloud --no-prompt --harness-source oauth
workforce deploy → ...persona.json
persona notion-essay-pr: 2 integration(s), 0 schedule(s)
workspace: 50587328-441d-4acb-b8f3-dbe1b3c5de99
integrations.notion: already connected
integrations.github: already connected
bundle: staged to ...runner.mjs (14.5KB)
mode: cloud
cloud: claude credentials already connected        ← 1st commit
cloud: deploying persona bundle to https://...     ← 2nd commit (no 403 / 500)
(now hits a separate cloud-side packaging issue:
 "Persona validation is unavailable: @agentworkforce/persona-kit could not be loaded
  ... @parcel/watcher-linux-x64-glibc not found"
 — cloud Lambda is missing the native binary, separate fix on cloud side)

Scope notes (deferred)

  • Plan/byok save paths still POST to cloud routes that don't exist (/workspaces/{ws}/provider-credentials/managed and /byok). Need cloud-side implementation. Out of scope.
  • The ?personaId= filter on cloud's /deployments GET should accept slugs (currently UUID-only → 500 on slug input). Filed as a follow-up.
  • @parcel/watcher-linux-x64-glibc missing from cloud's Lambda bundle. Filed as a follow-up.

Test plan

  • pnpm -F @agentworkforce/deploy run lint clean
  • pnpm -F @agentworkforce/deploy run test — 88/88 pass
  • Smoke against agentrelay.com/cloud with real workspace — harness check passes, existing-persona check passes, deploy bundle uploads (fails further down on cloud's packaging issue, not this PR)

🤖 Generated with Claude Code

The harness-credentials check called
`GET /api/v1/users/me/provider_credentials?model_provider=<provider>`,
which doesn't exist on cloud at all — the route was never built. Every
`agentworkforce deploy --harness-source oauth --no-prompt` therefore
failed with "credentials are not connected" even when they were, and
the auto-detect path (no --harness-source) always fell through to the
interactive prompt for the same reason.

Cloud actually exposes harness connection state via `/api/v1/cloud-agents`,
which returns one row per (user, workspace, harness). When the OAuth
completion route (`/api/v1/cli/auth/complete`) stores a credential in
S3 it marks that row `status: "connected"`. That's the single source
of truth — no second probe needed.

Changes:
* Replace `ProviderCredentialsResponse`+`providerCredentialsReady` with
  `CloudAgentsListResponse`+`hasConnectedHarness`.
* Switch `isHarnessOauthConnected` to call `/api/v1/cloud-agents` and
  match by harness (case-insensitive) + `status === 'connected'`.
* Rewrite the two existing harness tests for the new endpoint, plus add
  two new cases: (1) a matching connected entry → probe returns true and
  no connect-provider call fires; (2) entries with wrong harness or
  wrong status are correctly ignored.

Plan/byok save paths still call cloud routes that don't exist either —
those need cloud-side implementation, deferred to a separate PR. Users
who already authed via `agent-relay cloud connect <provider>` (the
default flow today) are unblocked immediately by this change.

Smoke verified against `agentrelay.com/cloud` with the user's actual
Anthropic-connected workspace:
  cloud: claude credentials already connected
(deploy proceeds to bundle/upload; the next failure is a separate 403
on `/workspaces/{ws}/agents` — cloud auth scope bug, not this PR.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8a78c198-11d5-4873-8e7d-e4b0a72a209d

📥 Commits

Reviewing files that changed from the base of the PR and between 9a859fd and fa4ce25.

📒 Files selected for processing (2)
  • packages/deploy/src/modes/cloud.test.ts
  • packages/deploy/src/modes/cloud.ts

📝 Walkthrough

OAuth harness connection readiness check migrated from probing provider credentials to cloud agents endpoint. Type contracts added, readiness predicate replaced, and test suite expanded to cover harness matching, connection status, and polling behavior.

Changes

Cloud Agents OAuth Readiness Migration

| Layer | File(s) | Summary |
| --- | --- | --- |
| Cloud Agents Response Contracts | packages/deploy/src/modes/cloud.ts | TypeScript interfaces CloudAgentsListResponse and CloudAgentEntry model the /api/v1/cloud-agents endpoint payload, including harness identifier and status field. |
| OAuth Readiness Implementation | packages/deploy/src/modes/cloud.ts | isHarnessOauthConnected refactored to query /api/v1/cloud-agents, treating 404/405 as not connected. New hasConnectedHarness predicate scans agent entries for matching harness name and status: 'connected', replacing previous provider-credentials readiness logic. Unauthorized error message updated. |
| Cloud Agents OAuth Test Coverage | packages/deploy/src/modes/cloud.test.ts | New "no-prompt" test verifies /api/v1/cloud-agents probe call and failure when no agents present. Added tests for connected matching harness (deploy proceeds) and mismatched harness (treated not connected). OAuth polling test refactored to mock cloud-agents endpoint, incrementing poll count and returning connected agent after initial empty responses. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Cloud agents hop into view,
OAuth checks take a newer path too,
Harness matching with connected true,
From credentials to agents—a fresher brew! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: switching the harness probe from a non-existent endpoint to the real /api/v1/cloud-agents endpoint. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, providing context, motivation, implementation details, test coverage, and smoke test results. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

khaliqgant merged commit 6e44ff4 into main May 13, 2026 (3 checks passed)
khaliqgant deleted the fix-cli/harness-probe-cloud-agents branch May 13, 2026 19:00
khaliqgant changed the title from "fix(deploy): point harness probe at /api/v1/cloud-agents (M3)" to "fix(deploy): point harness probe + existing-persona lookup at real cloud routes" May 13, 2026