fix(2495): Karen humanizes EDS list output before showing to user by deepmasq · Pull Request #360 · smallcloudai/flexus-client-kit

deepmasq · 2026-04-28T09:15:39Z

Fixes Fibery #2495 — Karen was relaying raw EDS tool output (JSON dumps, markdown tables with eds_id, eds_type, eds_scan_problem, raw timestamps) verbatim to users. Katrin: "lots of unclear letters & symbols, not user-friendly."

Diagnosis

flexus_eds_setup(op="list") and list_eds_str() in flexus backend return technical-style strings:

- JSON dump: `{"eds_id":"...","eds_name":"...","eds_type":"unstructured_google_drive"}`
- Markdown table with raw column names

Karen had no instruction telling her to rephrase before showing.

Change

One rule in `KAREN_PERSONALITY` (concatenated into 4 user-facing experts: `default`, `very_limited`, `messages_triage`, `explore` — not `post_conversation`):

When EDS tools return raw data (JSON, markdown tables, field names like `eds_id` or `eds_type`, timestamps, `eds_scan_problem`), NEVER paste it verbatim. Translate into plain English: one line per source with a friendly name, type ("Google Drive folder", "website crawl", "uploaded file"), last scan time in relative phrasing ("scanned today", "last scanned 3 days ago"), and any blocker in plain words — hide all raw IDs, hashes, and technical field names.

Net: +5 / 0 in `karen_prompts.py`, plus VERSION bump.

Tournament context

Tournament was N=2: A (this branch, prompt-only) + B (backend formatter rewrite in `flexus` repo). B's worktree crashed mid-run with an API error and never pushed. Promoted A directly. Backend formatter remains as a candidate follow-up if prompt fix proves insufficient.
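
For reference, Candidate B's lane could look roughly like the sketch below. This is not the code from the crashed worktree, which was never pushed; it only illustrates the translation the prompt rule asks for. `FRIENDLY_TYPES`, `relative_time`, `humanize_source`, the `eds_last_scan_ts` field, and every `eds_type` key other than `unstructured_google_drive` are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical mapping from raw eds_type values to friendly labels.
# Only "unstructured_google_drive" appears in this PR; the other keys are guesses.
FRIENDLY_TYPES = {
    "unstructured_google_drive": "Google Drive folder",
    "web_crawler": "website crawl",    # assumed key
    "uploaded_file": "uploaded file",  # assumed key
}


def relative_time(then: datetime, now: datetime) -> str:
    """Turn an absolute timestamp into relative phrasing."""
    seconds = int((now - then).total_seconds())
    if seconds < 60:
        return "just now"
    if seconds < 3600:
        n = seconds // 60
        return f"{n} minute{'s' if n != 1 else ''} ago"
    if seconds < 86400:
        n = seconds // 3600
        return f"{n} hour{'s' if n != 1 else ''} ago"
    n = seconds // 86400
    return f"{n} day{'s' if n != 1 else ''} ago"


def humanize_source(src: dict, now: datetime) -> str:
    """Build one plain-language line per source; eds_id is deliberately never read."""
    label = FRIENDLY_TYPES.get(src.get("eds_type", ""), "knowledge source")
    line = f"{src['eds_name']} ({label})"
    ts = src.get("eds_last_scan_ts")  # hypothetical field name, epoch seconds
    when = None
    if ts:
        when = relative_time(datetime.fromtimestamp(ts, tz=timezone.utc), now)
    problem = src.get("eds_scan_problem")
    if problem:
        tried = f"tried scanning {when}" if when else "not scanned yet"
        return f"{line}: {tried}, but a problem blocked it ({problem})."
    return f"{line}: scanned {when}." if when else f"{line}: not scanned yet."
```

Note that `eds_scan_problem` is passed through as-is here; a real formatter would probably also need to map known error strings to plain wording.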

Honest doubts

- Karen may still occasionally leak raw EDS strings if the rule's weight loses against the imperative phrasing in EDS skills. If so, follow up with backend formatter (Candidate B's lane).
- Verification pending: staging deploy + chat probe in next step.

🤖 Single-candidate tournament outcome.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

deepmasq · 2026-04-28T09:37:46Z

Staging verification — PASS

Deployed dev build `1.2.271` to staging marketplace, installed on `scen-bench-01` (persona `0i0iTV6ZWU`). Seeded two EDS sources to exercise both happy and error paths:

- `L3FSVgdyqG`: Web Crawler "Flexus Help Center", scan succeeded
- `JcXcbp7x1K`: Google Drive "Sales Playbook", auth error path

Raw tool output (the bug shape) Karen received

Karen's user-facing reply

You have these knowledge sources available:

Sales Playbook (Google Drive folder): Tried scanning 2 minutes ago, but auth error prevents access—never successfully scanned.

Flexus Help Center (website crawl): Scanned successfully 2 minutes ago.

Pass criteria

| Check | Result |
| --- | --- |
| Plain English prose | ✓ |
| Friendly type labels ("Google Drive folder", "website crawl") | ✓ |
| Relative time ("2 minutes ago") | ✓ |
| Blocker in plain words ("auth error prevents access") | ✓ |
| No `eds_id` / raw IDs | ✓ |
| No `eds_type=unstructured_google_drive` / field names | ✓ |
| No JSON braces / markdown table | ✓ |
| Karen still uses EDS info (no over-correction) | ✓ |
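
The negative checks above were done by eye in the chat probe. If they ever need to be repeated, something like the following could automate them. This is a minimal sketch, not part of this PR: `LEAK_PATTERNS` and `find_leaks` are hypothetical names, and the regexes are illustrative guesses at the bug shape rather than an exhaustive definition of it.

```python
import re

# Each pattern mirrors one negative check in the pass criteria.
LEAK_PATTERNS = {
    "raw field name": re.compile(r"\beds_[a-z_]+\b"),
    "JSON object": re.compile(r"\{\s*\"[^\"]+\"\s*:"),
    "markdown table row": re.compile(r"^\s*\|.*\|\s*$", re.MULTILINE),
    "ISO timestamp": re.compile(r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}"),
}


def find_leaks(reply: str) -> list[str]:
    """Return the name of every leak pattern found in a user-facing reply."""
    return [name for name, pattern in LEAK_PATTERNS.items() if pattern.search(reply)]
```

An empty list means the reply passed these particular checks; it does not cover the positive criteria (friendly labels, relative time), which still need a human or a model-graded check.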

Side-effects

None observed. Karen ran two defensive extra tool calls (`kanban_advanced(op=status_my_tasks)` and `eds_setup(op=help)`) but they didn't pollute the user reply. ~7s reply latency.

Screenshots: `flexus-test-agents/screenshots/adhoc/2495-test-{1..5}.png`.

Ready to merge.

deepmasq · 2026-04-28T09:55:51Z

Round 2 verification — PASS-with-caveat

Re-tested with broader scenarios. Three scenarios run on staging dev install `1.2.271` / persona `0i0iTV6ZWU`:

| # | Scenario | Verdict |
| --- | --- | --- |
| A | Empty list ("you have no sources yet") | PASS: friendly empty-state, no `[]` or JSON |
| B | 5 mixed-type sources (crawler + gdrive + uploaded file) | PASS: all types translated, no IDs/types/timestamps |
| D | Full KB collection skill end-to-end | FAIL (minor): Karen leaked raw EDS IDs in a copy-paste suggestion, `set your knowledge EDS? E.g., "dPsJ25YsNr,NmwZ56m6fB,RbFmhgy19W,a9zPZJIkAF"` |

#2515 regression check

Clean — no `Status: support_collection_status()` leaks in any of A/B/D. Note: this branch does NOT contain #2515's prompt rule yet (branched off main pre-#359). When #359 merges, expect a textual conflict in the same KAREN_PERSONALITY insertion point.

Caveat (Scenario D)

The new prompt rule says "hide all raw IDs". It works for the natural list-EDS reply path (A, B). It loses against the skill's "set knowledge EDS" instruction in multi-step orchestrated flows — model treats IDs as actionable parameters and quotes them for copy-paste. Possible follow-up: tighten rule wording, or co-locate it next to skill SKILL.md instructions that mention EDS IDs. Tracking as scope of follow-up rather than blocking this PR.
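
A follow-up could guard against this specific leak by checking replies against the IDs seeded for the scenario. The sketch below is only an assumption about how such a check might look; `leaked_ids` is a hypothetical helper, and the IDs in the usage example are the ones quoted in the Scenario D leak.

```python
import re


def leaked_ids(reply: str, known_ids: set[str]) -> set[str]:
    """Return the seeded EDS IDs that appear as whole tokens in the reply."""
    tokens = set(re.findall(r"[A-Za-z0-9]+", reply))
    return tokens & known_ids


seeded = {"dPsJ25YsNr", "NmwZ56m6fB", "RbFmhgy19W", "a9zPZJIkAF"}
reply = 'set your knowledge EDS? E.g., "dPsJ25YsNr,NmwZ56m6fB,RbFmhgy19W,a9zPZJIkAF"'
print(sorted(leaked_ids(reply, seeded)))
```

Matching against known IDs avoids false positives on ordinary words, at the cost of needing the seeded ID list for each test persona.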

Pre-merge

- Rebase on top of #359 (fix(2515): stop Karen narrating tool calls in chat replies) once that lands; expect a textual conflict in KAREN_PERSONALITY.
- Optionally fold the Scenario-D fix into a successor PR.

Round 2 screenshots: `flexus-test-agents/screenshots/adhoc/2495-r2-{A-persona-page,A-empty-list,A-empty-thread,B-many-sources,D-kb-collection}.png`.

humbertoyusta · 2026-04-28T15:08:39Z

 but it is not useful for a regular user who asks a question, so do not mention it.

+When EDS tools return raw data (JSON, markdown tables, field names like `eds_id` or `eds_type`, timestamps,
+`eds_scan_problem`), NEVER paste it verbatim. Translate into plain English: one line per source with a friendly name,


Don't mention English directly; the user may be talking in a different language.

humbertoyusta · 2026-04-28T15:11:01Z


+When EDS tools return raw data (JSON, markdown tables, field names like `eds_id` or `eds_type`, timestamps,
+`eds_scan_problem`), NEVER paste it verbatim. Translate into plain English: one line per source with a friendly name,
+type ("Google Drive folder", "website crawl", "uploaded file"), last scan time in relative phrasing ("scanned today",


"Last scan time": the prompt is maybe too specific here. It could be something more generic, like "don't print timestamps, use relative human-readable time".

humbertoyusta · 2026-04-28T15:12:06Z

+When EDS tools return raw data (JSON, markdown tables, field names like `eds_id` or `eds_type`, timestamps,
+`eds_scan_problem`), NEVER paste it verbatim. Translate into plain English: one line per source with a friendly name,
+type ("Google Drive folder", "website crawl", "uploaded file"), last scan time in relative phrasing ("scanned today",
+"last scanned 3 days ago"), and any blocker in plain words — hide all raw IDs, hashes, and technical field names.


we may not need 4 lines of prompt for this, instruction can be a bit shorter

eds: humanize EDS list output in karen style section

0c0762a

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

humbertoyusta reviewed Apr 28, 2026


deepmasq wants to merge 1 commit into `main` from `fix/2495-eds-message-prompt`
