feat(sqs): --sqsFifoPartitionMap flag + parser + validator (Phase 3.D PR 4-A)#704
feat(sqs): --sqsFifoPartitionMap flag + parser + validator (Phase 3.D PR 4-A)#704
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces the sqsFifoPartitionMap flag and associated logic to support HT-FIFO partition assignments. It includes a new routing structure, parsing logic for the flag's grammar, and validation against configured Raft groups. Unit tests have been added to ensure correct parsing and error handling. Feedback focuses on using the more idiomatic map[string]struct{} for set-based membership checks in the validation logic and tests.
| groupsByName := make(map[string]string, len(groups)) | ||
| for _, g := range groups { | ||
| groupsByName[strconv.FormatUint(g.id, 10)] = g.address | ||
| } |
There was a problem hiding this comment.
The groupsByName map is only used for existence checks in validateSQSFifoPartitionMap. Using map[string]struct{} is more idiomatic and efficient as it clearly signals that only the keys matter.
| groupsByName := make(map[string]string, len(groups)) | |
| for _, g := range groups { | |
| groupsByName[strconv.FormatUint(g.id, 10)] = g.address | |
| } | |
| groupsByName := make(map[string]struct{}, len(groups)) | |
| for _, g := range groups { | |
| groupsByName[strconv.FormatUint(g.id, 10)] = struct{}{} | |
| } |
References
- In Go, maps used as sets should use struct{} as the value type to save memory and clearly communicate intent that only keys are relevant. (link)
| // raftGroupsByName is the {ID-as-string -> address} map produced by | ||
| // parseRaftGroups; the partition-map flag uses the same string IDs | ||
| // the operator supplied to --raftGroups so this lookup is direct. | ||
| func validateSQSFifoPartitionMap(m map[string]sqsFifoQueueRouting, raftGroupsByName map[string]string) error { |
There was a problem hiding this comment.
Update the signature to use map[string]struct{} to match the idiomatic membership check.
| func validateSQSFifoPartitionMap(m map[string]sqsFifoQueueRouting, raftGroupsByName map[string]string) error { | |
| func validateSQSFifoPartitionMap(m map[string]sqsFifoQueueRouting, raftGroupsByName map[string]struct{}) error { |
References
- Using map[string]struct{} is the idiomatic way to implement a set in Go. (link)
| m := map[string]sqsFifoQueueRouting{ | ||
| "q.fifo": {PartitionCount: 2, Groups: []string{"g0", "g1"}}, | ||
| } | ||
| groups := map[string]string{"g0": "127.0.0.1:1", "g1": "127.0.0.1:2"} |
There was a problem hiding this comment.
Update the test data to use map[string]struct{}.
| groups := map[string]string{"g0": "127.0.0.1:1", "g1": "127.0.0.1:2"} | |
| groups := map[string]struct{}{"g0": {}, "g1": {}} |
References
- Using map[string]struct{} is the idiomatic way to implement a set in Go. (link)
| groups := map[string]string{ | ||
| "g0": "a", "g1": "b", "g2": "c", "g3": "d", | ||
| } |
There was a problem hiding this comment.
Update the test data to use map[string]struct{}.
| groups := map[string]string{ | |
| "g0": "a", "g1": "b", "g2": "c", "g3": "d", | |
| } | |
| groups := map[string]struct{}{ | |
| "g0": {}, "g1": {}, "g2": {}, "g3": {}, | |
| } |
References
- Using map[string]struct{} is the idiomatic way to implement a set in Go. (link)
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7c581f5a1d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } | ||
| groupsByName := make(map[string]string, len(groups)) | ||
| for _, g := range groups { | ||
| groupsByName[strconv.FormatUint(g.id, 10)] = g.address |
There was a problem hiding this comment.
Normalize group IDs consistently during map validation
buildSQSFifoPartitionMap keys groupsByName with strconv.FormatUint(g.id, 10), which canonicalizes IDs and drops any leading-zero formatting from --raftGroups. Because parseSQSFifoPartitionMap keeps group tokens as raw strings, a config like --raftGroups "01=a,02=b" with --sqsFifoPartitionMap "q.fifo:2=01,02" is rejected as “group not in --raftGroups” even though both groups exist. This introduces a startup-time false negative for valid operator input.
Useful? React with 👍 / 👎.
… PR 4-A) Adds the operator-config foundation for HT-FIFO routing per §5 of the split-queue FIFO design. Stacks on top of PR 3 (#703), which lands the partitioned-keyspace constructors. The grammar matches the design: 'queue.fifo:N=group_0,...,group_{N-1};...' with multiple queue entries separated by ';' (commas are reserved for the per-queue group list). PartitionCount must be a power of two ≤ 32 — the validator rejects anything else at config-parse time so a typo cannot land a half-shaped routing map at runtime. Two-stage validation: - parseSQSFifoPartitionMap: grammar + per-entry consistency (PartitionCount agrees with len(Groups), partition count is a power of two within the cap, no duplicate queues). - validateSQSFifoPartitionMap: runs after --raftGroups is parsed and asserts every group named in the partition map exists in --raftGroups; rejects with a queue/partition-pointer error message so the operator can fix typos without re-counting. Plumbed through parseRuntimeConfig into runtimeConfig.sqsFifoPartitionMap. Not yet consumed by routing — PR 4-B will wire it through ShardedCoordinator and add the htfifo capability advertisement once the routing layer lands. The §11 PR 2 dormancy gate still rejects PartitionCount > 1 on CreateQueue, so populating this flag has no effect on production traffic until PR 5 lifts the gate atomically with the data-plane fanout. parseSQSFifoPartitionMapEntry was extracted into parseSQSFifoPartitionCount + parseSQSFifoGroupList helpers to stay under the cyclop=10 ceiling once the four-rejection validation surface landed (zero / over-cap / non-power-of-2 / count-mismatch). parseRuntimeConfig grew over the cyclop ceiling too, so the partition-map block was extracted into buildSQSFifoPartitionMap. Tests in shard_config_test.go cover: - Empty input yields empty map. - Single-queue happy path. - Multiple queues separated by ';'. - Whitespace trimming. - PartitionCount > 0 enforcement. - PartitionCount must be a power of two. - PartitionCount within the per-queue cap. - count and group-list length must agree. - Malformed entry (missing '=', empty queue, empty group, trailing comma). - Duplicate queue rejection. - validateSQSFifoPartitionMap: all-groups-present passes; missing group fails with queue + partition-index pointer in the error message. go test -race ./... pass; golangci-lint ./... clean. No behaviour change in production paths.
Two review-driven fixes folded into one commit:
1) Codex P2 (group ID normalization)
parseRaftGroups runs each --raftGroups entry through strconv.ParseUint
and the validator keys its set off strconv.FormatUint(g.id, 10) — so
the canonical form drops leading zeros ("01" → "1"). The partition
map parser kept group tokens as raw strings, which made
--raftGroups "01=a,02=b" with --sqsFifoPartitionMap "q.fifo:2=01,02"
fail validation as group-not-found even though both groups exist.
parseSQSFifoGroupList now ParseUint+FormatUints each group reference
at parse time, which:
- rejects non-numeric group references early with a clear pointer
("group %q is not a uint64 ID") instead of letting them fall through
to a later group-not-found error,
- aligns the partition-map shape with parseRaftGroups so the validator
is a plain set-membership check.
2) Gemini medium (idiomatic set type)
validateSQSFifoPartitionMap took map[string]string but only used the
keys. Switch the parameter and the buildSQSFifoPartitionMap caller to
map[string]struct{} — the address half was unused, the new signature
makes set-only-keys intent visible, and the test data sites use the
same shape.
Tests:
- new sub-test "normalizes leading-zero group IDs" pins the codex case
- new sub-test "rejects non-uint64 group reference" pins the parse-time
rejection of alphabetic group names
- existing sub-tests reshaped to use numeric group IDs (consistent
with what parseRaftGroups actually accepts on the operator side)
- validator tests use map[string]struct{} for the canonical-ID set
7c581f5 to
050a9af
Compare
|
@claude review Round 2 — addresses gemini-code-assist + codex review:
New sub-tests pin the codex case ("normalizes leading-zero group IDs") and the parse-time rejection ("rejects non-uint64 group reference"). Existing sub-tests reshaped to use numeric group IDs (consistent with what parseRaftGroups actually accepts). Also rebased onto the latest tip of feat/sqs-htfifo-keyspace (PR 703) to pick up the partitioned-key delimiter fix from gemini's PR 703 review. |
|
Claude finished @bootjp's task in 2m 29s —— View job Review — Round 2
Both Gemini and Codex feedback items are correctly addressed. Overall this is a clean round 2 — no blocking issues. A few minor observations below. What's well doneGroup ID canonicalization ( Idiomatic set type ( Cyclop decomposition: Three helpers ( Test coverage: 13 sub-tests total; the two new ones ( Minor observations1. Awkward error message in "%q: PartitionCount %q is not a non-negative integer"
2. Exported fields on unexported struct ( type sqsFifoQueueRouting struct {
PartitionCount uint32
Groups []string
}
3. Non-deterministic queue order in for queue, routing := range m {Go map iteration is non-deterministic. For a config with two misconfigured queues, the error message references whichever queue Go picks on that run. This is cosmetic for startup validation (not a race condition, just which queue is reported first may vary), but if operator UX matters later, sorting the keys before iteration makes errors reproducible: queues := slices.Sorted(maps.Keys(m))
for _, queue := range queues {
routing := m[queue]
...Low priority for a startup-time validator, but worth noting. 4. Unreachable "empty queue name" branch for colon-at-position-zero input ( For SummaryBoth review items are properly addressed. The canonicalization logic is correct and well-tested, the set type is idiomatic, and the new test cases pin exactly the cases the reviewers raised. Items 1–4 above are all minor polish, not correctness concerns. No blocking issues. |
Addresses three minor observations from the round 2 Claude review. Each is informational rather than blocking, but they tighten the parser's UX and the package-internal style. 1) Reword PartitionCount-not-an-integer error before: "PartitionCount %q is not a non-negative integer" after: "PartitionCount %q must be a positive decimal integer" uint64 narrowing rules out negatives anyway, so the original phrasing led with the wrong constraint. The new phrasing matches what the parser actually requires. 2) Lowercase fields on the unexported sqsFifoQueueRouting PartitionCount → partitionCount Groups → groups The struct is package-internal; exported field names invite reflective access if it's ever surfaced and contradict the rest of shard_config.go's style (groupSpec, rangeSpec both use lowercase fields). All call sites in shard_config.go and shard_config_test.go updated. 3) Sort queue keys in validateSQSFifoPartitionMap iteration Go map iteration is randomised, so a config with two misconfigured queues reports a different queue on each run. Sorting via slices.Sorted(maps.Keys(m)) makes the validator's error reproducible for operator triage and for any future golden-test assert that pins the failure message.
|
@claude review Round 3 — minor polish from the round 2 review:
Item 4 from the review (the unreachable-path note about |
|
Claude finished @bootjp's task in 1m 4s —— View job Review — Round 3
All three items from the Round 2 review are correctly addressed. No blocking issues. What's well done1. Error wording ( 2. Lowercase fields (
3. Deterministic iteration ( 4. Import additions ( 5. e2e test ( One minor note (non-blocking)In the "trims whitespace" test, the outer whitespace in SummaryRound 3 is clean. All four Round 2 observations that warranted action are addressed. Item 4 (unreachable-path note) is intentionally left as-is per the reviewer's note — the sentinel is the same regardless of which branch fires, so behavior is correct. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5d7cb96ce1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| queue := strings.TrimSpace(entry[:colonIdx]) | ||
| if queue == "" { | ||
| return "", sqsFifoQueueRouting{}, errors.Wrapf(ErrInvalidSQSFifoPartitionMapEntry, | ||
| "%q: empty queue name", entry) | ||
| } |
There was a problem hiding this comment.
Enforce .fifo suffix in partition-map queue names
The parser documents the entry shape as queue.fifo:N=... but only checks for a non-empty queue token, so inputs like orders:2=1,2 are accepted as valid. This creates a silent misconfiguration path where a typo in the queue name passes startup validation even though it violates the advertised grammar, making operators think partition routing is configured when it is not. Please validate the queue token ends with .fifo in parseSQSFifoPartitionMapEntry.
Useful? React with 👍 / 👎.
Summary
Phase 3.D PR 4-A of the split-queue FIFO rollout. Operator-config foundation: parses
--sqsFifoPartitionMap, validates against--raftGroups, plumbs throughparseRuntimeConfigintoruntimeConfig.sqsFifoPartitionMap. Stacks on top of #703 (PR 3, partitioned-keyspace constructors).What's added
--sqsFifoPartitionMapflag with grammarqueue.fifo:N=group_0,...,group_{N-1};...parseSQSFifoPartitionMap+parseSQSFifoPartitionMapEntry(grammar + per-entry consistency)validateSQSFifoPartitionMap(cross-check against--raftGroups)sqsFifoQueueRoutingstruct (PartitionCount + Groups)runtimeConfig.sqsFifoPartitionMapfieldWhat's NOT added (deferred to PR 4-B)
ShardedCoordinatorhtfifocapability advertisement on/sqs_healthkv/lease_state.goleadership-refusal hook for non-htfifo binariesThese are coupled by the design's "binary advertises htfifo only when both routing AND leadership-refusal are in place" rule, so they belong in one PR.
Validation coverage
len(Groups) == PartitionCount--raftGroups=all rejected with the offending entry quoted in the errorTest plan
TestParseSQSFifoPartitionMapcovers grammar, whitespace, multiple queues, all rejection paths.TestValidateSQSFifoPartitionMapcovers the missing-group case with queue + partition-index pointer.parseRuntimeConfigintegration:main_bootstrap_e2e_test.goupdated for the new parameter.go test -race ./adapter/... .pass.golangci-lint ./...clean.Self-review (per CLAUDE.md)
TestParseSQSFifoPartitionMapand 3 inTestValidateSQSFifoPartitionMap, covering every documented rejection branch.