Skip to content

security: build keyspace DDL via SchemaBuilder/CqlIdentifier (follow-up to #2472)#2475

Open
erichare wants to merge 7 commits into
mainfrom
security/cql-injection-schemabuilder-followup
Open

security: build keyspace DDL via SchemaBuilder/CqlIdentifier (follow-up to #2472)#2475
erichare wants to merge 7 commits into
mainfrom
security/cql-injection-schemabuilder-followup

Conversation

@erichare
Copy link
Copy Markdown
Contributor

@erichare erichare commented May 19, 2026

Summary

Follow-up to #2472 (GHSA-p64p-96pw-mxwv). After #2474 reverted #2472 with the reasoning "the actual fix is to use a query builder," this PR is the query-builder fix.

  • CreateKeyspaceOperation and DropKeyspaceOperation now build their SimpleStatement via the driver's SchemaBuilder with CqlIdentifier.fromInternal(name). CqlIdentifier doubles embedded quote characters, so the keyspace identifier stays well-formed regardless of upstream validation — no String.format interpolation remains.
  • CreateKeyspaceOperation takes (name, strategy, strategyOptions) directly; the manual replication-map string-building in CreateNamespaceKeyspaceCommandResolver is gone.
  • The drop resolvers do not apply NamingRules.KEYSPACE.checkRule — addressing the concern in Revert "security: validate keyspace and datacenter names to prevent CQL injection (GHSA-p64p-96pw-mxwv)" #2474 about blocking drops of keyspaces created outside our rules. The SchemaBuilder + CqlIdentifier path makes that resolver-level check unnecessary for safety.

Driver-gap finding for datacenter names

While building this PR I verified — by both decompiling OptionsUtils.extractOptionValue and running a live probe — that the driver's SchemaBuilder.withNetworkTopologyStrategy(Map) does not escape map keys. Repro:

SchemaBuilder.createKeyspace(CqlIdentifier.fromInternal("ks"))
    .withNetworkTopologyStrategy(Map.of("dc'); EVIL --", 1)).asCql();
// → CREATE KEYSPACE ks WITH replication={'class':'NetworkTopologyStrategy','dc'); EVIL --':1}

The DC-name allowlist applied in CreateNamespaceKeyspaceCommandResolver.validateStrategyOptions(...) is therefore the actual security control for the replication map, not defense-in-depth. The resolver comment is updated to flag this so the allowlist isn't accidentally removed in a later cleanup. A regression-pinning test (CreateKeyspaceOperationTest.driverDoesNotEscapeDataCenterMapKeys) documents the driver behaviour so we get a clear signal if a future driver upgrade starts escaping.

Files changed

  • CreateKeyspaceOperation / DropKeyspaceOperationSchemaBuilder + CqlIdentifier
  • CreateKeyspaceCommandResolver / CreateNamespaceCommandResolver → pass strategy + options to the operation; call validateStrategyOptions
  • CreateNamespaceKeyspaceCommandResolver → drop getReplicationMap; keep DC-name validation with an updated rationale comment
  • SchemaException + errors.yaml → re-add UNSUPPORTED_REPLICATION_DATA_CENTER_NAME (reverted in Revert "security: validate keyspace and datacenter names to prevent CQL injection (GHSA-p64p-96pw-mxwv)" #2474)
  • New CreateKeyspaceOperationTest and DropKeyspaceOperationTest exercise the SchemaBuilder behaviour directly
  • CreateKeyspaceCommandResolverTest exercises both the resolver fields and DC-name injection rejection

Test plan

  • ./mvnw test -Dtest='CreateKeyspaceCommandResolverTest,DropKeyspaceCommandResolverTest,CreateKeyspaceOperationTest,DropKeyspaceOperationTest' → 27/27 passing (JDK 21)
  • ./mvnw compile → clean, including google-java-format check (JDK 21)
  • Full CI suite (run as part of PR checks)

@erichare erichare requested a review from a team as a code owner May 19, 2026 01:48
@erichare erichare marked this pull request as draft May 19, 2026 01:50
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

📈 Unit Test Coverage Delta vs Main Branch

Metric Value
Main Branch 51.64%
This PR 51.73%
Delta 🟢 +0.09%
✅ Coverage improved!

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

Unit Test Coverage Report

Overall Project 51.73% -0.03% 🍏
Files changed 89.26% 🍏

File Coverage
KeyspaceDDLCommandResolver.java 100% 🍏
SchemaObjectNamingRule.java 100% 🍏
SchemaException.java 100% 🍏
CreateKeyspaceOperation.java 100% 🍏
DropKeyspaceOperation.java 100% 🍏
CreateNamespaceCommandResolver.java 95% 🍏
CreateKeyspaceCommand.java 92.31% 🍏
CreateNamespaceCommand.java 85.71% 🍏
DropKeyspaceCommandResolver.java 84.62% 🍏
DropNamespaceCommandResolver.java 84.62% 🍏
CreateKeyspaceCommandResolver.java 0% -80%

…up to #2472)

PR #2472 closed the GHSA-p64p-96pw-mxwv injection vectors with input
validation. This follow-up addresses the root cause for the keyspace-name
sinks: replace the String.format CQL interpolation in CreateKeyspaceOperation
and DropKeyspaceOperation with the driver's SchemaBuilder using
CqlIdentifier.fromInternal(name). CqlIdentifier doubles embedded quote
characters, so the keyspace identifier is well-formed regardless of upstream
validation.

Driver-gap finding for datacenter names: SchemaBuilder.withNetworkTopologyStrategy(Map)
does NOT escape map keys (see OptionsUtils.extractOptionValue in
java-driver-query-builder 4.19.0-preview1 — a hostile DC name with a
single quote produces broken CQL). The DC-name allowlist applied by the
resolver is therefore the actual security control for the replication map,
not just defense-in-depth. The resolver comment is updated to flag this so
the allowlist is not removed in a later cleanup.

Changes:
- CreateKeyspaceOperation: take (name, strategy, strategyOptions) and build
  the SimpleStatement via SchemaBuilder.createKeyspace(CqlIdentifier).
- DropKeyspaceOperation: build via SchemaBuilder.dropKeyspace(CqlIdentifier).
- CreateNamespaceKeyspaceCommandResolver: drop the dead getReplicationMap
  string-building; keep DC-name validation (now wired via
  validateStrategyOptions). Updated comment documents the driver-gap rationale.
- New operation-level tests pin the SchemaBuilder behaviour: identifier
  escaping for keyspace names, fallback to SimpleStrategy, hyphenated DC
  name support, and a regression-pinning test that documents the driver's
  unescaped-map-key behaviour.
- Existing resolver tests adjusted to the new record fields and existing
  injection-rejection tests retained.
@erichare erichare force-pushed the security/cql-injection-schemabuilder-followup branch from 333fd1c to e9097e8 Compare May 19, 2026 01:57
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

📉 Integration Test Coverage Delta vs Main Branch (dse69-it)

Metric Value
Main Branch 72.75%
This PR 72.74%
Delta 🔴 -0.01%
⚠️ Coverage decreased

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

Integration Test Coverage Report (dse69-it)

Overall Project 72.74% -0.02% 🍏
Files changed 91.95% 🍏

File Coverage
CreateNamespaceCommand.java 100% 🍏
CreateNamespaceCommandResolver.java 100% 🍏
DropKeyspaceCommandResolver.java 100% 🍏
DropNamespaceCommandResolver.java 100% 🍏
CreateKeyspaceCommandResolver.java 100% 🍏
SchemaException.java 100% 🍏
DropKeyspaceOperation.java 100% 🍏
SchemaObjectNamingRule.java 94.81% -5.19% 🍏
CreateKeyspaceCommand.java 92.31% 🍏
KeyspaceDDLCommandResolver.java 86.67% -13.33% 🍏
CreateKeyspaceOperation.java 86.67% -13.33% 🍏

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

📉 Integration Test Coverage Delta vs Main Branch (hcd-it)

Metric Value
Main Branch 74.18%
This PR 74.16%
Delta 🔴 -0.01%
⚠️ Coverage decreased

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

Integration Test Coverage Report (hcd-it)

Overall Project 74.16% -0.02% 🍏
Files changed 91.95% 🍏

File Coverage
CreateNamespaceCommand.java 100% 🍏
CreateNamespaceCommandResolver.java 100% 🍏
DropKeyspaceCommandResolver.java 100% 🍏
DropNamespaceCommandResolver.java 100% 🍏
CreateKeyspaceCommandResolver.java 100% 🍏
SchemaException.java 100% 🍏
DropKeyspaceOperation.java 100% 🍏
SchemaObjectNamingRule.java 94.81% -5.19% 🍏
CreateKeyspaceCommand.java 92.31% 🍏
KeyspaceDDLCommandResolver.java 86.67% -13.33% 🍏
CreateKeyspaceOperation.java 86.67% -13.33% 🍏

Copy link
Copy Markdown
Contributor

@amorton amorton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still not convinced the rules for the DC are actually correct. That needs to be validated in the C* code

We can do some validation on the dropKeyspace to check the identifier is a regex word char, but not to check the length

some more tweaks to bring the old collection code up to a better standard.

and the comments, um, need a human touch

@erichare erichare marked this pull request as ready for review May 19, 2026 21:20
@erichare erichare requested a review from amorton May 19, 2026 21:20
Copy link
Copy Markdown
Contributor

@amorton amorton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some iteration, particularly around the santizing Vs checking name rules for deletion

}

static CqlIdentifier keyspaceIdentifierForDrop(String name) {
if (name == null || !DROP_KEYSPACE_NAME.matcher(name).matches()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we should validate,but we cannot contrain the length because in HCD and C* the keyspac ename can be longer than astra. And it could be made in CQL .

So we need perhaps something on the nameing rules that is sanitizeInput() ? that will check the chars are from the allowed set of char, but not the length

it should prob throw the same or a similar exception inside sanitizeInput() as we do in checkRule()

Comment on lines 42 to 43
@Pattern(regexp = "(?i)(SimpleStrategy|NetworkTopologyStrategy)")
@JsonProperty("class")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code in cassandra for getting the replication strategy is case sensitive, see AbstractReplicationStrategy.getClass().

SO we should either keep it that way here, OR we need to fix it later.

lets go with leaving this without the (?i) because we do not know how to capitalize, and tehcnically this can be any fully qualified path.

public record Replication(
@NotNull()
@Pattern(regexp = "SimpleStrategy|NetworkTopologyStrategy")
@Pattern(regexp = "(?i)(SimpleStrategy|NetworkTopologyStrategy)")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as keyspace, -1 on making case sensitive

UNSUPPORTED_JSON_TYPE_FOR_TEXT_INDEX,
UNSUPPORTED_LIST_DEFINITION,
UNSUPPORTED_MAP_DEFINITION,
UNSUPPORTED_REPLICATION_DATA_CENTER_NAME,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may actually be INVALID, for unsupported it is "we understand what you want to do, but you cannot do that" , e.g. see INVALID_USER_DEFINED_TYPE_NAME above

start.withNetworkTopologyStrategy(strategyOptions != null ? strategyOptions : Map.of());
} else {
int replicationFactor =
(strategyOptions != null)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, we marked the options as @NotNull in the command class, but this class does not know and needs to be defensive in it's inputs. So we shoud ensure that all of the values in the map or not null for this check and above.

return DEFAULT_REPLICATION_MAP;
// The driver writes NetworkTopologyStrategy map keys as CQL string literals without escaping
// them.
private static final Pattern VALID_DATA_CENTER_NAME = Pattern.compile("[^']+");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed, we need a thought out reason why this is the correct pattern to sanitize with. The code to reference from C* will be the NetworkTopologyStrategy

for example, a CQL quoted identifier is delim't with " so should we care about that ?

is this a better idea ?
["'\u2018\u2019\u201C\u201D] That covers: " double, ' single, backtick, ' ' curly single, " " curly double.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://cassandra.apache.org/doc/stable/cassandra/developing/cql/ddl.html#create-keyspace

NTS option keys are data-center names in a CQL map, not CQL identifiers, and the driver emits them as raw single-quoted CQL string keys. So ASCII ' is the delimiter we must reject; "/backtick/curly quotes do not close the string literal in that position.

import java.util.Map;
import java.util.regex.Pattern;

final class KeyspaceCommandResolverSupport {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsure on why we need a new class here, and genally a "Support" class with static functions would need to be more widely usable. The pattern with the CreateKeyspace/Namespace is a base class.

We also now have logic spread out for the DDL operations for a Keyspace.

Consider renaming CreateNamespaceKeyspaceCommandResolver to KeyspaceDDLCommandResolver and move all the helper functions into there.

Sublass all the Create and Delete for keyspace / namespace off the same class

@erichare erichare requested a review from amorton May 21, 2026 00:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants