Skip to content

Address security-scan findings from CodeQL java-security-extended suite and the Semgrep/CodeQL findings#1028

Merged
akshayrai merged 1 commit into
linkedin:masterfrom
akshayrai:codeql-fixes
Jun 4, 2026
Merged

Address security-scan findings from CodeQL java-security-extended suite and the Semgrep/CodeQL findings#1028
akshayrai merged 1 commit into
linkedin:masterfrom
akshayrai:codeql-fixes

Conversation

@akshayrai
Copy link
Copy Markdown
Collaborator

Summary

Reviewed OSS Brooklin against the CodeQL java-security-extended suite and the Semgrep/CodeQL findings.
Fixes the genuine, low-risk issues with code changes and documents/suppresses the false-positive and compatibility-bound findings with justifications in code. Fixed:

  • FileUtils.constructRandomDirectoryInTempDir created a predictably named directory under the shared system temp dir via new File(java.io.tmpdir, prefix + random) + mkdirs(), which is world-readable and racy (CWE-379, local information disclosure). Replaced with Files.createTempDirectory(), which atomically creates the directory with owner-only permissions. Callers (EmbeddedZookeeper, EmbeddedKafkaCluster) only use the returned path, so behavior is preserved; the now-unused Random field/import is removed.
  • TestAssignmentTaskMapLogger.createCustomSizeString compared an int loop counter directly against a double bound (java/comparison-with-wider-type); the target size is now converted to int. Documented / suppressed with justification (false positive or compatibility-bound):
  • NetworkUtils.getAvailablePort (Semgrep unencrypted-socket): the ServerSocket is bound to an ephemeral port only to discover a free port number and is closed immediately; no data is ever transmitted, so transport encryption does not apply. False positive.
  • GroupIdConstructor.hashGroupId and AvroMessageEncoderUtil.md5 (Semgrep use-of-md5, java/potentially-weak-cryptographic-algorithm): MD5 is used as a non-cryptographic identity hash, not a security signature. The group-id hash determines the live Kafka consumer group.id and the schema-id hash is embedded in the on-wire message envelope; changing the algorithm would reset consumer offsets and break decoding of existing data. Must remain MD5 for compatibility. Each suppressed site carries an explanatory comment and a // nosemgrep directive so the rationale travels with the code (and the brooklin-server in-tree port stays clean). Verified on JDK 8: TestAssignmentTaskMapLogger and TestZkClient (which exercises the changed FileUtils via EmbeddedZookeeper) pass; checkstyle passes for the touched modules.

Testing Done

Build passes

…lin-server in-tree Semgrep)

Reviewed OSS Brooklin against the CodeQL java-security-extended suite and the Semgrep/CodeQL
findings raised while vendoring it into brooklin-server (linkedin-multiproduct/brooklin-
server#3352).
Fixes the genuine, low-risk issues with code changes and documents/suppresses the false-positive
and compatibility-bound findings with justifications in code.
Fixed:
- FileUtils.constructRandomDirectoryInTempDir created a predictably named directory under the
shared system temp dir via new File(java.io.tmpdir, prefix + random) + mkdirs(), which is
world-readable and racy (CWE-379, local information disclosure). Replaced with
Files.createTempDirectory(), which atomically creates the directory with owner-only permissions.
Callers (EmbeddedZookeeper, EmbeddedKafkaCluster) only use the returned path, so behavior is
preserved; the now-unused Random field/import is removed.
- TestAssignmentTaskMapLogger.createCustomSizeString compared an int loop counter directly against
a double bound (java/comparison-with-wider-type); the target size is now converted to int.
Documented / suppressed with justification (false positive or compatibility-bound):
- NetworkUtils.getAvailablePort (Semgrep unencrypted-socket): the ServerSocket is bound to an
ephemeral port only to discover a free port number and is closed immediately; no data is ever
transmitted, so transport encryption does not apply. False positive.
- GroupIdConstructor.hashGroupId and AvroMessageEncoderUtil.md5 (Semgrep use-of-md5,
java/potentially-weak-cryptographic-algorithm): MD5 is used as a non-cryptographic identity
hash, not a security signature. The group-id hash determines the live Kafka consumer group.id
and the schema-id hash is embedded in the on-wire message envelope; changing the algorithm would
reset consumer offsets and break decoding of existing data. Must remain MD5 for compatibility.
Each suppressed site carries an explanatory comment and a // nosemgrep directive so the rationale
travels with the code (and the brooklin-server in-tree port stays clean).
Verified on JDK 8: TestAssignmentTaskMapLogger and TestZkClient (which exercises the changed
FileUtils via EmbeddedZookeeper) pass; checkstyle passes for the touched modules.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@akshayrai akshayrai marked this pull request as ready for review June 4, 2026 03:29
Copy link
Copy Markdown
Collaborator

@harshcum harshcum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@akshayrai akshayrai merged commit 158cc7c into linkedin:master Jun 4, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants