feat(scanning): audit GitHub Actions workflows across repositories by nakberli841-bot · Pull Request #33 · Devlaner/cleat

nakberli841-bot · 2026-06-26T19:38:55Z

What does this change?

Description

Adds automated auditing of GitHub Actions workflows across all connected repositories.

What was done

Created workflow_scan table via Flyway V2 migration with FK to repo table
Added WorkflowScanEntity and WorkflowScanRepository to persist scan results
Added RepoRepository for repository-level DB access
Implemented WorkflowParser in libs/scanning to parse workflow YAML files and flag:
- Unpinned actions (missing SHA pin)
- Broad permissions (write-all / read-all)
- Missing OIDC (id-token: write)
Implemented risk scoring (0–100) based on flagged issues
Implemented WorkflowScanService to fetch workflow YAMLs from GitHub API and persist results
Added WorkflowScanJob in apps/worker as a scheduled job that scans all repos hourly
Exposed GET /api/repos/{repoId}/workflow-scans endpoint in apps/api

Tests

WorkflowParserTest — unit tests covering unpinned actions, broad permissions, missing OIDC and clean workflow scenarios
WorkflowScanRepositoryTest — integration tests with Testcontainers verifying DB read/write and delete operations
WorkflowScanControllerTest — full stack integration tests verifying endpoint response for populated and empty scan results

Related modules

libs/persistence
libs/scanning
apps/worker
apps/api

Related issue

Closes # 10

Area

Frontend (apps/web)
Backend
Docs
Other

Screenshots

If this touches the UI, drop before and after screenshots here.

Checklist

I ran bun run typecheck && bun run build in apps/web (for frontend changes)
I bumped the version if this is a meaningful change
Commits are small and focused, with no AI attribution lines
I updated docs or AGENTS.md if behavior or structure changed

Summary by CodeRabbit

New Features
- Added a new API endpoint to fetch the latest workflow scan results for a repository (including workflow path, risk score, and detected risk signals).
- The worker now runs on a fixed interval to scan repository workflows and persist results.
Bug Fixes
- Workflow scans now refresh stored results for a repository each run, and processing continues even if an individual repository fails.
Tests
- Added integration tests for the repository endpoint and workflow-scan persistence behavior.

- Add V2 Flyway migration for workflow_scan table with FK to repo - Add WorkflowScanEntity and WorkflowScanRepository - Add RepoRepository - Implement WorkflowParser to flag unpinned actions, broad permissions and missing OIDC with 0-100 risk scoring - Implement WorkflowScanService to fetch workflow YAMLs from GitHub and persist scan results - Add WorkflowScanJob scheduled worker to scan all repos hourly - Expose GET /api/repos/{repoId}/workflow-scans endpoint - Add unit tests for WorkflowParser - Add integration tests for WorkflowScanRepository and WorkflowScanController

coderabbitai · 2026-06-26T19:39:08Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4933f90b-a904-4014-8750-446b315c3a0c

📥 Commits

Reviewing files that changed from the base of the PR and between 269ac17 and e431f7a.

📒 Files selected for processing (1)

backend/libs/scanning/src/test/java/dev/cleat/scanning/WorkflowParserTest.java

🚧 Files skipped from review as they are similar to previous changes (1)

backend/libs/scanning/src/test/java/dev/cleat/scanning/WorkflowParserTest.java

📝 Walkthrough

Walkthrough

Adds workflow scan storage, parsing, scheduled execution, and an API endpoint to read stored scans for a repository. It also updates persistence mappings, migrations, worker wiring, and test configuration to support the new flow.

Changes

Workflow scan feature

Layer / File(s)	Summary
Persistence model and schema `backend/libs/persistence/src/main/java/dev/cleat/persistence/entity/AccountEntity.java`, `RepoEntity.java`, `WorkflowScanEntity.java`, `backend/libs/persistence/src/main/java/dev/cleat/persistence/repository/WorkflowScanRepository.java`, `backend/libs/persistence/src/main/resources/db/migration/V1__...sql`, `V2__...sql`, `V3__...sql`, `backend/libs/persistence/src/test/java/dev/cleat/persistence/WorkflowScanRepositoryTest.java`	Adds `installationId` to `AccountEntity`, moves `RepoEntity` to the entity subpackage, introduces `WorkflowScanEntity` mapped to `workflow_scan`, adds `WorkflowScanRepository` with find/delete methods, creates the table migration, and verifies both repository behaviors with Testcontainers integration tests.
Workflow YAML parsing and scan service `backend/libs/scanning/build.gradle.kts`, `backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowAnalysis.java`, `WorkflowParser.java`, `WorkflowScanService.java`, `backend/libs/scanning/src/test/java/dev/cleat/scanning/WorkflowParserTest.java`	Adds SnakeYAML, defines the `WorkflowAnalysis` record, implements `WorkflowParser` with pinned-SHA detection and risk scoring, implements `WorkflowScanService` to fetch, parse, and persist scan rows, and covers parser behaviors with unit tests.
Worker app wiring and scheduling `backend/apps/worker/src/main/java/dev/cleat/worker/CleatWorkerApplication.java`, `WorkflowScanJob.java`, `backend/apps/worker/src/main/resources/application.yml`, `backend/apps/worker/src/test/java/dev/cleat/worker/AbstractIntegrationTest.java`, `backend/apps/worker/src/test/resources/application.yml`, `backend/libs/github-client/src/main/java/dev/cleat/githubclient/config/GitHubConfig.java`	Adds explicit entity, repository, and component scanning to the worker app, introduces the scheduled `WorkflowScanJob`, sets `cleat.workflow-scan.interval-ms`, marks the `RedisTemplate` bean `@Primary`, and aligns worker test configuration.
API read endpoint and integration tests `backend/apps/api/src/main/java/dev/cleat/api/controller/WorkflowScanController.java`, `backend/apps/api/src/test/java/dev/cleat/api/AbstractIntegrationTest.java`, `WorkflowScanControllerTest.java`, `backend/apps/api/src/test/resources/application.yml`	Adds `GET /api/repos/{repoId}/workflow-scans` returning a list of `WorkflowScanResponse` records, switches the test base class to `RANDOM_PORT`, configures in-memory test security credentials, and covers both populated and empty cases with integration tests.

Sequence Diagram(s)

sequenceDiagram
  participant WorkflowScanJob
  participant AccountRepository
  participant WorkflowScanService
  participant GitHubClient
  participant WorkflowParser
  participant WorkflowScanRepository

  WorkflowScanJob->>AccountRepository: findAll()
  loop each account repo
    WorkflowScanJob->>WorkflowScanService: scanRepo(repo, installationId)
    WorkflowScanService->>GitHubClient: fetch .github/workflows listing
    WorkflowScanService->>WorkflowScanRepository: deleteByRepoId(repoId)
    loop each .yml/.yaml file
      WorkflowScanService->>GitHubClient: download raw YAML
      WorkflowScanService->>WorkflowParser: parse(workflowPath, yamlContent)
      WorkflowScanService->>WorkflowScanRepository: save(WorkflowScanEntity)
    end
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Supply chain: Actions workflow audit #10 — The PR implements workflow audit scanning, scoring, storage, scheduling, and readback paths described by the issue.

Suggested reviewers

martian56

Poem

🐇 I hop through workflows, quick and bright,
Parsing YAML by moonlit light.
Stored scans pile up, neat and true,
One hop, one score, one API view.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main change: auditing GitHub Actions workflows across repositories.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/10-github-actions-workflow-audit

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (3)

backend/libs/persistence/src/main/resources/db/migration/V2__create_workflow_scan_table.sql (1)
17-17: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Consider a composite index aligned with the read query.

The primary read path findByRepoIdOrderByScannedAtDesc filters on repo_id and orders by scanned_at DESC. A composite (repo_id, scanned_at DESC) index lets Postgres satisfy both the filter and the sort without a separate sort step, which the current repo_id-only index does not.
⚡ Suggested index
-CREATE INDEX idx_workflow_scan_repo_id ON workflow_scan(repo_id);
+CREATE INDEX idx_workflow_scan_repo_id_scanned_at
+    ON workflow_scan(repo_id, scanned_at DESC);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@backend/libs/persistence/src/main/resources/db/migration/V2__create_workflow_scan_table.sql`
at line 17, The current workflow_scan index only covers repo_id, but the main
read path in the repository query findByRepoIdOrderByScannedAtDesc also sorts by
scanned_at DESC. Update the migration to use a composite index on workflow_scan
that matches the filter and ordering, referencing the existing
idx_workflow_scan_repo_id definition so Postgres can satisfy both without an
extra sort.
backend/libs/persistence/src/test/java/dev/cleat/persistence/WorkflowScanRepositoryTest.java (1)
54-82: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Ordering is not actually asserted.

savesAndFindsWorkflowScanByRepoId persists a single scan, so findByRepoIdOrderByScannedAtDesc would pass regardless of sort direction. Consider inserting at least two scans with distinct scannedAt values and asserting the newest comes first to cover the OrderByScannedAtDesc contract that the API endpoint relies on.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@backend/libs/persistence/src/test/java/dev/cleat/persistence/WorkflowScanRepositoryTest.java`
around lines 54 - 82, The test in WorkflowScanRepositoryTest for
savesAndFindsWorkflowScanByRepoId does not actually verify the
OrderByScannedAtDesc behavior because it only persists one WorkflowScanEntity.
Update this test to save at least two scans for the same RepoEntity with
different scannedAt values, then assert that
workflowScanRepository.findByRepoIdOrderByScannedAtDesc returns the newest
WorkflowScanEntity first. Keep the existing repository method name and add
assertions on the first and second results to cover the intended ordering
contract.
backend/libs/persistence/src/main/java/dev/cleat/persistence/repo/WorkflowScanRepository.java (1)
12-12: 🩺 Stability & Availability | 🔵 Trivial

Consider a bulk delete here deleteByRepoId loads matching WorkflowScanEntity rows and removes them one by one. A custom @Modifying @query("delete ...") would avoid the extra reads if this repo can grow large.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@backend/libs/persistence/src/main/java/dev/cleat/persistence/repo/WorkflowScanRepository.java`
at line 12, The deleteByRepoId method in WorkflowScanRepository currently relies
on entity-by-entity removal, which can cause unnecessary reads for large
repositories. Update this repository method to use a bulk delete approach with a
custom `@Modifying` query so matching WorkflowScanEntity rows are deleted directly
by repoId. Keep the method name as the entry point, but change the repository
contract/annotations so the delete executes as a single bulk operation rather
than loading entities first.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/apps/api/src/main/java/dev/cleat/api/CleatApiApplication.java`:
- Around line 7-8: The API application only enables repository scanning via
CleatApiApplication, so JPA entities under dev.cleat.persistence.entity are not
being discovered. Add entity scanning alongside the existing
`@EnableJpaRepositories` on CleatApiApplication, using `@EntityScan` with the
dev.cleat.persistence base package (matching CleatWorkerApplication) or
equivalent JPA entity scan configuration.

In
`@backend/apps/worker/src/main/java/dev/cleat/worker/CleatWorkerApplication.java`:
- Around line 12-14: The worker application scan setup is missing the
github-client module, so `WorkflowScanService` cannot resolve
`dev.cleat.githubclient.service.GitHubClient` at startup. Update
`CleatWorkerApplication` so its component scanning includes the
`dev.cleat.githubclient` package (or imports the github-client configuration
explicitly), alongside the existing `dev.cleat.persistence` and
`dev.cleat.scanning` setup, to ensure `WorkflowScanJob` can wire its
dependencies.

In `@backend/apps/worker/src/main/java/dev/cleat/worker/WorkflowScanJob.java`:
- Around line 25-38: `WorkflowScanJob.run()` is iterating detached
`AccountEntity`/`RepoEntity` instances, so the lazy `account.getRepos()` and
later `repo.getAccount().getLogin()` access can fail outside an active
persistence context. Fix this by loading the account/repo graph eagerly in
`accountRepository.findAll()` (for example with a `JOIN FETCH` or
`@EntityGraph`) or by moving the entire scan loop in `run()` into a
transactional boundary so `workflowScanService.scanRepo(...)` receives managed
entities.

In `@backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowParser.java`:
- Around line 36-37: The permissions analysis in WorkflowParser has two coverage
gaps: job-level broad permission presets are not detected, and top-level scalar
write-all/read-all is treated as missing OIDC. Update the logic around the
permissions parsing and broadPermissions/missingOidc computation so both the
top-level permissions block and each job’s permissions are checked for
write-all/read-all, and treat write-all as implying id-token: write when setting
missingOidc. Use the existing WorkflowParser permissions handling and the
job-level inspection code to centralize this detection so the same rules apply
consistently.
- Around line 20-21: The YAML parsing in WorkflowParser currently uses the
default Yaml constructor and load path, which is not safe for untrusted workflow
files. Update WorkflowParser’s YAML creation to use SnakeYAML’s safe constructor
or safe loader configuration, and keep the parsing logic in the same place where
yaml.load(yamlContent) is called so external repository content is handled with
the restricted type set.

In
`@backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowScanService.java`:
- Around line 42-48: In WorkflowScanService, stale scan rows are left behind
because the empty/null files early return happens before
workflowScanRepository.deleteByRepoId. Move the deleteByRepoId(repo.getId())
call so it runs before the files empty check in the scan workflow path, ensuring
old scans are cleared even when no workflows remain; keep the existing logic in
findByRepoIdOrderByScannedAtDesc unaffected.
- Line 60: Handle absolute download_url values separately in
WorkflowScanService: GitHubClient.get currently treats every input as a path on
api.github.com, so the contents API download_url is being built incorrectly.
Update the call site around the yamlContent fetch to use a request path/URI flow
that preserves absolute URLs, or add a separate GitHubClient method for absolute
URLs and use that here while keeping the bearer token attachment intact.

In
`@backend/libs/scanning/src/test/java/dev/cleat/scanning/WorkflowParserTest.java`:
- Around line 77-82: The test name in WorkflowParserTest is misleading because
emptyYamlReturnsZeroScore asserts a risk score of 20, not 0. Rename the test
method to reflect the actual behavior, such as emptyYamlScoresOnlyMissingOidc,
and keep the existing assertions in parser.parse and the WorkflowAnalysis
riskScore check unchanged.

---

Nitpick comments:
In
`@backend/libs/persistence/src/main/java/dev/cleat/persistence/repo/WorkflowScanRepository.java`:
- Line 12: The deleteByRepoId method in WorkflowScanRepository currently relies
on entity-by-entity removal, which can cause unnecessary reads for large
repositories. Update this repository method to use a bulk delete approach with a
custom `@Modifying` query so matching WorkflowScanEntity rows are deleted directly
by repoId. Keep the method name as the entry point, but change the repository
contract/annotations so the delete executes as a single bulk operation rather
than loading entities first.

In
`@backend/libs/persistence/src/main/resources/db/migration/V2__create_workflow_scan_table.sql`:
- Line 17: The current workflow_scan index only covers repo_id, but the main
read path in the repository query findByRepoIdOrderByScannedAtDesc also sorts by
scanned_at DESC. Update the migration to use a composite index on workflow_scan
that matches the filter and ordering, referencing the existing
idx_workflow_scan_repo_id definition so Postgres can satisfy both without an
extra sort.

In
`@backend/libs/persistence/src/test/java/dev/cleat/persistence/WorkflowScanRepositoryTest.java`:
- Around line 54-82: The test in WorkflowScanRepositoryTest for
savesAndFindsWorkflowScanByRepoId does not actually verify the
OrderByScannedAtDesc behavior because it only persists one WorkflowScanEntity.
Update this test to save at least two scans for the same RepoEntity with
different scannedAt values, then assert that
workflowScanRepository.findByRepoIdOrderByScannedAtDesc returns the newest
WorkflowScanEntity first. Keep the existing repository method name and add
assertions on the first and second results to cover the intended ordering
contract.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 915914f2-ad1f-4a0b-8769-0468385c27fa

📥 Commits

Reviewing files that changed from the base of the PR and between 066f931 and db5a09b.

📒 Files selected for processing (24)

backend/apps/api/src/main/java/dev/cleat/api/CleatApiApplication.java
backend/apps/api/src/main/java/dev/cleat/api/controller/WorkflowScanController.java
backend/apps/api/src/test/java/dev/cleat/api/AbstractIntegrationTest.java
backend/apps/api/src/test/java/dev/cleat/api/WorkflowScanControllerTest.java
backend/apps/worker/src/main/java/dev/cleat/worker/CleatWorkerApplication.java
backend/apps/worker/src/main/java/dev/cleat/worker/WorkflowScanJob.java
backend/apps/worker/src/main/resources/application.yml
backend/libs/persistence/src/main/java/dev/cleat/persistence/entity/AccountEntity.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/entity/RepoEntity.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/entity/WorkflowScanEntity.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/enums/AccountType.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/enums/Plan.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/enums/Visibility.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/repo/AccountRepository.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/repo/RepoRepository.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/repo/WorkflowScanRepository.java
backend/libs/persistence/src/main/resources/db/migration/V2__create_workflow_scan_table.sql
backend/libs/persistence/src/test/java/dev/cleat/persistence/AccountRepositoryTest.java
backend/libs/persistence/src/test/java/dev/cleat/persistence/WorkflowScanRepositoryTest.java
backend/libs/scanning/build.gradle.kts
backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowAnalysis.java
backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowParser.java
backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowScanService.java
backend/libs/scanning/src/test/java/dev/cleat/scanning/WorkflowParserTest.java

coderabbitai · 2026-06-26T19:46:36Z

+        Yaml yaml = new Yaml();
+        Map<String, Object> workflow = yaml.load(yamlContent);


🔒 Security & Privacy | 🟠 Major

🧩 Analysis chain

🌐 Web query:

What is the default tag/type instantiation behavior of SnakeYAML 2.x new Yaml().load and is it safe for untrusted input?

💡 Result:

In SnakeYAML 2.x, the default behavior of new Yaml().load() is "secure by default," which represents a significant change from 1.x versions [1][2][3]. By default, global tags (which allow the instantiation of arbitrary Java classes) are disabled [2][4][3]. Key technical aspects include: 1. Default Constructor: In 2.x, new Yaml() uses a constructor that restricts class instantiation [1]. It defaults to using an UnTrustedTagInspector that rejects all global tags, preventing the arbitrary class instantiation that previously led to remote code execution (RCE) vulnerabilities [4][3]. 2. Safety for Untrusted Input: Because global tags are disabled by default, new Yaml().load() is considered safe for parsing untrusted input insofar as it will not instantiate arbitrary, potentially malicious Java objects defined by those tags [2][3]. 3. Allowed Types: The default behavior allows only standard, safe YAML types (e.g., scalars, sequences, and mappings), which are handled by SafeConstructor [5][6][1]. 4. Security Best Practices: While the default configuration is now secure against common tag-based injection, developers should still exercise caution with untrusted input [7][3]. This includes limiting the size of the input to prevent denial-of-service (DoS) attacks via memory exhaustion [7][3]. If specific custom classes must be deserialized, developers should explicitly configure a TagInspector to whitelist only trusted packages or classes rather than globally enabling all tags [4][3]. In summary, new Yaml().load() in SnakeYAML 2.x is designed to prevent the deserialization vulnerabilities associated with earlier versions by default, making it safe to use for untrusted input provided that no additional, insecure configuration is explicitly enabled [1][2][3].

Citations:

1: https://snyk.io/blog/snakeyaml-unsafe-deserialization-vulnerability/

2: https://docs.herodevs.com/snakeyaml/cve-2022-1471

3: https://deepwiki.com/snakeyaml/snakeyaml/7-security

4: https://deepwiki.com/snakeyaml/snakeyaml/7.3-type-safety-and-class-loading

5: https://deepwiki.com/snakeyaml/snakeyaml/3-loading-(deserialization)-pipeline

6: https://deepwiki.com/snakeyaml/snakeyaml/3.5-construction-and-object-creation

7: https://codeberg.org/snakeyaml/snakeyaml/wiki/CVE_N_NIST

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== WorkflowParser ==" sed -n '1,220p' backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowParser.java echo echo "== Relevant build files ==" git ls-files 'backend/**/build.gradle*' 'backend/**/settings.gradle*' 'backend/**/gradle.properties' 'backend/**/libs/**/build.gradle*' | sed -n '1,200p' echo echo "== Search for snakeyaml / spring boot versions ==" rg -n --hidden --glob '!**/build/**' --glob '!**/target/**' --glob '!**/.gradle/**' 'snakeyaml|spring-boot|Boot BOM|org.yaml' backend

Repository: Devlaner/cleat

Length of output: 6546

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== Root backend build files ==" git ls-files 'backend/build.gradle*' 'backend/settings.gradle*' 'backend/gradle.properties' 'backend/gradle/libs.versions.toml' | sed -n '1,200p' echo for f in backend/build.gradle.kts backend/gradle/libs.versions.toml backend/settings.gradle.kts backend/gradle.properties; do if [ -f "$f" ]; then echo "== $f ==" sed -n '1,220p' "$f" echo fi done

Repository: Devlaner/cleat

Length of output: 2151

🌐 Web query:

Spring Boot 3.5.3 managed SnakeYAML version

💡 Result:

Spring Boot 3.5.3 manages the SnakeYAML dependency version as 1.33 [1]. Consistent with Spring Boot's policy of maintaining stability within minor versions, Spring Boot 3.5.x continues to utilize the 1.33 release of SnakeYAML, rather than upgrading to the 2.x line by default [2][3][1]. If your application requires a different version of SnakeYAML (for example, to use features from the 2.x release), you can override the managed version using the snakeyaml.version property in your build configuration [4][1]. For Maven projects, you can add this to your pom.xml: <snakeyaml.version>2.x</snakeyaml.version> For Gradle projects, you can add this to your build.gradle: ext['snakeyaml.version'] = '2.x' Spring Boot 3.5.3 was released on June 20, 2025 [5][6]. While newer versions of SnakeYAML exist, Spring Boot manages this dependency to ensure compatibility with its internal YAML processing, and it is recommended to override this property only if you have a specific requirement to use a different version [2][1].

Citations:

1: https://www.tutorialpedia.org/blog/is-snakeyaml-2-0-added-in-the-new-spring-boot-versions/

2: CVE-2022-25857 - Upgrade to SnakeYAML 1.31 spring-projects/spring-boot#32221

3: Spring Boot 2.7.x: Upgrade org.yaml/snakeyaml from 1.30 to 1.33 spring-projects/spring-boot#35401

4: https://docs.spring.io/spring-boot/3.5/appendix/dependency-versions/properties.html

5: https://github.com/spring-projects/spring-boot/releases/tag/v3.5.3

6: https://spring.io/blog/2025/06/19/spring-boot-3-5-3-available-now

🌐 Web query:

Spring Boot 3.5.3 managed SnakeYAML version

💡 Result:

Spring Boot 3.5.3 manages SnakeYAML at version 1.33 [1]. While Spring Boot 3.5.x versions are compatible with SnakeYAML 2.x, they continue to default to the 1.33 version to ensure stability and backward compatibility [1][2]. If you require features from SnakeYAML 2.x, you can override the managed version by defining the snakeyaml.version property in your project's build configuration [3][1]. For Maven projects, add the following to your pom.xml: <snakeyaml.version>2.x</snakeyaml.version> For Gradle projects, add this to your build.gradle: ext['snakeyaml.version'] = '2.x' This will instruct Spring Boot to use your specified version instead of the default 1.33 [3][1].

Citations:

1: https://www.tutorialpedia.org/blog/is-snakeyaml-2-0-added-in-the-new-spring-boot-versions/

2: Snakeyaml 2.0 is not compatible with Spring boot 3.0.5 spring-projects/spring-boot#35064

3: CVE-2022-25857 - Upgrade to SnakeYAML 1.31 spring-projects/spring-boot#32221

Use SnakeYAML’s safe constructor here.

Spring Boot 3.5.3 still manages SnakeYAML 1.33, so new Yaml().load(...) here is not safe for untrusted workflow files from external repositories. Switch to a safe loader / SafeConstructor instead.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowParser.java` around lines 20 - 21, The YAML parsing in WorkflowParser currently uses the default Yaml constructor and load path, which is not safe for untrusted workflow files. Update WorkflowParser’s YAML creation to use SnakeYAML’s safe constructor or safe loader configuration, and keep the parsing logic in the same place where yaml.load(yamlContent) is called so external repository content is handled with the restricted type set.

…rApplication

- Added installationId field (String) to AccountEntity - Updated V1 migration script to include installation_id column - Fixed compilation and consistency issues across modules

- Fix JUnit Platform launcher dependency in :libs:scanning - Resolve NoUniqueBeanDefinitionException in :apps:worker test context - Update API integration tests to match updated AccountEntity structure

…packages

…ex SHA

…UniqueBeanDefinitionException

…roperties

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/apps/worker/src/test/resources/application.yml`:
- Around line 5-7: Disable scheduling in the worker test configuration so
`WorkflowScanJob` does not run during test startup. Update the test
`application.yml` used by `CleatWorkerApplication` to set
`spring.task.scheduling.enabled` to false while keeping the existing
`cleat.workflow-scan.interval-ms` settings as-is. This change should be made in
the test resource config that defines the worker application properties.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 0e6f105d-544c-43f1-9716-bfe4d2ac22e6

📥 Commits

Reviewing files that changed from the base of the PR and between db5a09b and 035e1e2.

📒 Files selected for processing (18)

backend/apps/api/src/main/java/dev/cleat/api/controller/WorkflowScanController.java
backend/apps/api/src/test/java/dev/cleat/api/WorkflowScanControllerTest.java
backend/apps/api/src/test/resources/application.yml
backend/apps/worker/src/main/java/dev/cleat/worker/CleatWorkerApplication.java
backend/apps/worker/src/main/java/dev/cleat/worker/WorkflowScanJob.java
backend/apps/worker/src/test/java/dev/cleat/worker/AbstractIntegrationTest.java
backend/apps/worker/src/test/resources/application.yml
backend/libs/github-client/src/main/java/dev/cleat/githubclient/config/GitHubConfig.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/entity/AccountEntity.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/entity/RepoEntity.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/repository/WorkflowScanRepository.java
backend/libs/persistence/src/main/resources/db/migration/V1__create_account_and_repo_tables.sql
backend/libs/persistence/src/main/resources/db/migration/V3__update_repo_and_add_tables.sql
backend/libs/persistence/src/test/java/dev/cleat/persistence/WorkflowScanRepositoryTest.java
backend/libs/scanning/build.gradle.kts
backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowParser.java
backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowScanService.java
backend/libs/scanning/src/test/java/dev/cleat/scanning/WorkflowParserTest.java

💤 Files with no reviewable changes (1)

backend/apps/worker/src/test/java/dev/cleat/worker/AbstractIntegrationTest.java

✅ Files skipped from review due to trivial changes (2)

backend/libs/persistence/src/main/java/dev/cleat/persistence/repository/WorkflowScanRepository.java
backend/libs/persistence/src/main/resources/db/migration/V3__update_repo_and_add_tables.sql

🚧 Files skipped from review as they are similar to previous changes (10)

backend/apps/api/src/main/java/dev/cleat/api/controller/WorkflowScanController.java
backend/apps/api/src/test/java/dev/cleat/api/WorkflowScanControllerTest.java
backend/libs/scanning/build.gradle.kts
backend/libs/persistence/src/test/java/dev/cleat/persistence/WorkflowScanRepositoryTest.java
backend/apps/worker/src/main/java/dev/cleat/worker/CleatWorkerApplication.java
backend/apps/worker/src/main/java/dev/cleat/worker/WorkflowScanJob.java
backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowScanService.java
backend/libs/scanning/src/main/java/dev/cleat/scanning/WorkflowParser.java
backend/libs/scanning/src/test/java/dev/cleat/scanning/WorkflowParserTest.java
backend/libs/persistence/src/main/java/dev/cleat/persistence/entity/RepoEntity.java

coderabbitai · 2026-06-28T16:05:40Z

+cleat:
+  workflow-scan:
+    interval-ms: 3600000


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail fd 'CleatWorkerApplication\.java$' backend/apps/worker/src/main/java -x sh -c ' echo "== $1 =="; ast-grep outline "$1"; sed -n "1,160p" "$1" ' sh {} fd 'AbstractIntegrationTest\.java$' backend/apps/worker/src/test/java -x sh -c ' echo "== $1 =="; ast-grep outline "$1"; sed -n "1,220p" "$1" ' sh {} rg -n -C2 '`@EnableScheduling`|`@Scheduled`|spring\.task\.scheduling\.enabled|workflow-scan\.interval-ms' backend/apps/worker

Repository: Devlaner/cleat

Length of output: 3224

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== test resources ==" git ls-files 'backend/apps/worker/src/test/resources/**' | sort echo echo "== scheduling-related config in worker tests ==" rg -n -C2 'spring\.task\.scheduling\.enabled|EnableScheduling|`@Scheduled`|workflow-scan\.interval-ms' \ backend/apps/worker/src/test backend/apps/worker/src/main echo echo "== worker test classes ==" fd -e java backend/apps/worker/src/test/java -x sh -c 'echo "--- $1 ---"; sed -n "1,220p" "$1"' sh {}

Repository: Devlaner/cleat

Length of output: 2311

Disable the scheduler in worker tests
CleatWorkerApplication enables scheduling globally, and this test config leaves cleat.workflow-scan.interval-ms active, so WorkflowScanJob can run when the test context starts. Add spring.task.scheduling.enabled: false here to keep integration tests deterministic.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/apps/worker/src/test/resources/application.yml` around lines 5 - 7, Disable scheduling in the worker test configuration so `WorkflowScanJob` does not run during test startup. Update the test `application.yml` used by `CleatWorkerApplication` to set `spring.task.scheduling.enabled` to false while keeping the existing `cleat.workflow-scan.interval-ms` settings as-is. This change should be made in the test resource config that defines the worker application properties.

martian56

I went through the whole diff and every CodeRabbit thread, and checked each one against the current code rather than trusting the resolved flags. Short version: the valid CodeRabbit points are still open. Nine threads, two resolved, seven still active, and six of those seven are real and unaddressed. Two of them are not minor, they are bugs that only surface when the scheduled job actually runs, which is exactly why CI stays green.

The foundation is good. The parser structure, the risk scoring, the entity and migration, the endpoint, and the parser unit tests are all reasonable. But the end to end scan path does not work yet, and that is the part the tests do not exercise.

The two that block:

The scheduled job will throw on a lazy load (WorkflowScanJob, inline). run() is not transactional, so accountRepository.findAll() hands back detached entities, and account.getRepos() then trips a LazyInitializationException before scanRepo is ever reached. repo.getAccount().getLogin() inside scanRepo has the same problem since the repo arrives detached. The parser tests and the controller test never go through this path, so it passes CI and fails the moment the hourly job fires. CodeRabbit flagged this as critical and it is right. Easiest fixes: make run() @transactional, or load accounts with their repos via @EntityGraph or a JOIN FETCH query, or pass repo ids into a transactional service method and reload inside.
The workflow content fetch hits the wrong host (WorkflowScanService, inline). download_url from the contents API is an absolute raw.githubusercontent.com URL, but GitHubClient.get builds the request with .uri(builder -> builder.path(uri).build()) against the https://api.github.com base, so the absolute URL gets treated as a path under api.github.com. So even once the lazy load is fixed, no YAML actually comes back. Either fetch the content from the contents API response directly (it returns a base64 content field you can decode, no second call needed), or give GitHubClient a way to GET an absolute URL with a plain WebClient. Also worth noting these two together mean the feature has not run successfully end to end yet, so please verify against the compose stack with a seeded account before the next round.

The valid smaller ones, also still open:

Permission detection has gaps (WorkflowParser, inline). A top-level scalar permissions: write-all sets broadPermissions but leaves missingOidc true, so it gets both the broad penalty and a false missing-OIDC penalty even though write-all includes id-token: write. Job-level write-all / read-all is never counted as broad either. While you are in there, the map heuristic on line 36 (anyMatch value == write and size > 3) is arbitrary and will miss real broad grants, that is worth rethinking, not just patching.
Stale scans are never cleared when a repo loses all its workflows. The empty-files early return happens before deleteByRepoId, so old rows linger and keep showing up in the latest-scan query. Move the delete above the empty check.
WorkflowParserTest.emptyYamlReturnsZeroScore asserts a score of 20, not zero. Rename it so the name matches what it checks, something like emptyYamlScoresOnlyMissingOidc.
The worker test config leaves scheduling on, so WorkflowScanJob can fire during the integration test context. Add spring.task.scheduling.enabled: false to the worker test application.yml to keep tests deterministic.

The one I would push back on:

CodeRabbit calls new Yaml().load() a major security hole on the grounds that Spring Boot 3.5.3 ships SnakeYAML 1.33. That premise is wrong. Boot 3.5.3 manages SnakeYAML 2.x, and 2.x is safe by default: the default loader rejects global tags, so arbitrary class instantiation from untrusted YAML is already blocked. So this is not a blocker and not for the reason given. I would still do two small things as plain hygiene: construct with a SafeConstructor so the intent is explicit and you are protected if the version ever moves, and wrap load() so a malformed workflow file throws something you handle rather than bubbling up. Neither is required to merge.

So to answer the actual question: no, the valid CodeRabbit comments are not fixed. Items 1 through 4, 6 and 7 all still stand in the code. Get the two runtime bugs sorted and the four cleanups in, and this is close. Happy to look again right after.

…or WorkflowScanJob

- replace arbitrary map heuristic with explicit PermissionStatus model - fix write-all scalar permission to correctly grant id-token: write - extend broad permission and OIDC checks to include job-level scopes

nakberli841-bot self-assigned this Jun 26, 2026

nakberli841-bot marked this pull request as draft June 26, 2026 19:39

nakberli841-bot added the area: backend Backend service or library label Jun 26, 2026

coderabbitai Bot reviewed Jun 26, 2026

View reviewed changes

nakberli841-bot added 16 commits June 26, 2026 13:36

Merge branch 'main' into feat/10-github-actions-workflow-audit

664e589

fix(backend): merge conflict

3fdd3f4

Update WorkflowScanRepositoryTest.java

86e40b7

Update WorkflowScanRepositoryTest.java

7b87695

Update WorkflowScanRepositoryTest.java

300d4d0

fix(worker): add githubclient package to component scan in CleatWorke…

a2bcfed

…rApplication

fix: resolve missing repository imports in tests

b4786e8

feat: add installationId to AccountEntity and fix migration versioning

af8216c

feat: add installationId to Account entity and update database schema

1748dc5

- Added installationId field (String) to AccountEntity - Updated V1 migration script to include installation_id column - Fixed compilation and consistency issues across modules

fix: resolve checkstyle violations

a0cbe57

fix: resolve CI build failures in test modules

7787e1f

- Fix JUnit Platform launcher dependency in :libs:scanning - Resolve NoUniqueBeanDefinitionException in :apps:worker test context - Update API integration tests to match updated AccountEntity structure

fix(worker): narrow EntityScan and EnableJpaRepositories to specific …

622e666

…packages

fix(scanning): replace invalid SHA in WorkflowParserTest with valid h…

48c3564

…ex SHA

fix(github-client): mark redisTemplate bean as @primary to resolve No…

428728a

…UniqueBeanDefinitionException

fix(worker): add test application.yml with required GitHub and scan p…

eca70f6

…roperties

fix(worker): fix YAML syntax in test application.yml

035e1e2

nakberli841-bot marked this pull request as ready for review June 28, 2026 16:00

coderabbitai Bot reviewed Jun 28, 2026

View reviewed changes

martian56 requested changes Jun 28, 2026

View reviewed changes

nakberli841-bot added 5 commits June 28, 2026 11:00

fix: resolve LazyInitializationException by enabling @transactional f…

d4d489b

…or WorkflowScanJob

fix: retrieve workflow content from base64 field to avoid host mismatch

ba5ba02

fix: delete old scan records before checking for empty workflow files

8b59d84

refactor(scanning): improve permission and OIDC audit logic

269ac17

- replace arbitrary map heuristic with explicit PermissionStatus model - fix write-all scalar permission to correctly grant id-token: write - extend broad permission and OIDC checks to include job-level scopes

refactor(scanning): align test expectations with new OIDC logic

e431f7a

		Yaml yaml = new Yaml();
		Map<String, Object> workflow = yaml.load(yamlContent);

Uh oh!

Conversation

nakberli841-bot commented Jun 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this change?

Description

What was done

Tests

Related modules

Related issue

Area

Screenshots

Checklist

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

martian56 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

nakberli841-bot commented Jun 26, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading