build: containerize the server and publish to GHCR#22

Merged
JinBa1 merged 2 commits into main from build/docker-ghcr on Jun 20, 2026

JinBa1 (Owner) commented Jun 20, 2026

Ships the Spring Boot gateway (REST + MCP) as a container image — the self-host distribution channel: run the gateway next to your CSVs with no Java toolchain, and point an MCP client at it.

docker run --rm -p 8080:8080 -v /path/to/your/csvs:/cuckoodb/data ghcr.io/jinba1/cuckoodb:latest
# REST: http://localhost:8080/queries · MCP: http://localhost:8080/mcp

What's here

  • Multi-stage Dockerfile — build the server fat jar (+ engine) on a JDK image with a BuildKit cache mount for the Maven repo; run on a slim JRE as a non-root user. data-dir defaults to /cuckoodb (CSVs mount at /cuckoodb/data); uploads (when enabled) persist under /cuckoodb/work.
  • .dockerignore — small build context; excludes target/, .git, and internal/dev material.
  • docker-publish.yml — builds on every PR (validation, no push) and publishes to ghcr.io/<owner>/cuckoodb on pushes to main and v* tags, tagged via docker/metadata-action (branch, tag, sha, latest on main).
  • README — a concise "Run the server as a container" section.
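
To make the tagging rule concrete, here is a rough shell model of which tags each pushed ref would produce. It is a sketch under stated assumptions, not the workflow's actual configuration: the function name image_tags is hypothetical, and the sha- prefix with a 7-character short SHA and the slash-to-dash replacement in branch names are assumptions about how docker/metadata-action is set up here.

```shell
#!/bin/sh
# image_tags REF SHA
# Prints, one per line, the tags this model expects for a pushed git ref.
# Assumed rules (not taken from the workflow file): branch name, v* tag name,
# "sha-" plus the first 7 characters of the commit SHA, and "latest" on main.
image_tags() {
  ref=$1
  sha=$2
  short=$(printf '%s' "$sha" | cut -c1-7)
  case "$ref" in
    refs/heads/*)
      # Slashes are not valid in image tags, so replace them with dashes.
      printf '%s\n' "${ref#refs/heads/}" | tr '/' '-'
      ;;
    refs/tags/v*)
      printf '%s\n' "${ref#refs/tags/}"
      ;;
  esac
  printf 'sha-%s\n' "$short"
  if [ "$ref" = "refs/heads/main" ]; then
    printf 'latest\n'
  fi
}
```

Under this model a push to main yields main, a sha- tag, and latest, while a v* tag push yields the tag name and a sha- tag. PR builds produce no published tags because nothing is pushed.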

Verification (built + ran locally)

docker build succeeds; docker run with a mounted CSV folder:

  • GET /tables → ["People"] (the env-bound data-dir loads the mounted CSVs)
  • POST /queries → typed columns/rows
  • /mcp mounted, "Registered tools: 5" in logs
  • app runs as non-root on port 8080
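
The manual checks above could be scripted as a small smoke test. This is a minimal sketch, assuming curl is installed on the host and the container is published on localhost:8080 as in the run command. wait_for_http is a hypothetical helper name, not something in the repository.

```shell
#!/bin/sh
# wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it answers with a successful HTTP status, or gives up.
# Returns 0 on the first success and 1 once all attempts have failed.
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: quiet, -f: treat HTTP errors (>= 400) as failures.
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: wait for the catalog endpoint, then list the loaded tables.
# wait_for_http http://localhost:8080/tables 30 1 && curl -s http://localhost:8080/tables
```

Polling /tables rather than sleeping for a fixed time keeps the check independent of how long the catalog takes to load.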

Notes

  • The image is the self-host channel; this is not a public deployment (that's the next slice, gated on the data-gravity / object-storage story).
  • First GHCR publish happens on merge to main; until then, docker build -t cuckoodb . builds locally.

Ship the Spring Boot gateway (REST + MCP) as a container image so it can be run
next to a user's data with no Java toolchain — the self-host distribution channel:
`docker run -v <csvs>:/cuckoodb/data -p 8080:8080 ghcr.io/jinba1/cuckoodb` and
point an MCP client at /mcp.

- multi-stage Dockerfile: build the server fat jar (+ engine) on a JDK image with a
  BuildKit cache mount for the Maven repo, then run it on a slim JRE as a non-root
  user. data-dir defaults to /cuckoodb (CSVs mount at /cuckoodb/data); uploads, when
  enabled, persist under /cuckoodb/work
- .dockerignore keeps the build context small and excludes internal/dev material
- docker-publish workflow builds on every PR (validation, no push) and publishes to
  ghcr.io/<owner>/cuckoodb on pushes to main and version tags, tagged via
  docker/metadata-action (branch, tag, sha, latest-on-main)
- README: a "Run the server as a container" section with the mount + REST/MCP endpoints

Verify by building and running the image: GET /tables loads the mounted CSVs, POST
/queries returns typed rows, /mcp registers the five tools, and the app runs as a
non-root user on port 8080.
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.

JinBa1 (Owner, Author) commented Jun 20, 2026

@claude review this PR

claude (Bot) commented Jun 20, 2026

Claude finished @JinBa1's task in 2m 46s.


PR Review: build: containerize the server and publish to GHCR

  • Read Dockerfile
  • Read .dockerignore
  • Read docker-publish.yml
  • Read README.md changes
  • Cross-checked env vars against application.properties
  • Reviewed server pom.xml for health endpoint availability

Overall this is a clean, well-thought-out PR. The multi-stage build, non-root runtime user, BuildKit cache mounts, and pinned workflow action versions are all correct. One actionable issue and a few minor suggestions below.


Issues

[Medium] GHCR login runs on fork PRs — will fail

.github/workflows/docker-publish.yml line 29:

- name: Log in to GHCR
  uses: docker/login-action@v3
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}

For PRs opened from forks, GitHub restricts the GITHUB_TOKEN to read permissions only, regardless of what permissions: declares. The login step will fail, breaking the build-validation check for any external contributor. The fix is the standard one-liner guard:

- name: Log in to GHCR
  if: github.event_name != 'pull_request'   # ← add this
  uses: docker/login-action@v3

Since push is already gated with push: ${{ github.event_name != 'pull_request' }}, skipping the login on PRs is safe: the build-only path needs no credentials.


Suggestions

[Minor] No HEALTHCHECK in Dockerfile

Without a HEALTHCHECK, Docker and orchestrators (Compose, Swarm) have no signal for whether the app is actually ready: the container reports no health status and counts as running the moment the process starts. Spring Boot Actuator isn't in the server's pom.xml yet, but /tables is always available once the catalog loads:

HEALTHCHECK --interval=30s --timeout=5s --start-period=20s --retries=3 \
  CMD curl -sf http://localhost:8080/tables || exit 1

Note: curl may not be present in the slim eclipse-temurin:17-jre image; you'd need to install it in the runtime stage (apt-get install -y --no-install-recommends curl). Alternatively, adding spring-boot-starter-actuator gives you /actuator/health with richer semantics (startup vs. liveness vs. readiness). This is optional for now but worth tracking.

[Minor] JVM memory configuration undocumented

Java 17 respects cgroup memory limits by default (UseContainerSupport is on), so the JVM won't blindly grab host memory. However, users who want to tune heap size or GC settings have no obvious entry point. The JVM honours JAVA_TOOL_OPTIONS automatically — a one-liner in the README pointing this out would help:

docker run -e JAVA_TOOL_OPTIONS="-Xmx512m" ... ghcr.io/jinba1/cuckoodb:latest

No Dockerfile change needed; it's a documentation gap.
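
As an illustration of what such a README note might show, the sketch below derives an -Xmx value from a container memory limit. The helper name heap_flag and the 75 percent default are assumptions chosen for the example, not recommendations from the project.

```shell
#!/bin/sh
# heap_flag LIMIT_MB [PERCENT]
# Prints an -Xmx flag sized as PERCENT (default 75) of a memory limit in MiB,
# leaving the remainder for metaspace, thread stacks, and other non-heap use.
heap_flag() {
  limit_mb=$1
  percent=${2:-75}
  heap_mb=$(( limit_mb * percent / 100 ))
  # The format string starts with %s so printf does not parse "-Xmx" as an option.
  printf '%s%dm\n' '-Xmx' "$heap_mb"
}

# Example: cap the container at 1024 MiB and size the heap to match.
# docker run --memory 1024m -e JAVA_TOOL_OPTIONS="$(heap_flag 1024)" ... ghcr.io/jinba1/cuckoodb:latest
```

When JAVA_TOOL_OPTIONS is set, the JVM prints a "Picked up JAVA_TOOL_OPTIONS" line at startup, which is a quick way to confirm in the container logs that the setting was applied.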

[Nitpick] Floating base image tags

eclipse-temurin:17-jdk and eclipse-temurin:17-jre are mutable tags — a new patch release can silently change the image. For full reproducibility, pin by digest:

FROM eclipse-temurin:17-jdk@sha256:<digest> AS build
FROM eclipse-temurin:17-jre@sha256:<digest> AS runtime

This is a supply-chain hygiene point; the risk is low given the trusted upstream. The docker/build-push-action provenance/SBOM flags (provenance: true, sbom: true) can complement this if supply-chain attestation matters down the line.
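
For anyone who does want to pin, the reference has the form image:tag@sha256:digest. The sketch below only assembles and validates that string. pin_ref is a hypothetical helper, and the digest has to be looked up separately, for example with docker buildx imagetools inspect.

```shell
#!/bin/sh
# pin_ref IMAGE_TAG DIGEST
# Prints IMAGE_TAG@DIGEST for use in a FROM line, or fails if DIGEST is not
# "sha256:" followed by exactly 64 lowercase hex characters.
pin_ref() {
  image=$1
  digest=$2
  if ! printf '%s\n' "$digest" | grep -Eq '^sha256:[0-9a-f]{64}$'; then
    echo "pin_ref: not a sha256 digest: $digest" >&2
    return 1
  fi
  printf '%s@%s\n' "$image" "$digest"
}

# Example (the digest must come from the registry, it is not invented here):
# docker buildx imagetools inspect eclipse-temurin:17-jre   # shows a Digest: line
# echo "FROM $(pin_ref eclipse-temurin:17-jre "$DIGEST") AS runtime"
```

Rejecting malformed digests early avoids committing a Dockerfile with a leftover placeholder, which would otherwise only fail at build time.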


What looks good

  • Multi-stage build — JDK in the build stage, JRE-only in runtime; image stays lean.
  • BuildKit cache mount on /root/.m2 keeps rebuilds fast without baking Maven artifacts into a layer.
  • Non-root user (cuckoo) with a dedicated group and home dir — correct pattern.
  • Env var names (CUCKOODB_DATA_DIR, CUCKOODB_WORK_DIR) match Spring Boot's relaxed binding of cuckoodb.data-dir / cuckoodb.work-dir in application.properties.
  • Volume layout (/cuckoodb/data for CSVs, /cuckoodb/work for uploads) is consistent between the Dockerfile, application.properties comment, and README.
  • Workflow permissions (contents: read, packages: write) are correctly minimal.
  • GHA layer cache (cache-from/cache-to: type=gha) avoids redundant Maven + Docker layer rebuilds in CI.
  • .dockerignore correctly excludes target/, .git, IDE files, and test output directories.
  • README section is concise and covers the happy path, REST/MCP endpoints, and local-build fallback.

Login is only needed to push; gate it on non-PR events so fork PRs (where
GITHUB_TOKEN is read-only) run the build-validation path without it. Push is
already gated the same way, so the build-only PR path needs no credentials.

JinBa1 (Owner, Author) commented Jun 20, 2026

Thanks for the review. Dispositions:

  • [Medium] GHCR login on fork PRs — fixed (54afdca). Added if: github.event_name != 'pull_request' to the login step. (Minor nuance: login authenticates with a read-only token and wouldn't strictly fail — it's the push that needs write, and that's already gated — but skipping login when no push happens is the correct idiom, so applied.)
  • HEALTHCHECK — deferred, tracked. A HEALTHCHECK needs curl in the slim JRE (bloat) or actuator. Cloud platforms do their own HTTP health checks, and spring-boot-starter-actuator (/actuator/health with readiness/liveness) belongs with the demo-deploy slice — tracked as a follow-up to land there rather than adding curl now.
  • JVM tuning doc (JAVA_TOOL_OPTIONS) — folding into the upcoming README rewrite rather than expanding the concise container section here.
  • Pin base image by digest — skipping. Floating 17-jdk/17-jre auto-receives security patches, which suits a portfolio project; digest-pinning trades that for bump maintenance. Low risk, as noted.

Everything else (multi-stage, non-root, cache mounts, env binding, minimal permissions) confirmed good.

JinBa1 merged commit 432619c into main on Jun 20, 2026
2 checks passed

2 participants