fix(plugins): exclude transitive leveldbjni-all from :crypto #6738

Open
barbatos2011 wants to merge 6 commits into tronprotocol:develop from barbatos2011:fix/plugins-leveldbjni-conflict

@barbatos2011
Contributor

Summary

x86_64 builds of Toolkit.jar since #6637 throw NoSuchMethodError on org.iq80.leveldb.Options.maxBatchSize(I) when running commands such as db archive / --rebuild-manifest:

Exception in thread "main" java.lang.NoSuchMethodError:
  org.iq80.leveldb.Options.maxBatchSize(I)Lorg/iq80/leveldb/Options;
    at org.tron.plugins.DbArchive$ArchiveManifest.newDefaultLevelDbOptions(DbArchive.java:137)
    ...

This PR fixes the root cause and adds CI smoke tests so the same class of bug cannot reach a release artifact again.

Why

#6637 introduced implementation(project(":crypto")) to the plugins module without excluding org.fusesource.leveldbjni:leveldbjni-all, which :crypto pulls in transitively via :common -> :platform. The direct :platform dependency in plugins/build.gradle already excludes leveldbjni-all (so the TRON-fork io.github.tronprotocol:leveldbjni-all:1.18.2 can be used instead), but the new transitive path through :crypto does not. The result on x86 is two jars carrying org/iq80/leveldb/Options.class on the runtime classpath:

  • org.fusesource.leveldbjni:leveldbjni-all:1.8 — does NOT have Options.maxBatchSize(int)
  • io.github.tronprotocol:leveldbjni-all:1.18.2 — DOES have Options.maxBatchSize(int)

:plugins:dependencyInsight confirms the regression path:

org.fusesource.leveldbjni:leveldbjni-all:1.8
\--- project :platform
     +--- runtimeClasspath
     \--- project :common
          \--- project :crypto
               \--- runtimeClasspath

Why tests didn't catch it

  • :plugins:test keeps source jars separate; the JVM classloader returns the first match in classpath order, which is 1.18.2 (declared directly by plugins/build.gradle). Tests pass.
  • The fat-jar binaryRelease task merges class entries with last-write-wins (Gradle Zip default INCLUDE), so the surviving Options.class can be the 1.8 copy. Same classpath, opposite winner.
  • ARM64 declares only :platform directly (no 1.18.2 fork), so there is no conflict; ARM64 also excludes Archive tests.
  • The CI matrix builds the fat jar but never executes it — Toolkit-specific runtime errors are not exercised.
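
One way to see this class of conflict before it reaches the fat jar is to list every class file that more than one runtime jar provides. The task below is a minimal sketch for plugins/build.gradle and is not part of this PR; the task name `reportDuplicateClasses` is hypothetical, and it assumes the project jars on `runtimeClasspath` have already been built (jars that do not exist yet are skipped).

```groovy
// Hypothetical diagnostic task; not part of this PR.
// Prints each .class entry that is provided by more than one runtimeClasspath jar.
tasks.register('reportDuplicateClasses') {
    doLast {
        // class entry name -> names of the jars that contain it
        def owners = [:].withDefault { [] }
        def jars = configurations.runtimeClasspath.files.findAll {
            it.name.endsWith('.jar') && it.exists()
        }
        jars.each { jar ->
            def zip = new java.util.zip.ZipFile(jar)
            try {
                zip.entries().each { entry ->
                    if (entry.name.endsWith('.class')) {
                        owners[entry.name] << jar.name
                    }
                }
            } finally {
                zip.close()
            }
        }
        def duplicates = owners.findAll { name, providers -> providers.size() > 1 }
        duplicates.each { name, providers -> println "${name} <- ${providers}" }
    }
}
```

Run as `./gradlew :plugins:reportDuplicateClasses`. It reports any duplicated entry regardless of version, so duplicates that are harmless because both copies are identical would be listed as well.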

Fix

  1. plugins/build.gradle — exclude org.fusesource.leveldbjni:leveldbjni-all from the :crypto dependency, mirroring the existing exclusion on the direct :platform dependency. This collapses the x86 runtimeClasspath to a single leveldbjni-all (1.18.2).
  2. .github/workflows/pr-build.yml — after each platform build (macos / ubuntu-arm / rockylinux-x86 / debian11-x86), run a short smoke test against the freshly built Toolkit.jar:
    java -jar Toolkit.jar help
    java -jar Toolkit.jar db --help
    java -jar Toolkit.jar db archive -h     # directly exercises the failing path
    java -jar Toolkit.jar keystore --help   # exercises the :crypto path added by #6637
    
    Each smoke step adds ~10–15s; jobs run in parallel so PR critical path increases by at most ~15s.
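
For readers without the diff open, the exclusion from item 1 would look roughly like this in plugins/build.gradle. This is a sketch reconstructed from the description and from the `exclude` line quoted in review, not a copy of the committed change; the enclosing `dependencies` block and the exact formatting are assumptions.

```groovy
dependencies {
    // :crypto reaches :platform through :common, so the stock
    // org.fusesource leveldbjni-all has to be excluded on this path as well.
    implementation(project(':crypto')) {
        exclude group: 'org.fusesource.leveldbjni', module: 'leveldbjni-all'
    }
}
```

`./gradlew :plugins:dependencies --configuration runtimeClasspath` can then be used to check that only one leveldbjni-all remains.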

Verification

Reproduced on x86_64 + JDK 8 in eclipse-temurin:8-jdk (clean build):

  • Before fix: java -jar Toolkit.jar db archive -d <empty> → NoSuchMethodError
  • After fix: java -jar Toolkit.jar db archive -d <empty> → exits 0, reports "directory does not contain any database"
  • :plugins:dependencies --configuration runtimeClasspath no longer lists org.fusesource.leveldbjni:leveldbjni-all:1.8
  • ./gradlew checkstyleMain checkstyleTest passes
  • ./gradlew :plugins:test passes (KeystoreUpdate, DbLite, RocksDb tests)

Test plan

  • CI passes on all four platform builds
  • New "Toolkit jar smoke test" step is green on each platform
  • Reviewer manually confirms java -jar Toolkit.jar db archive -d <db> works on x86 with this branch's artifact

Notes

  • No production Java changes; only plugins/build.gradle (+6 lines) and .github/workflows/pr-build.yml (+36 lines).
  • ARM64 path is unaffected: the :platform direct dependency on ARM64 has no exclusion (it never had the conflict), and the new :crypto exclusion is a no-op there.
  • The two org/iq80/leveldb/Options.class entries that still exist in the fat jar (from leveldbjni-all:1.18.2 uber and leveldb-api:1.18.2 standalone) are pre-existing and harmless because both are the same 1.18.2 version with maxBatchSize available.

Barbatos added 2 commits May 1, 2026 10:20
Toolkit.jar built on x86_64 throws NoSuchMethodError on
org.iq80.leveldb.Options.maxBatchSize(I) at runtime when running
`db archive` / `--rebuild-manifest`.

Root cause: the :crypto module added in PR tronprotocol#6637 brings in
:common -> :platform transitively, which pulls in
org.fusesource.leveldbjni:leveldbjni-all:1.8. The direct :platform
dependency on x86 already excludes leveldbjni-all (so the
io.github.tronprotocol fork at 1.18.2 can be used instead), but the
transitive path through :crypto does not. Both jars contain
org/iq80/leveldb/Options.class; in the fat jar the duplicate-entry
write order leaves the 1.8 copy, which lacks Options.maxBatchSize(int).

Why tests didn't catch it:
- :plugins:test classpath has both jars side-by-side; the JVM
  classloader returns the first match, which is 1.18.2 (declared
  directly), so tests pass.
- The fat jar binaryRelease task uses zipTree to merge classes;
  duplicate entries are overwritten by later writes, so the surviving
  Options.class can be the 1.8 copy. Same classpath, opposite winner.
- ARM64 path declares only :platform directly (no 1.18.2), so there is
  no conflict; Archive tests are also excluded on ARM64.

Fix: exclude org.fusesource.leveldbjni:leveldbjni-all from the :crypto
implementation, mirroring the existing exclusion on the direct
:platform dependency. After this change the x86 runtimeClasspath
contains only io.github.tronprotocol:leveldbjni-all:1.18.2, and the
fat jar's Options.class always exposes maxBatchSize(int).

Verified on x86_64 + JDK 8 (eclipse-temurin:8-jdk container):
- :plugins:dependencies no longer lists leveldbjni-all 1.8
- java -jar Toolkit.jar db archive -d <empty> exits 0 with the
  expected "directory does not contain any database" message
- :plugins:test passes

After `./gradlew clean build` finishes, run the freshly built
Toolkit.jar through a few picocli subcommands so that runtime
class-loading errors in the fat jar are caught by CI rather than
slipping into a release.

Without this step, dependency-conflict bugs that survive `:plugins:test`
can still ship in the fat jar, because:
- the test classpath keeps source jars separate (classloader returns
  the first match), but
- the fat jar merges class entries with last-write-wins, so a
  different copy of a duplicated class can end up in the artifact.

The recent NoSuchMethodError on Options.maxBatchSize is exactly this
shape of bug: tests passed, fat jar broke. The smoke test catches it.

Smoke commands run on every platform build (macos / ubuntu-arm /
rockylinux-x86 / debian11-x86):
- `java -jar Toolkit.jar help` — exercises the top-level command tree
- `java -jar Toolkit.jar db --help` — loads all db subcommands
- `java -jar Toolkit.jar db archive -h` — directly exercises the
  leveldb Options path that previously broke
- `java -jar Toolkit.jar keystore --help` — exercises the :crypto
  module path added by the recent keystore migration

Each step adds ~10-15s; jobs run in parallel so PR critical path
increases by at most ~15s.

Comment thread plugins/build.gradle Outdated
// declared below on x86. Both jars contain org/iq80/leveldb/Options.class;
// in the fat jar the duplicate-entry write order can leave the 1.8 copy,
// which lacks Options.maxBatchSize(int) and breaks `db archive` at runtime.
exclude group: 'org.fusesource.leveldbjni', module: 'leveldbjni-all'
Collaborator

@halibobo1205 halibobo1205 May 1, 2026

The fix itself is correct, but the excludes aren't aligned.

Two ways to make this consistent — pick one:

  • A. Add the other two excludes on :crypto — if plugins should NOT bundle zksnark / commons-io.
  • B. Remove the other two excludes on :platform — if plugins actually NEEDS zksnark / commons-io.

Contributor Author

Good catch — verified via :plugins:dependencyInsight that both zksnark-java-sdk and commons-io were indeed leaking through :crypto -> :common -> :platform. Went with option A to preserve the existing :platform exclusion intent. Pushed 69d7f2156 aligning all three excludes on :crypto. Re-verified on x86_64 + JDK 8: runtimeClasspath no longer contains any of the three, smoke still passes.

Contributor Author

Update: option A worked for leveldbjni-all but broke :plugins:test on x86_64 (rockylinux CI just failed). DbLiteTest boots a Spring context that loads ZksnarkInitService, which actually pulls in classes from both commons-io (FileUtils) and zksnark-java-sdk (LibrustzcashWrapper) at runtime. Once both :platform and :crypto excluded those two, x86 testRuntimeClasspath lost them entirely (ARM64 was fine because the ARM branch declares :platform without excludes).

So the :platform-side excludes for zksnark and commons-io are deduplication, not a 'kept out of plugins' intent — the artifacts must keep arriving via :crypto -> :common -> :platform. Reverted those two in 320da62c3 and added a comment explaining why the asymmetry is intentional. leveldbjni-all remains mirrored because that one is the actual classpath conflict. :plugins:test passes again on x86 docker and smoke is still green.

Comment thread plugins/build.gradle Outdated
Collaborator

@halibobo1205 halibobo1205 May 1, 2026

The binaryRelease task in plugins/build.gradle declares

dependsOn (project(':protocol').jar, project(':platform').jar)

but its body consumes outputs from :crypto:jar and :common:jar via runtimeClasspath. Gradle prints an implicit_dependency warning and disables execution optimizations to compensate; under parallel / incremental / partial-build scenarios this can lead to the fat jar being assembled before :crypto:jar or :common:jar is up-to-date, producing broken artifacts or build failures. Future Gradle versions will likely promote this to a hard error.

Recommend adding project(':crypto').jar and project(':common').jar to the dependsOn list.

The log

> Task :plugins:buildToolkitJar
Execution optimizations have been disabled for task ':plugins:buildToolkitJar' to ensure correctness due to the following reasons:
  - Gradle detected a problem with the following location: '/Users/boson/IdeaProjects/java-tron/crypto/build/libs/crypto-1.0.0.jar'. Reason: Task ':plugins:buildToolkitJar' uses this output of task ':crypto:jar' without declaring an explicit or implicit dependency. This can lead to incorrect results being produced, depending on what order the tasks are executed. Please refer to https://docs.gradle.org/7.6.4/userguide/validation_problems.html#implicit_dependency for more details about this problem.
  - Gradle detected a problem with the following location: '/Users/boson/IdeaProjects/java-tron/common/build/libs/common-1.0.0.jar'. Reason: Task ':plugins:buildToolkitJar' uses this output of task ':common:jar' without declaring an explicit or implicit dependency. This can lead to incorrect results being produced, depending on what order the tasks are executed. Please refer to https://docs.gradle.org/7.6.4/userguide/validation_problems.html#implicit_dependency for more details about this problem.

Contributor Author

Confirmed locally — ./gradlew clean && ./gradlew :plugins:buildToolkitJar immediately fails with Cannot expand ZIP '.../crypto/build/libs/crypto-1.0.0.jar', exactly as you predicted. Pushed 0aebaffb3 adding :crypto:jar and :common:jar to the dependsOn list. After the fix the same command builds clean with --warning-mode all and no implicit_dependency warning. Thanks for catching this.
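
The resulting declaration can be sketched as follows. Only the four `dependsOn` entries come from this thread; the task name `exampleFatJar` is hypothetical and the `from` block is an assumed shape for a task that expands `runtimeClasspath` jars, so treat this as an illustration rather than the committed code.

```groovy
// Illustrative fat-jar task; name and body are assumptions.
tasks.register('exampleFatJar', Jar) {
    // Every project jar that the body expands is declared explicitly, so Gradle
    // can order the tasks and no longer reports implicit_dependency.
    dependsOn(project(':protocol').jar, project(':platform').jar,
            project(':crypto').jar, project(':common').jar)
    from {
        configurations.runtimeClasspath.collect { it.isDirectory() ? it : zipTree(it) }
    }
}
```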

Barbatos added 3 commits May 1, 2026 16:11
Per review on PR tronprotocol#6738: the direct :platform dependency on x86 already
excludes leveldbjni-all, zksnark-java-sdk, and commons-io, but the
transitive :crypto -> :common -> :platform path was only excluding
leveldbjni-all. Verified via :plugins:dependencyInsight that
zksnark-java-sdk and commons-io were leaking back into x86
runtimeClasspath through :crypto.

Mirror the remaining two excludes on :crypto so both paths to :platform
agree on what is kept out of the plugins fat jar. This preserves the
intent of the existing :platform excludes rather than relaxing them.

Re-verified on x86_64 + JDK 8:
- runtimeClasspath no longer contains leveldbjni-all 1.8,
  zksnark-java-sdk, or commons-io:commons-io
- java -jar Toolkit.jar db archive -d <empty> still exits 0
- :plugins:test, checkstyleMain, checkstyleTest all pass

…Release

Per review on PR tronprotocol#6738 (halibobo1205): the binaryRelease Jar task zips up
runtimeClasspath contents, which on this branch includes :crypto:jar and
:common:jar (introduced by tronprotocol#6637), but the task only declared dependsOn on
:protocol:jar and :platform:jar. Gradle was emitting an implicit_dependency
warning and disabling execution optimizations to compensate.

Reproduced locally:
  $ ./gradlew clean && ./gradlew :plugins:buildToolkitJar
  > Cannot expand ZIP '.../crypto/build/libs/crypto-1.0.0.jar' as it does
    not exist.

Add :crypto:jar and :common:jar to dependsOn so partial / parallel /
incremental builds can no longer race against missing dependency jars.

Verified:
- ./gradlew clean :plugins:buildToolkitJar passes with --warning-mode all
  and no implicit_dependency warning
- Smoke (db archive -d <empty>) still exits 0 on x86 docker
- :plugins:test, checkstyleMain, checkstyleTest all pass

CI on rockylinux (x86_64 + JDK 8) failed under 0aebaff:

  org.tron.plugins.rocksdb.DbLiteRocksDbTest > testToolsWithRocksDB FAILED
    org.springframework.beans.factory.BeanCreationException: Error creating
    bean with name 'zksnarkInitService' ... NoClassDefFoundError:
    org/apache/commons/io/FileUtils

After commit 69d7f21 mirrored the :platform excludes onto :crypto
(reviewer's option A in tronprotocol#6738), x86 testRuntimeClasspath lost
commons-io:commons-io and io.github.tronprotocol:zksnark-java-sdk
entirely: both arrived only via :crypto -> :common -> :platform, which
is now also excluding them. ARM64 was unaffected because the ARM64
branch declares :platform without exclusions.

Diagnosis: the DbLiteTest Spring boot path loads
org.tron.core.zen.ZksnarkInitService, which references both
org.apache.commons.io.FileUtils and
org.tron.common.zksnark.LibrustzcashWrapper at runtime. Reproduced on
x86_64 docker:

  - removing only commons-io exclude -> FAILED at LibrustzcashWrapper
  - removing both exclusions          -> testToolsWithRocksDB PASSED

The :platform-side excludes for these two artifacts are therefore
deduplication only, not a "kept out of plugins" intent. The
leveldbjni-all exclude is the only one that must be mirrored, because
that one is the actual classpath conflict.

This commit drops the zksnark-java-sdk and commons-io excludes from
:crypto and adds a comment recording why they are intentionally
asymmetric with :platform.

Verified on x86_64 + JDK 8 (eclipse-temurin:8-jdk):
- :plugins:test passes (was 4 failed under 0aebaff)
- runtimeClasspath still does NOT contain leveldbjni-all 1.8
- runtimeClasspath now contains commons-io and zksnark-java-sdk
- Toolkit.jar smoke (db archive -d <empty>) still exits 0

Comment thread plugins/build.gradle Outdated
// jars carry org/iq80/leveldb/Options.class; the duplicate-entry write
// order in the fat jar can leave the 1.8 copy, which lacks
// Options.maxBatchSize(int) and breaks `db archive` at runtime.
exclude group: 'org.fusesource.leveldbjni', module: 'leveldbjni-all'
Collaborator

Maybe exclude just for x86_64? For reference, the existing :platform declaration:

    if (rootProject.archInfo.isArm64) {
        testRuntimeOnly group: 'org.fusesource.hawtjni', name: 'hawtjni-runtime', version: '1.18' // for test
        implementation project(":platform")
    } else {
        implementation project(":platform"), {
            exclude(group: 'org.fusesource.leveldbjni', module: 'leveldbjni-all')
            exclude(group: 'io.github.tronprotocol', module: 'zksnark-java-sdk')
            exclude(group: 'commons-io', module: 'commons-io')
        }
        implementation 'io.github.tronprotocol:leveldbjni-all:1.18.2'
        implementation 'io.github.tronprotocol:leveldb:1.18.2'
    }

Contributor Author

Good call — applied in 26f463b72. Wrapped the leveldbjni-all exclude on :crypto in if (!rootProject.archInfo.isArm64) to mirror the existing arch split on the :platform-direct declaration. Behavior is preserved (ARM64 still resolves leveldbjni-all 1.8 via :platform direct; x86 still resolves only the io.github.tronprotocol 1.18.2 fork) and the comment now reads as 'this fix is x86-specific' rather than a generic policy. Verified :plugins:test and the Toolkit smoke on both arches.
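
A sketch of the gated form, assuming the condition sits inside the dependency's configuration closure. The reply above only says the exclude is wrapped in `if (!rootProject.archInfo.isArm64)`, so the committed layout may differ, for example by using a separate if/else around the whole declaration.

```groovy
dependencies {
    implementation(project(':crypto')) {
        // x86 only: ARM64 has no second leveldbjni-all on the classpath,
        // so there is nothing to exclude there.
        if (!rootProject.archInfo.isArm64) {
            exclude group: 'org.fusesource.leveldbjni', module: 'leveldbjni-all'
        }
    }
}
```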

Per follow-up review on PR tronprotocol#6738 (halibobo1205): the
:platform-direct dependency in plugins/build.gradle is already split by
arch (`if (rootProject.archInfo.isArm64) ... else ...`). The
:crypto-side leveldbjni-all exclude is also only meaningful on x86 — on
ARM64 the only leveldbjni-all on the runtime classpath comes via the
direct :platform declaration at version 1.8, with no second copy to
conflict with — so the unconditional exclude was a no-op on ARM64.

Wrap the exclude in the same `if (!isArm64)` gate to match the existing
arch pattern in this file and to make intent explicit (the exclusion is
a fix for the x86-only fat-jar duplicate-class collision, not a
defensive policy on every architecture).

Behaviour-preserving on both arches:
- ARM64: runtimeClasspath still has org.fusesource.leveldbjni:leveldbjni-all:1.8
  via :platform direct (Gradle dedups the :crypto path); :plugins:test
  passes locally
- x86: runtimeClasspath still has only io.github.tronprotocol:leveldbjni-all:1.18.2;
  :plugins:test passes and `db archive -d <empty>` smoke exits 0

@kuny0707 kuny0707 requested a review from halibobo1205 May 1, 2026 15:22