feat(net): reduce sync memory via lazy parsing and throttling #6717
xxo1shine wants to merge 1 commit into tronprotocol:develop
Conversation
… and throttling in-flight requests
private final int maxPendingBlockNum = Args.getInstance().getMaxPendingBlockNum();
private static volatile long maxRequestedBlockNum = 0;
[MUST] This should be an instance field, not static.
Every other piece of sync state on this class (blockWaitToProcess, blockJustReceived, maxPendingBlockNum) is instance-scoped; only this one is static, with no apparent reason. Any scenario that recreates the Spring context (integration tests, hot reload, future multi-instance use) will leak state across instances.
Direct evidence: SyncServiceTest has to reset it via reflection (SyncServiceTest.java:185-187) to avoid pollution between test cases.
Suggested change:
private volatile long maxRequestedBlockNum = 0;
The reflection reset in the test can then be removed as well. If only the single-threaded fetchExecutor writes to it after this change, volatile can also be dropped.
processSyncBlock(msg.getBlockCapsule(), peerConnection);
peerConnection.getSyncBlockInProcess().remove(msg.getBlockId());
BlockCapsule block;
try {
[MUST] On failure this branch only logs and returns, which leaves three problems:
1. `unparsedBlock` stays in `blockWaitToProcess` — the next tick re-parses, re-fails, and the entry permanently consumes one `maxPendingBlockNum` slot.
2. `peer.getSyncBlockInProcess()` is not cleared for this `blockId`.
3. `invalid(blockId, peerConnection)` is not called, so the peer is not penalised and the block is not rescheduled to another peer.
The trigger probability is very low (bytes were already parsed once in the BlockMessage ctor), but the cleanup should mirror the isDisconnect() branch directly above:
} catch (Exception e) {
logger.warn("Deserialize block {} failed", blockId.getString(), e);
blockWaitToProcess.remove(unparsedBlock);
peerConnection.getSyncBlockInProcess().remove(blockId);
invalid(blockId, peerConnection);
return;
}
});
}
private synchronized void handleSyncBlock() {
[SHOULD] Missing wake-up safety net when the throttle saturates.
When the budget is full, startFetchSyncBlock returns having issued zero requests, and fetchFlag has already been consumed (set to false). Re-arming it depends on a later event from processBlock / invalid. If no new block arrives and no peer disconnects in that window, the next fetch has to wait for the cron tick and an external event.
Since handleSyncBlock is exactly the place that frees budget, the cheapest fix is to set fetchFlag = true; at the end of the if (isFound[0]) { ... } block (after processSyncBlock). Negligible cost, removes the corner case.
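A minimal sketch of that safety net, assuming `fetchFlag` is the existing flag consumed by `startFetchSyncBlock` and `isFound[0]`/`processSyncBlock` are as in the current loop body (the surrounding code is abbreviated):

```java
private synchronized void handleSyncBlock() {
  // ... select a processable entry from blockWaitToProcess ...
  if (isFound[0]) {
    // ... deserialize the UnparsedBlock and call processSyncBlock(...) ...

    // Budget was just freed here; re-arm the fetch loop so the next
    // startFetchSyncBlock pass does not depend on an unrelated external
    // event (new block, peer disconnect) to fire.
    fetchFlag = true;
  }
}
```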
}
method.invoke(service);
// highBlockId must NOT be requested: remainNum <= 0 and num > maxRequestedBlockNum
Assert.assertNull(peer.getSyncBlockRequested().get(highBlockId));
[SHOULD] Missing test for the maxRequestedBlockNum exemption path.
Current coverage only asserts the hard-cap case (budget exhausted + num > maxRequestedBlockNum -> block not requested). The PR's deadlock-avoidance design specifically allows budget exhausted + num <= maxRequestedBlockNum to go through as a retry, and that path is untested - a future refactor could break it silently.
Suggestion: add a symmetric case where maxRequestedBlockNum is set to e.g. 100, blockWaitToProcess is filled to maxPendingBlockNum, the target blockId.num = 50, and assert peer.getSyncBlockRequested().get(blockId) != null.
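A hedged sketch of that symmetric case, in the same reflection-driven style the existing test uses; `setField`, `fillBlockWaitToProcess`, and `makeBlockId` are hypothetical helpers standing in for whatever setup `SyncServiceTest` already has:

```java
@Test
public void testRetryAllowedWithinRequestedCeiling() throws Exception {
  // Hypothetical setup helpers; in practice the existing test's
  // reflection utilities would be reused.
  setField(service, "maxRequestedBlockNum", 100L);      // previously-requested ceiling
  fillBlockWaitToProcess(service, maxPendingBlockNum);  // exhaust the budget

  BlockId retryId = makeBlockId(50);                    // num = 50 <= 100
  peer.getSyncBlockToFetch().add(retryId);

  Method method = SyncService.class.getDeclaredMethod("startFetchSyncBlock");
  method.setAccessible(true);
  method.invoke(service);

  // Exemption path: budget exhausted, but num <= maxRequestedBlockNum,
  // so the retry must still be issued.
  Assert.assertNotNull(peer.getSyncBlockRequested().get(retryId));
}
```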
What does this PR do?
Cap memory usage during block sync by deferring block deserialization and bounding in-flight block requests.
- New `UnparsedBlock` carrier — lightweight `(BlockId, byte[])` pair (framework/.../net/service/sync/UnparsedBlock.java). Holds the parsed block id (cheap) plus the raw protobuf bytes; the block body is not deserialized until the worker thread is ready to process it. Implements `equals`/`hashCode` on `blockId` so it works as a map key (see the sketches after this list).
- Defer deserialization in sync queues — in `SyncService`:
  - `blockWaitToProcess` and `blockJustReceived` change from `Map<BlockMessage, PeerConnection>` to `Map<UnparsedBlock, PeerConnection>`
  - `processBlock(peer, blockMessage)` builds the `UnparsedBlock` from `blockMessage.getData()` instead of inserting the fully-parsed message
  - `handleSyncBlock` parses the bytes back into a `BlockMessage` only when the block is about to be applied
- Throttle in-flight block requests via `node.maxPendingBlockNum` — new config controlling the total budget of `requested + justReceived + waitToProcess` blocks across all peers:
  - default `500`, clamped to `[50, 2000]` in `NodeConfig.postProcess()`
  - wired through `NodeConfig.maxPendingBlockNum` → `Args.applyNodeConfig` → `CommonParameter.maxPendingBlockNum`
  - declared in `reference.conf`; example documented in `config.conf`
  - `startFetchSyncBlock` becomes budget-aware — computes `remainNum = maxPendingBlockNum - requested - justReceived - waitToProcess`; once the budget is exhausted it stops requesting new heights, with one exception: blocks at or below `maxRequestedBlockNum` (the previously-requested ceiling) are still allowed through, so a slow peer can be retried without stalling the whole sync.
- Tests — `SyncServiceTest` updated to construct `UnparsedBlock` carriers in place of raw `BlockMessage` puts, plus boundary-condition coverage for the throttle path.
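A minimal sketch of the `UnparsedBlock` carrier described above; the accessor names and exact constructor shape are assumptions, not the file's verbatim contents:

```java
import org.tron.core.capsule.BlockCapsule.BlockId;

// Lazy carrier: the cheap-to-parse BlockId plus the raw protobuf bytes.
// The block body is deserialized only when handleSyncBlock is ready to
// apply it, so queued blocks cost bytes, not parsed object graphs.
public final class UnparsedBlock {

  private final BlockId blockId;
  private final byte[] data;

  public UnparsedBlock(BlockId blockId, byte[] data) {
    this.blockId = blockId;
    this.data = data;
  }

  public BlockId getBlockId() {
    return blockId;
  }

  public byte[] getData() {
    return data;
  }

  // equals/hashCode delegate to blockId so the carrier works as a map key
  // in blockWaitToProcess and blockJustReceived.
  @Override
  public boolean equals(Object o) {
    return this == o
        || (o instanceof UnparsedBlock
            && blockId.equals(((UnparsedBlock) o).blockId));
  }

  @Override
  public int hashCode() {
    return blockId.hashCode();
  }
}
```

And a sketch of the budget check in `startFetchSyncBlock`, under the same naming assumptions (the per-peer iteration and the actual request send are abbreviated):

```java
// Budget across ALL peers: requested + justReceived + waitToProcess.
long requested = getActivePeers().stream()
    .mapToLong(p -> p.getSyncBlockRequested().size())
    .sum();
long remainNum = maxPendingBlockNum
    - requested - blockJustReceived.size() - blockWaitToProcess.size();

for (BlockId blockId : peer.getSyncBlockToFetch()) {
  if (remainNum <= 0 && blockId.getNum() > maxRequestedBlockNum) {
    // Budget exhausted and this would be a new height: stop here.
    // Heights <= maxRequestedBlockNum are still allowed through as
    // retries, so a stalled peer cannot deadlock the sync.
    break;
  }
  // ... issue the fetch request, decrement remainNum, and advance
  // maxRequestedBlockNum for newly requested heights ...
}
```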
Why are these changes required?
- Sync-time heap was unbounded. Every received block was parsed eagerly into a `BlockMessage` and held in two maps until processing finished. Each `BlockMessage` keeps the parsed `Block` proto with every transaction expanded into Java objects, which inflates the on-heap footprint several times over the wire size. During catch-up sync from many peers, this routinely caused multi-GB heap growth and GC pressure, with documented OOM incidents on smaller-RAM nodes.
- `UnparsedBlock` collapses the worst case. Raw bytes are 1–2 MB per block; the parsed in-memory representation is significantly larger. Keeping the bytes raw until processing defers (and amortizes) the parse cost to the worker that actually needs it, and lets the in-flight buffer grow in proportion to bytes rather than to the parsed object graph.
- Concurrent peer fetching had no global cap. A peer aggressively shipping inventory could push the `blockJustReceived` map far past safe levels, because the receiver only tracked the per-peer `MAX_BLOCK_FETCH_PER_PEER`. Bounding `requested + justReceived + waitToProcess` makes the worst-case in-flight memory predictable: `maxPendingBlockNum × avg block size` (with the default of 500 and raw blocks of at most ~2 MB, roughly 1 GB of buffered bytes).
- The `maxRequestedBlockNum` exemption avoids deadlock. A pure hard cap could stall sync if every peer holding a needed block goes idle while the budget is full — the only way to make progress is to retry the height, which requires re-requesting a block already in the budget. Allowing retries within the existing ceiling keeps sync making forward progress under transient peer flakiness.
- Operators need the knob. The default `500` is conservative for nodes with ≤ 8 GB heap; large/dedicated nodes benefit from raising it for higher sync throughput, and constrained environments can lower it. Surfacing it as `node.maxPendingBlockNum` (rather than a constant) lets operators tune without a release.

This PR has been tested by:
Follow up
Extra details
Fixes #6685