Enable C++ PPR Sampling #556
Conversation
Force-pushed from 4d2db2a to cfcb8cb
Force-pushed from 7009d70 to dd118ef
Merge …/GiGL into mkolodner-sc/cpp-infrastructure
/unit_test
GiGL Automation @ 21:35:08 UTC 🔄 → 21:37:04 UTC: ✅ Workflow completed successfully.
GiGL Automation @ 21:35:10 UTC 🔄 → 21:44:18 UTC: ✅ Workflow completed successfully.
GiGL Automation @ 21:35:13 UTC 🔄 → 22:29:26 UTC: ❌ Workflow failed.
    ref_score = reference_ppr[ntype_str][node_id]
    sam_score = ntype_to_sampler_ppr[ntype_str][node_id]
-   assert abs(sam_score - ref_score) < 1e-6, (
+   assert abs(sam_score - ref_score) < 1e-5, (
This needed to be updated from 1e-6 to 1e-5 for the margin of error. With the C++ conversions, my understanding is that we have lost some float precision here, but the 1e-5 margin should still be sufficient for validating that the PPR algorithm is correct.
It's a little odd that we lost precision; have we looked into why?
I think it's fine to update this, I'm just concerned if we don't know why.
The robots have some thoughts about this; if it is floating-point ordering then it's fine imo.
The tolerance in both assertions was loosened from 1e-6 (Python original)
to 1e-5. The justification given in the comments at lines 273-275 and
375-378 — "theoretical bound is ~1.5e-6, so 1e-5 provides a safety margin"
— is true for the new C++ implementation, but the Python implementation
passed at 1e-6 with observed deltas around 1e-7.
The C++ kernel is supposed to be mathematically equivalent. A 10x increase
in observed error suggests the floating-point summation order changed —
likely because unordered_map iteration order differs from Python dict
insertion order, and the residual push at ppr_forward_push.cpp:178-225
sums in nondeterministic order.
The original pure-Python implementation used Python float (float64) for scores end-to-end. The C++ kernel keeps internal scores as double, but extractTopK returns float32 tensors to match PyG's edge-weight convention; that should be the new source of precision loss. 1e-5 accounts for it.
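For scale, a minimal sketch of how much a single double → float32 conversion can move a score, using a pure-Python struct round-trip to stand in for the tensor conversion (roundtrip_f32 and the example value are illustrative, not actual GiGL code):

```python
import struct

def roundtrip_f32(x: float) -> float:
    # Pack a Python float (IEEE-754 binary64) into binary32 and back,
    # mimicking a double score emitted as a float32 tensor entry.
    return struct.unpack("f", struct.pack("f", x))[0]

score = 0.123456789012345               # an illustrative PPR score
err = abs(roundtrip_f32(score) - score)
# A single conversion perturbs a score by at most about |x| * 2**-24
# (~6e-8 for scores near 1), comfortably inside a 1e-5 tolerance.
```

Note this bound is well below the ~1e-6 deltas observed, which is consistent with the conversion alone not being the whole story.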
Wait did we use to always use float64 / double? Or never convert to float?
Hmm, I believe we used to use doubles here. I've investigated this a bit further and tried changing all the floats to doubles in the C++ code, but it is still not within the 1e-6 tolerance. The robots came up with a potential reason similar to yours above:
The delta is purely a consequence of std::unordered_set iterating in a different order than CPython's set for small integer node IDs — different processing order within each push step leads to a different (but valid) residual distribution at convergence.
An example of this:
Iteration 2 queue = {1, 2}, both with residual 0.125
- Order: process 1 before 2 — when 1 pushes, it adds residual to 2 (bumping 2 to 0.15625). Then 2 pushes with 0.15625, so ppr[2] = 0.15625.
- Order: process 2 before 1 — 2 pushes first with 0.125, so ppr[2] = 0.125. The extra 0.03125 from node 1 lands back in 2's residual for a later round.
Both paths converge to the same true PPR eventually, but when convergence is declared (residuals just drop below threshold), the absorbed ppr_hat values can differ by up to the remaining residual magnitude. C++ unordered_set ordering for int keys differs from CPython's set ordering (both leave the iteration order of unordered containers unspecified), giving a different residual distribution at termination — hence the ~1.38e-6 delta.
Given the theoretical bound is 1.5e-6, we should probably have set it to 2e-6 in the first place. Running the tests with that tolerance succeeds.
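To make the order-dependence concrete, here is a minimal, self-contained forward-push sketch (plain Python on a toy graph; forward_push and its order parameter are illustrative, not the actual kernel). Two runs that differ only in which above-threshold node is pushed first produce different scores at a loose threshold, yet both stay within the residual-based bound of a near-exact run:

```python
def forward_push(neighbors, seed, alpha=0.15, eps=1e-3, order=min):
    # Forward-push PPR. `order` picks which above-threshold node is pushed
    # next; it stands in for the C++ unordered_set vs CPython set iteration
    # order. Assumes every node has at least one out-neighbor.
    ppr = {}                 # absorbed probability mass per node
    residual = {seed: 1.0}   # mass not yet absorbed
    while True:
        frontier = [u for u, r in residual.items() if r > eps]
        if not frontier:
            break
        u = order(frontier)
        r = residual.pop(u)
        ppr[u] = ppr.get(u, 0.0) + alpha * r            # absorb alpha * r
        share = (1.0 - alpha) * r / len(neighbors[u])   # push the rest out
        for v in neighbors[u]:
            residual[v] = residual.get(v, 0.0) + share
    return ppr

# Toy graph: node 0 feeds 1 and 2, which then feed each other.
nbrs = {0: [1, 2], 1: [2], 2: [1]}
reference = forward_push(nbrs, 0, eps=1e-12)    # near-exact run
low_first = forward_push(nbrs, 0, order=min)    # push node 1 first
high_first = forward_push(nbrs, 0, order=max)   # push node 2 first
# low_first and high_first disagree on nodes 1 and 2, but both sit within
# the residual bound (sum of leftover residuals) of the reference.
```

Both loose runs are valid answers at eps=1e-3; only the leftover-residual distribution at termination differs, which is exactly the effect described above.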
kmontemayor2-sc
left a comment
Took a first pass, thanks Matt.
I was leaving a good amount of perf-related comments but we probably don't need to address those.
I do think we should address the file structure and variable names, it's hard to know what pk is, for instance.
    for (int32_t s = 0; s < _batchSize; ++s) {
      for (int32_t nt = 0; nt < _numNodeTypes; ++nt) {
Can we give these loop vars better names too? batchNode and nodeType maybe? nt is better but s is kinda magical (in a bad way)
Renamed throughout: s → seedIdx, nt → nodeTypeId, eid → edgeTypeId, dstNt → dstNodeTypeId, peid → neighborEdgeTypeId, pk → packedKey.
    # num_sampled_edges may not exist at all (e.g. in tests or when GLT doesn't
    # populate it), and may lack entries for edge types with zero samples.
    if hasattr(data, "num_sampled_edges"):
        data.num_sampled_edges.pop(edge_type, None)
Hmmm, why did this only surface now?
Scope of work done
PPR-Based Neighbor Sampling for GiGL
Adds C++-based PPR (Personalized PageRank) neighbor sampling as an alternative to k-hop sampling in GiGL's distributed training stack, replacing the previous pure-Python implementation.
C++ Kernel (gigl-core/csrc/sampling/)
- PPRForwardPushState: a pybind11-wrapped C++ class that owns all hot-loop state (scores, residuals, queue, neighbor cache).
- run_in_executor.
- gigl-core is a separately installable package (uv pip install -e gigl-core/) with a py.typed marker and .pyi stub for full mypy coverage.

Python Sampler (gigl/distributed/dist_ppr_sampler.py)
- DistPPRNeighborSampler extends BaseGiGLSampler and integrates with the existing DistNeighborLoader / Graph Store infrastructure.
- Produces (seed_type, "ppr", neighbor_type) edge types with edge_index ([2, N] int64) and edge_attr ([N] float PPR scores).
- max_fetch_iterations: Optional[int] = None caps RPC calls per batch; the algorithm continues to convergence using cached neighbor lists after the budget is exhausted.
- num_neighbors_per_hop defaults to 1000; high-degree hub nodes receive diminishing residual per neighbor, so capping the fetch has negligible effect on PPR accuracy.

Configuration
- PPRSamplerOptions added to sampler_options.py alongside the existing KHopNeighborSamplerOptions; pass via sampler_options= on DistNeighborLoader.

C++ Tests (gigl-core/tests/ppr_forward_push_test.cpp)
- tests/CMakeLists.txt updated to auto-link torch and auto-discover kernel sources under csrc/ (excluding python_* entry points).

Where is the documentation for this feature?: N/A
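Putting the configuration pieces described above together, a hypothetical usage sketch. Class and parameter names are taken from this description; the module paths, constructor signatures, and the loader's other arguments are assumptions:

```python
# Hypothetical sketch only: import paths and signatures are assumed,
# not taken from the actual GiGL source.
from gigl.distributed import DistNeighborLoader
from gigl.distributed.sampler_options import PPRSamplerOptions

opts = PPRSamplerOptions(
    num_neighbors_per_hop=1000,  # default; caps neighbors fetched per node
    max_fetch_iterations=None,   # None = no RPC budget per batch
)
loader = DistNeighborLoader(
    ...,                         # dataset / graph-store arguments elided
    sampler_options=opts,        # selects PPR sampling instead of k-hop
)
# Each yielded batch would then carry (seed_type, "ppr", neighbor_type)
# edge types with edge_index ([2, N] int64) and edge_attr ([N] PPR scores).
```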
Did you add automated tests or write a test plan?
Updated Changelog.md? NO
Ready for code review?: NO