
Add online CUDA lookup and GPU/cuCollections benchmark scaffolding #83

Merged: tpn merged 38 commits into main from 78-online-cuda-gpu-bench on Apr 18, 2026

tpn (Owner) commented Mar 30, 2026

Closes #78.

Summary

  • add online-JIT CUDA export APIs for emitted CUDA source, table payloads, and table metadata
  • add standalone GPU benchmark/examples for NVRTC-based lookup plus cuCollections static_map and static_multiset baselines
  • add TPC-H query-probe extraction utilities for real build/probe stream benchmarking
  • harden the new benchmarks and emitter paths based on repeated review feedback around verification, portability, runtime loading, and edge cases
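To make the "verification" point concrete, the following is a minimal sketch of the kind of check a perfect-hash lookup benchmark can run against its own results: every build key must map to a distinct, in-range slot that holds that key, and a non-member key must be rejected. It is a toy in plain Python and assumes distinct integer keys. The names `toy_hash`, `build`, `lookup`, and `verify` are hypothetical, and the brute-force seed search is not PerfectHash's algorithm or its API.

```python
MASK32 = 0xFFFFFFFF

def toy_hash(key, seed, table_size):
    # Hypothetical seeded hash for illustration only (not PerfectHash's).
    x = ((key ^ seed) * 0x9E3779B1) & MASK32
    return (x >> 16) % table_size

def build(keys, table_size, max_seed=100_000):
    """Try seeds until every (distinct) key lands in its own slot."""
    for seed in range(max_seed):
        table = [None] * table_size
        for k in keys:
            i = toy_hash(k, seed, table_size)
            if table[i] is not None:
                break                  # collision: try the next seed
            table[i] = k
        else:
            return seed, table         # no collisions for this seed
    raise ValueError("no collision-free seed found")

def lookup(key, seed, table):
    """Return the slot index for key, or None if key is not in the table."""
    i = toy_hash(key, seed, len(table))
    return i if table[i] == key else None

def verify(keys, seed, table):
    """Check that every key resolves to a unique slot holding that key."""
    indices = [lookup(k, seed, table) for k in keys]
    return None not in indices and len(set(indices)) == len(keys)

keys = [3, 17, 42, 1001, 65536]
seed, table = build(keys, table_size=8)
print("verified:", verify(keys, seed, table))
print("non-member 7 ->", lookup(7, seed, table))
```

A real benchmark would typically run the same check on results copied back from the device and compare them with a CPU baseline.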

Validation

  • cmake --build build-online-llvm-jit --config Release --target PerfectHashOnlineCore -j 8
  • python -m py_compile examples/tpch-query-probes/extract_tpch_query_probes.py examples/tpch-query-probes/partition_hot_subset.py
  • cmake --build /tmp/ph-cuco-map-build -j 8
  • cmake --build /tmp/ph-cuco-multiset-build -j 8
  • cmake -S examples/cpp-console-online-cuda-nvrtc -B /tmp/ph-online-cuda-build -DPERFECTHASH_GIT_REPOSITORY=file:///home/trentn/src/perfecthash -DPERFECTHASH_GIT_TAG=78-online-cuda-gpu-bench -DPERFECTHASH_BUILD_PROFILE=online-llvm-jit >/dev/null && cmake --build /tmp/ph-online-cuda-build -j 8

Review

  • repeated roborev branch reviews were run during the rebase/update cycle
  • the final daemon-backed Codex review was blocked by provider auth issues (401 Unauthorized on gpt-5.4), so the last clean review pass was run via claude-code; the remaining low-severity cleanup nits were then incorporated directly

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: adc9fe4f5a


Comment thread examples/cpp-console-online-cuda-nvrtc/src/main.cpp
Comment thread src/PerfectHash/PerfectHashOnlineJit.c Outdated
tpn force-pushed the 78-online-cuda-gpu-bench branch from bf92a88 to ea72ebd on April 9, 2026 21:02
Copilot AI review requested due to automatic review settings April 18, 2026 01:45

Copilot AI left a comment


Pull request overview

This PR adds an “online CUDA export” surface to PerfectHash’s OnlineJit tables (CUDA source + table payload + metadata) and introduces standalone GPU benchmark/example drivers (NVRTC-based PerfectHash lookup plus cuCollections static_map/static_multiset baselines), along with TPC-H probe-stream extraction utilities for more realistic build/probe benchmarking.

Changes:

  • Add OnlineJit APIs to export generated CUDA lookup source, table payload bytes, and exported table metadata.
  • Add standalone GPU benchmark/example projects for NVRTC compilation + cuCollections baselines.
  • Add Python utilities + docs to extract and post-process TPC-H query-driven build/probe streams.
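As a rough illustration of post-processing a probe stream, the sketch below splits the distinct probe keys into a "hot" set (the most frequently probed keys) and a "cold" remainder. This is one plausible reading of hot-subset partitioning, not the logic of `partition_hot_subset.py`. The function name `partition_hot` and the `hot_fraction` parameter are hypothetical.

```python
from collections import Counter

def partition_hot(probe_keys, hot_fraction=0.25):
    """Split distinct probe keys into (hot, cold) sets by probe frequency."""
    counts = Counter(probe_keys)
    if not counts:
        return set(), set()
    # Keep at least one hot key so the hot set is never empty.
    n_hot = max(1, int(len(counts) * hot_fraction))
    # Most-probed first; ties broken by key value for determinism.
    ranked = sorted(counts, key=lambda k: (-counts[k], k))
    return set(ranked[:n_hot]), set(ranked[n_hot:])

probes = [1, 1, 1, 2, 2, 3, 4]
hot, cold = partition_hot(probes, hot_fraction=0.5)
print(sorted(hot), sorted(cold))  # [1, 2] [3, 4]
```

Benchmarking the hot and cold sets separately is one way to see how a lookup structure behaves under skewed versus uniform access.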

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/PerfectHash/PerfectHashOnlineJit.c | Adds CUDA source/table export + table info API; adjusts compile fallback logic; adds Index32x2 wrapper. |
| include/PerfectHash/PerfectHashOnlineJit.h | Exposes new public structs/flags and CUDA export APIs. |
| src/PerfectHash/PerfectHash.def | Exports new OnlineJit CUDA/table-info symbols. |
| src/PerfectHash/Chm01FileWorkCudaSourceFile.c | Hardens generated CUDA source output (optional “library-only” emission; seed sourcing changes). |
| examples/cpp-console-online-jit/src/main.cpp | Adds CLI plumbing to dump generated CUDA source to stdout/file. |
| examples/cpp-console-online-jit/cmake/FindPerfectHashOnlineJit.cmake | Broadens header discovery for different include layouts. |
| examples/tpch-query-probes/* | Adds TPC-H probe extraction + hot/cold partitioning tools and README. |
| examples/cpp-console-online-cuda-nvrtc/* | New NVRTC-based PerfectHash GPU lookup benchmark/example with CPU baseline + verification. |
| examples/cpp-console-cuco-static-map-bench/* | New cuCollections static_map baseline benchmark + build scaffolding/docs. |
| examples/cpp-console-cuco-static-multiset-bench/* | New cuCollections static_multiset baseline benchmark + build scaffolding/docs. |


Comment thread src/PerfectHash/PerfectHashOnlineJit.c
Comment thread examples/tpch-query-probes/partition_hot_subset.py Outdated
Comment thread src/PerfectHash/PerfectHashOnlineJit.c Outdated
Comment thread src/PerfectHash/PerfectHashOnlineJit.c
tpn merged commit 9e713a2 into main on Apr 18, 2026
14 checks passed