This repository was archived by the owner on Apr 22, 2026. It is now read-only.

Liupeter01/libHPC


libHPC

libHPC is a high-performance computing library focused on Linux and Windows environments. It provides SIMD-optimized kernels, concurrent data structures, GPU utilities, and HPC-oriented memory management components.

Project Status

This public archive preserves the state of libHPC at the point where its core HPC primitives — GPU radix sort, ABA-safe lock-free queue, SIMD kernels, and cache-hierarchy benchmarks — reached a stable, validated milestone.

Active development continues privately. The archive is retained for study, reference, and portfolio purposes; commercial use, redistribution, or derivative proprietary work without explicit permission is not permitted.

Platform support status, known limitations, and benchmarks are documented in the sections below.

0x00 Platform Support

Platform                        | Status
Linux (x86_64 / CUDA)           | ✓ Supported
Windows (MSVC / CUDA)           | ✓ Supported
macOS (Intel)                   | ✓ Supported, limited
macOS (Apple Silicon / ARM64)   | Not supported

0x01 macOS Apple Silicon Notice

libHPC does not support macOS ARM (Apple Silicon).

The reason is simple:

Apple’s recent macOS / Xcode toolchain updates introduced ABI changes in libc++, causing oneTBB and other HPC components to fail at link-time.
Specifically, std::__1::__hash_memory, a critical dependency for oneTBB, has been removed/hidden at the SDK level. These issues do not occur on Linux or Windows, and they did not occur on older macOS versions.

Since the goal of libHPC is stable, reproducible high-performance computing, macOS ARM is excluded to avoid degraded reliability or performance.
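
One way a project could make this exclusion explicit at build time is a compile-time platform check. The following is a minimal sketch, not libHPC's actual mechanism: the macro `LIBHPC_SKETCH_APPLE_SILICON` and the function `is_supported_platform` are hypothetical names introduced for illustration. Only the compiler-predefined macros `__APPLE__`, `__aarch64__`, and `__arm64__` are real.

```cpp
// Hypothetical sketch: detect macOS on Apple Silicon at compile time.
// LIBHPC_SKETCH_APPLE_SILICON and is_supported_platform() are illustrative
// names and are not part of libHPC.
#if defined(__APPLE__) && (defined(__aarch64__) || defined(__arm64__))
#define LIBHPC_SKETCH_APPLE_SILICON 1
#else
#define LIBHPC_SKETCH_APPLE_SILICON 0
#endif

// True on every target except macOS ARM64.
constexpr bool is_supported_platform() {
    return LIBHPC_SKETCH_APPLE_SILICON == 0;
}
```

A build that wants a hard failure instead of a runtime query could pass `is_supported_platform()` to a `static_assert` with a message pointing at this notice.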


0x02 GPU Performance Optimization Highlights

libHPC includes GPU-accelerated kernels optimized for high-throughput computation on NVIDIA CUDA-compatible devices:

  • Radix-Sort Kernel: Processes 500M elements in ~360 ms on an RTX 3080 Ti (laptop), sustaining ~1.39B elements/sec throughput.
  • Warp-Synchronous & Tiled Memory Layouts: Maximizes shared memory utilization and minimizes global memory latency.
  • Concurrent GPU Pipelines: Supports asynchronous kernel launches and stream-based scheduling for overlapping compute and memory operations.
  • Profiling & Validation: Includes tools for warp efficiency, memory access analysis, and synchronization correctness across GPU architectures.
  • Realistic HPC Throughput: Designed for bulk-parallel computation and scientific workloads, not real-time ultra-low-latency trading systems.
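
To make the radix-sort figures above concrete, here is a minimal CPU reference sketch of least-significant-digit radix sort on 32-bit keys, together with the throughput arithmetic. It is not libHPC's GPU kernel: the function names `radix_sort_u32` and `elements_per_second`, the 8-bit digit width, and the single-threaded structure are all assumptions chosen for illustration. A GPU version would typically parallelize the same three steps per pass (histogram, prefix sum, scatter).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference: LSD radix sort, 8 bits per pass, 4 passes.
// Each pass is a stable counting sort on one byte of the key.
void radix_sort_u32(std::vector<std::uint32_t>& keys) {
    constexpr int kBitsPerPass = 8;
    constexpr std::size_t kBuckets = std::size_t{1} << kBitsPerPass;
    std::vector<std::uint32_t> scratch(keys.size());

    for (int shift = 0; shift < 32; shift += kBitsPerPass) {
        // Step 1: histogram of the current digit.
        std::array<std::size_t, kBuckets> offset{};
        for (std::uint32_t k : keys) {
            ++offset[(k >> shift) & (kBuckets - 1)];
        }
        // Step 2: exclusive prefix sum turns counts into start offsets.
        std::size_t running = 0;
        for (std::size_t b = 0; b < kBuckets; ++b) {
            const std::size_t count = offset[b];
            offset[b] = running;
            running += count;
        }
        // Step 3: stable scatter into the scratch buffer.
        for (std::uint32_t k : keys) {
            scratch[offset[(k >> shift) & (kBuckets - 1)]++] = k;
        }
        keys.swap(scratch);  // the newest pass result is always in `keys`
    }
}

// Throughput in elements per second from an element count and a time in ms.
double elements_per_second(double elements, double milliseconds) {
    return elements / (milliseconds / 1000.0);
}
```

With the numbers quoted above, `elements_per_second(500e6, 360.0)` evaluates to roughly 1.39e9, which matches the stated ~1.39B elements/sec.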
