Implement core engine entry point and refactor Python inference by Eamon2009 · Pull Request #43 · Eamon2009/Quadtrix.cpp

Eamon2009 · 2026-05-17T13:17:44Z

# Description

This PR introduces the primary entry point for the QUADTRIX engine in `src/main.cpp`. It establishes a unified workflow that handles model lifecycle management without relying on TorchScript, using our custom internal headers for the model architecture.

# Key Features

- Dual-Mode Execution: Integrated support for both a training loop and an interactive chat mode.
- Infinite Generation: Implemented an unconstrained inference loop for continuous text generation.
- C++ Architecture: Bypasses TorchScript to use custom-defined layers and headers, ensuring direct control over the execution graph.
- Resource Management: Runs on CPU only.

# Description

This PR synchronizes the model interaction logic across both the Python backend utilities and the web frontend. It establishes a consistent way to interface with the model weights and the C++ engine.

## Python Backend (inference.py)

- Goal: Refactor the standalone inference script to support modern weight loading.
- Weight Mapping: Updated to load and map `.pt` files directly using the refactored architecture.
- Chat Mode: Implemented a robust interactive loop for rapid model testing and verification.

## Frontend Layer (frontend/src/api)

- Goal: Establish the bridge between the UI and the Quadtrix engine.
- Service Definition: Created the base API client to handle requests to the C++ backend.
- Dual-Path Logic: Added handlers for both Training control and Inference/Chat endpoints.
- Stream Support: Prepared the API layer to handle "generation" data chunks for real-time UI updates.

## Other merged PRs

#7, #6, #5, #4, #3

## Summary

<img width="2185" height="829" alt="run_20260430_192930" src="https://github.com/user-attachments/assets/420ebbb4-cadf-4408-bc69-fc32ad081c6f" />

## Model Configuration

| Parameter | Value |
|---|---|
| Layers | 6 |
| Heads | 6 |
| Embedding dim | 100 |
| Block size | 190 |
| Batch size | 64 |
| Dropout | 0.2 |
| Learning rate | 3e-4 |
| Total parameters | **10,837,257** |

## Training Details

| Field | Value |
|---|---|
| Steps | 8,000 |
| Eval every | 200 steps |
| Optimizer seed | 1337 |
| Train tokens | 14,080,249 |
| Val tokens | 1,564,473 |
| Precision | bf16 |
| MFU | 60.0% |

## Results

| Metric | Value |
|---|---|
| Best val loss | **2.3918** |
| Final train loss | 2.2825 |
| Total loss drop | 8.57 |
| Peak throughput | 19,602 tok/s |
| Mean throughput | 18,756 tok/s |
| Peak grad norm | 2.2504 |
| Mean grad norm | 1.6894 |
| Training time | **82m 43s** |
| Checkpoint | `best_model.pt` |


## Summary

Introduces a CLI tool to load, index, and align benchmark JSON results from both backends. It displays a side-by-side comparison table showing latency (ms), throughput (tokens/s), and the percentage speedup/slowdown.
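The load, index, align, and compare flow described above could look roughly like the sketch below. It is not the code from this PR. The JSON field names (`name`, `latency_ms`, `tokens_per_s`) and the choice of Python throughput as the baseline for the percentage are assumptions made for illustration.

```python
import json
from typing import Dict, List, Optional


def load_results(path: str) -> List[dict]:
    """Load a list of benchmark rows from a JSON file (layout is assumed)."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def index_by_name(rows: List[dict]) -> Dict[str, dict]:
    """Index rows by the hypothetical "name" field so backends can be aligned."""
    return {row["name"]: row for row in rows}


def percent_change(baseline: float, candidate: float) -> Optional[float]:
    """Percent by which candidate exceeds baseline; negative means slower."""
    if baseline == 0:
        return None  # undefined, reported as "n/a"
    return (candidate - baseline) / baseline * 100.0


def compare(py_rows: List[dict], cpp_rows: List[dict]) -> List[dict]:
    """Align rows present in both backends and compute the throughput change."""
    py, cpp = index_by_name(py_rows), index_by_name(cpp_rows)
    table = []
    for name in sorted(py.keys() & cpp.keys()):
        p, c = py[name], cpp[name]
        table.append({
            "name": name,
            "py_ms": p["latency_ms"],
            "cpp_ms": c["latency_ms"],
            "py_tok_s": p["tokens_per_s"],
            "cpp_tok_s": c["tokens_per_s"],
            "change_pct": percent_change(p["tokens_per_s"], c["tokens_per_s"]),
        })
    return table


def format_table(table: List[dict]) -> str:
    """Render the aligned rows as a fixed-width, side-by-side text table."""
    lines = [
        f"{'benchmark':<20}{'py ms':>10}{'cpp ms':>10}"
        f"{'py tok/s':>12}{'cpp tok/s':>12}{'change %':>10}"
    ]
    for r in table:
        pct = r["change_pct"]
        change = "n/a" if pct is None else f"{pct:+.1f}"
        lines.append(
            f"{r['name']:<20}{r['py_ms']:>10.3f}{r['cpp_ms']:>10.3f}"
            f"{r['py_tok_s']:>12.1f}{r['cpp_tok_s']:>12.1f}{change:>10}"
        )
    return "\n".join(lines)
```

Under these assumptions, `print(format_table(compare(load_results("python.json"), load_results("cpp.json"))))` would print the table; both file names are placeholders. Benchmarks that appear in only one backend are dropped by the alignment step, which keeps every row comparable.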




## Summary

Introduces the primary Python benchmark runner, measuring model metadata, data throughput, forward latency, training-step latency, and autoregressive generation. Includes utility functions for dynamic module loading, timing, and percentile calculation.

## Model Benchmarking

- Latency Profiling: Tracks forward pass, training step, and autoregressive generation latencies.
- Throughput Tracking: Measures tokenizer processing speeds and data throughput.
- Resource Monitoring: Captures model metadata and system memory footprints during runs.

## Math Utilities

- Dynamic Loading: Implements safe runtime module loading via importlib to dynamically interact with engine/inference.py.
- Statistical Metrics: Adds custom mathematical utility functions, including a precise percentile calculator ($P_{50}$, $P_{90}$, $P_{99}$) for latency distribution reporting.
- Standardized Exports: Lays the groundwork for structured JSON and CSV output formatting.
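The three utilities named above (dynamic loading, timing, percentiles) could be sketched as follows. This is a minimal illustration, not the runner's actual code. The function names are hypothetical, and linear interpolation between closest ranks is an assumed percentile definition, since the PR does not say which one it uses.

```python
import importlib.util
import math
import time
from typing import Callable, List, Sequence


def load_module_from_path(name: str, path: str):
    """Load a Python file as a module at runtime (e.g. engine/inference.py)."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module {name!r} from {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def time_calls(fn: Callable[[], object], warmup: int = 2, iters: int = 10) -> List[float]:
    """Run fn warmup times untimed, then return iters latencies in milliseconds."""
    for _ in range(warmup):
        fn()
    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def percentile(samples: Sequence[float], q: float) -> float:
    """Percentile for q in [0, 100], interpolating linearly between ranks."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    if not 0.0 <= q <= 100.0:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def summarize(latencies_ms: Sequence[float]) -> dict:
    """Report the P50/P90/P99 latencies mentioned in the summary."""
    return {f"p{q}": percentile(latencies_ms, q) for q in (50, 90, 99)}
```

A call such as `summarize(time_calls(lambda: model_forward(), warmup=3, iters=50))` would produce the latency distribution for one benchmark; `model_forward` is a placeholder for whatever callable the runner measures.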

Introduces the primary C++ benchmark runner (cpp_benchmark.cpp). It defines the parsing configurations, tracking metrics structures (Stats and BenchRow), and basic time/utility abstractions needed to mirror the Python benchmark suite capabilities.

codeaddict-119 and others added 18 commits May 1, 2026 10:08

updating (#9)

62a0b3d

Add GitHub Packages publish workflow

979212e

Add benchmarking, documentation updates, and GitHub Packages workflow (…

09d841d

…#30)

## Summary

Publish GitHub Package using npm

## Checks

- [ ] C++ build still works
- [ ] Backend changes were smoke-tested locally
- [ ] Frontend build still passes

Revert dependency bump and enhance documentation and licensing (#31)

09232eb

## Summary

C++ benchmarks for performance testing

## Checks

- [ ] C++ build still works
- [ ] Backend changes were smoke-tested locally
- [ ] Frontend build still passes
- [ ] Docs were updated

Revert dependency bump and enhance documentation and licensing (#31) (#…

5ce7cc8

…38)

## Summary

C++ benchmarks for performance testing

## Checks

- [ ] C++ build still works
- [ ] Backend changes were smoke-tested locally
- [ ] Frontend build still passes
- [ ] Docs were updated

Implement core engine entry point and refactor Python inference (#37)

3062917

## Summary

Docs improvement with chat images

## Checks

- C++ build still works
- Backend changes were smoke-tested locally
- Frontend build still passes
- Docs or screenshots were updated if needed

feat: add reference benchmark dimensions for Quadtrix

124bdcf

Introduces configuration for real C++ and Python Quadtrix benchmark runs, including warmup, token generation, and training step dimensions.

add reference benchmark dimensions for Quadtrix (#39)

01ed329

## Summary

Introduces configuration for real C++ and Python Quadtrix benchmark runs, including warmup, token generation, and training step dimensions.

feat: entry point for Python benchmark (#41)

2b94130

suite Introduces a standard entry point script that invokes the core python_benchmark module execution flow.

Entry point for Python benchmark (#41)

21d9654

## Summary

Execution wrapper for the Python runner. Adds a boilerplate compatibility script to handle safe system exits and execution routing for the Python benchmark.
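One way to read "safe system exits and execution routing" is a thin wrapper that turns the runner's return value or exception into a process exit code. The sketch below assumes the runner exposes a `main(argv)` callable. That signature, the helper name `run_safely`, and the specific exit codes are illustrative assumptions, not taken from the PR.

```python
import sys
from typing import Callable, Optional, Sequence


def run_safely(main: Callable[[Sequence[str]], Optional[int]],
               argv: Sequence[str]) -> int:
    """Call main(argv) and map its outcome to a process exit code."""
    try:
        code = main(argv)
    except KeyboardInterrupt:
        # Conventional shell exit code for Ctrl-C (128 + SIGINT).
        print("benchmark interrupted", file=sys.stderr)
        return 130
    except Exception as exc:
        # Report the failure without a traceback and signal a non-zero exit.
        print(f"benchmark failed: {exc}", file=sys.stderr)
        return 1
    # A main() that returns None is treated as success.
    return 0 if code is None else int(code)
```

An entry-point script would then end with something like `sys.exit(run_safely(python_benchmark.main, sys.argv[1:]))`, where `python_benchmark.main` stands in for the real runner function.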

Eamon2009 merged commit 720ffc1 into dependabot/npm_and_yarn/frontend/multi-bb2efd036b May 17, 2026
8 of 9 checks passed

Eamon2009 merged 18 commits into dependabot/npm_and_yarn/frontend/multi-bb2efd036b from master


2 participants
