Machine-learning workflows for rotational transmission error modeling in RV reducers, with a repository structure aimed at reproducible experiments, engineering-oriented documentation, and future physics-informed extensions.
This repository studies rotational transmission error (TE) in RV reducers used in industrial robotics.
The project combines:
- experimental TE datasets from a dedicated test rig;
- structured machine-learning baselines implemented in Python;
- reproducible training, validation, and campaign workflows;
- documentation aimed at both engineering use and future model extension;
- a roadmap toward more structured hybrid models and later full PINN work.
The repository name still reflects the long-term physics-informed direction, but the current implemented surface already includes practical feedforward, harmonic, periodic-feature, residual-harmonic, and tree-based baselines.
Transmission error is a key indicator for reducer accuracy, vibration behavior, and final robot joint positioning quality.
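As a point of reference, rotational TE is commonly defined as the deviation of the measured output angle from the ideal output angle implied by the input angle and the nominal reduction ratio. The sketch below illustrates only that definition. The function names, the sign convention, and the example ratio and angles are assumptions, not values taken from this repository's datasets or code.

```python
def transmission_error_deg(input_angle_deg, output_angle_deg, reduction_ratio):
    """Return TE in degrees: measured output minus ideal output.

    Assumed convention: ideal output = input / reduction_ratio.
    """
    if reduction_ratio == 0:
        raise ValueError("reduction_ratio must be non-zero")
    ideal_output_deg = input_angle_deg / reduction_ratio
    return output_angle_deg - ideal_output_deg


def deg_to_arcsec(angle_deg):
    """Convert degrees to arcseconds, a common unit for reducer TE."""
    return angle_deg * 3600.0


# Hypothetical sample: ratio 81, input turned 810 deg, output measured at 10.002 deg.
te_deg = transmission_error_deg(810.0, 10.002, 81.0)
te_arcsec = deg_to_arcsec(te_deg)  # roughly 7.2 arcsec
```

A TE model in this sense maps operating conditions and angular position to a value like `te_arcsec`, so the sign convention and unit need to be fixed before any two models are compared.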
The goal of this project is not only to fit the data well but to build TE models that are:
- accurate on measured operating conditions;
- interpretable enough for engineering analysis;
- structured enough to support future TwinCAT / PLC-friendly deployment;
- extensible toward hybrid and later physics-informed formulations.
Implemented today:
- validated TE dataset processing and visualization utilities;
- feedforward TE regression training with PyTorch Lightning;
- structured baselines for harmonic regression, periodic-feature MLP, and residual-harmonic MLP workflows;
- tree-based baselines for comparison;
- one-batch validation checks and smoke-test utilities;
- batch campaign execution and artifact tracking;
- styled report generation and PDF validation tooling;
- recovered RCIM original-pipeline documentation and a repository-owned near-literal execution copy under scripts/paper_reimplementation/rcim_ml_compensation/recovered_original_workflow/;
- a faithful RCIM Model-Bank Reproduction exact-model-bank reimplementation under scripts/paper_reimplementation/rcim_ml_compensation/original_dataset_exact_model_bank/, with completed forward/backward paper-reference campaigns, archived per-target Python + ONNX model banks, and populated benchmark tables for RCIM Tables 2-5;
- repository-owned TwinCAT/TestRig video-guide tooling for high-quality transcript extraction, evidence-driven snapshots, and OCR-assisted report synthesis through Google GenAI;
- a repository-owned LAN AI node path for remote LM Studio, faster-whisper, and PaddleOCR integration while keeping repository orchestration on the current workstation;
- explicit LAN OCR compatibility handling for current PaddleOCR versions, with clearer remote-node diagnostics instead of opaque OCR-side 500 crashes;
- repository-owned per-video report generation for analyzed TwinCAT/TestRig video guides;
- a formalized remote-strong TwinCAT/TestRig video-analysis pipeline with tracked reruns, promoted canonical artifacts, and cross-video knowledge synthesis;
- dual NotebookLM source-package tracks for guide-local concept videos and repository-specific project videos;
- repository-owned isolated-mode and Markdown validation tooling;
- a repository-owned remote LAN training campaign launcher that starts approved campaigns from the local terminal, executes the heavy training runtime on the stronger remote workstation, and synchronizes the resulting artifacts back into the canonical local repository state through metadata-aware artifact recovery.
Planned or future work:
- broader sequence-aware models such as lagged-window, GRU, LSTM, and TCN families;
- additional hybrid TE model families;
- export and deployment hardening for production-oriented inference;
- full PINN formulation once the physics residual design is mature enough.
The repository uses three separate naming layers:
| Layer | Purpose |
|---|---|
| RCIM Model-Bank Reproduction | Paper-faithful harmonic model-bank reproduction formerly called Track 1. |
| Model Development Waves | Training and experimental model families, numbered from Wave 1 through Wave 6. |
| TE Curve Verification Pipeline | Offline curve evaluation, diagnostics, selection, visualization, and report generation formerly called Track 2. |
The current model-development sequence is:
- Wave 1: baseline models;
- Waves 2.1-2.3: temporal and harmonic-temporal models;
- Waves 3.1-3.3: residual-offset and curve-aware models;
- Waves 4.1-4.4: robust, probabilistic, mixture-density, and stateful models;
- Wave 5.1: harmonic-prior residual models;
- Wave 5.2: MMT/PINN-guided models;
- Wave 6: integrated multi-task and multi-head models.
The TE Curve Verification Pipeline uses CVP 1.1 through CVP 1.5 for its
curve-first reranking, curve-payload diagnostics, mean-centered decomposition,
offset-and-shape audit, and causal-offset feasibility modules. Historical
artifact IDs and paths containing track1 or track2 remain unchanged for
reproducibility.
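Because historical IDs and paths keep the old track1 and track2 tokens, tooling that labels artifacts may need to translate them into the current layer names. The sketch below shows one way to do that. The layer names come from the table above, while the lookup function and its substring-matching rule are hypothetical and not part of the repository.

```python
# Layer names come from the naming table; the lookup itself is a hypothetical helper.
LEGACY_TRACK_NAMES = {
    "track1": "RCIM Model-Bank Reproduction",
    "track2": "TE Curve Verification Pipeline",
}


def display_layer_for(artifact_id):
    """Return the current layer name for a legacy ID or path, or None if no token matches."""
    lowered = artifact_id.lower()
    for legacy_token, layer_name in LEGACY_TRACK_NAMES.items():
        if legacy_token in lowered:
            return layer_name
    return None


layer = display_layer_for("models/paper_reference/rcim_track1/")
```

The stored path is never rewritten; only the label shown to the reader changes, which keeps old artifact references reproducible.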
The most important folders for a new user are:
- scripts/: Python entry points for training, reporting, and tooling.
- config/: YAML configuration files for datasets, presets, and campaigns.
- data/simplified_dataset/: Legacy validated TE curves retained for compatibility.
- data/original_dataset/: Raw test-rig recordings used to reconstruct preprocessing and provenance.
- data/polished_dataset/: Default direction-separated row-level dataset used by new training.
- output/: Training runs, validation checks, smoke tests, campaigns, and registries.
- doc/: Main human-authored documentation, guides, reports, and technical notes.
- reference/: External reference material and imported codebases kept outside the main canonical workflow.
- models/paper_reference/rcim_track1/: Curated RCIM Model-Bank Reproduction forward/backward paper-reference model archives produced by the faithful exact-model-bank reimplementation.
- reference/video_guides/source_bundle/: Canonical Git-tracked TwinCAT/TestRig video source bundle, with large media files stored through Git LFS.
If you only want to get started, begin with:
- Public Documentation: https://xilab-robotics.github.io/Physics-Informed-Neural-Networks/
Before cloning on Windows, enable Git long-path support from an elevated PowerShell prompt:
git config --system core.longpaths true
Then clone the repository into a reasonably short path such as C:\Work.
conda create -y -n pinns_env python=3.12
conda activate pinns_env
python -m pip install --upgrade pip
python -m pip install torch --index-url https://download.pytorch.org/whl/cu130
python -m pip install -r requirements.txt
The default dataset location is configured in config/datasets/transmission_error_dataset.yaml:
paths:
  dataset_root: data/polished_dataset
dataset:
  name: polished_dataset

Use --dataset simplified_dataset on dataset-aware entry points when a legacy five-feature run must be reproduced.
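The interaction between the configured default and the --dataset override can be sketched as follows. This is a minimal illustration, not the repository's actual entry-point code. The dataset folder names come from this README, while the mapping, the parser, and the function names are hypothetical.

```python
import argparse
from pathlib import Path

# Dataset folders named in this README; the mapping and parser are hypothetical.
DATASET_ROOTS = {
    "polished_dataset": Path("data/polished_dataset"),
    "simplified_dataset": Path("data/simplified_dataset"),
}
DEFAULT_DATASET = "polished_dataset"


def build_parser():
    """Create a parser exposing only the dataset-selection flag."""
    parser = argparse.ArgumentParser(description="Dataset selection sketch")
    parser.add_argument(
        "--dataset",
        choices=sorted(DATASET_ROOTS),
        default=DEFAULT_DATASET,
        help="Dataset name; defaults to the configured polished dataset.",
    )
    return parser


def resolve_dataset_root(argv=None):
    """Return (name, root) for the dataset selected on the command line."""
    args = build_parser().parse_args(argv)
    return args.dataset, DATASET_ROOTS[args.dataset]


# No flag: the default dataset applies. With the flag: the legacy dataset is selected.
default_name, default_root = resolve_dataset_root([])
legacy_name, legacy_root = resolve_dataset_root(["--dataset", "simplified_dataset"])
```

Restricting the flag to known names through `choices` makes a mistyped dataset fail at argument parsing rather than later during data loading.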
For a lightweight verification run:
conda run -n pinns_env python scripts/training/train_feedforward_network.py --config-path config/training/feedforward/presets/trial.yaml --dataset polished_dataset
For the default feedforward baseline:
conda run -n pinns_env python scripts/training/train_feedforward_network.py
Artifacts are written under:
output/training_runs/<model_family>/<run_instance_id>/
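The artifact layout above can be assembled with pathlib as in the sketch below. The directory pattern is taken from this README. The run-instance-ID format (a timestamp followed by a run name) and both helper names are assumptions made for illustration, and the real ID format may differ.

```python
from datetime import datetime
from pathlib import Path

TRAINING_RUNS_ROOT = Path("output/training_runs")


def make_run_instance_id(run_name, now=None):
    """Hypothetical ID format: <timestamp>_<run_name>; the real format may differ."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d-%H-%M-%S")
    return f"{stamp}_{run_name}"


def run_output_dir(model_family, run_instance_id, root=TRAINING_RUNS_ROOT):
    """Build output/training_runs/<model_family>/<run_instance_id>/."""
    return root / model_family / run_instance_id


# A fixed timestamp keeps the example deterministic.
run_id = make_run_instance_id("trial", now=datetime(2026, 4, 2, 14, 40, 15))
run_dir = run_output_dir("feedforward", run_id)
```

Keeping the model family as its own path level means runs from different families never share a directory, even when their run names collide.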
For the best-training feedforward preset:
conda run -n pinns_env python scripts/training/train_feedforward_network.py --config-path config/training/feedforward/presets/best_training.yaml
To run a training campaign:
python scripts/training/run_training_campaign.py
To launch a campaign on the remote LAN workstation from the local terminal:
.\scripts\campaigns\infrastructure\run_remote_training_campaign.ps1 `
-CampaignConfigPathList @("config\training\...\candidate_a.yaml","config\training\...\candidate_b.yaml") `
-CampaignName "remote_example_campaign" `
-PlanningReportPath "doc\reports\campaign_plans\YOUR_PLAN.md"
The recovered original RCIM pipeline is preserved as a near-literal repository-owned copy here:
scripts/paper_reimplementation/rcim_ml_compensation/recovered_original_workflow/
The campaign-ready faithful reimplementation that reproduces RCIM paper
Tables 2-5 on the repository dataset lives here:
scripts/paper_reimplementation/rcim_ml_compensation/original_dataset_exact_model_bank/
Current accepted RCIM Model-Bank Reproduction model archives and benchmark tables live here:
- models/paper_reference/rcim_track1/
- doc/reports/analysis/rcim_paper_reference/RCIM Paper Reference Benchmark.md
RCIM Model-Bank Reproduction is considered closed as the repository-owned paper-faithful full-bank
reproduction surface: both forward and backward grid-search campaigns were
run, the accepted archives were refreshed, and Tables 2-5 were repopulated.
The closure does not claim that every colored benchmark cell is green; later
all-green or restricted-dataset studies must be tracked as separate
optimization/comparison branches.
Wave 1 campaign launchers:
.\scripts\campaigns\wave1\run_wave1_structured_baseline_recovery_campaign.ps1
.\scripts\campaigns\wave1\run_wave1_residual_harmonic_family_campaign.ps1
Markdown validation:
python -B scripts/tooling/markdown/run_markdownlint.py
python -B scripts/tooling/markdown/markdown_style_check.py --fail-on-warning
TwinCAT/TestRig video-guide analysis:
python -B scripts/tooling/video_guides/analyze_video_guides.py
The canonical source media for this workflow now lives under:
reference/video_guides/source_bundle/
To extract knowledge from a single video guide:
python -B scripts/tooling/video_guides/extract_video_guide_knowledge.py --video-filter "Machine_Learning_2" --limit-videos 1
To run the same extraction through the LAN AI node and LM Studio:
python -B scripts/tooling/video_guides/extract_video_guide_knowledge.py --video-filter "Machine_Learning_2" --limit-videos 1 --transcript-provider lan --cleanup-provider lmstudio --report-provider lmstudio --ocr-provider lan
Before using the LAN path, complete the LAN AI Node Server Setup Guide.
For the tracked remote high-quality rerun, use the repository-owned launcher:
.\scripts\tooling\video_guides\run_remote_high_quality_video_rerun.ps1
This launcher processes one video at a time, writes persistent runtime tracking, and stops on the first failing video instead of silently skipping ahead.
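The fail-fast behavior described above can be sketched in Python as follows. This is an illustration of the control flow only, not the launcher's implementation (which is a PowerShell script). The tracking-file layout, the function names, and the simulated failure are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_tracking(tracking, tracking_path):
    """Persist the current state so an interrupted run can be inspected."""
    tracking_path.write_text(json.dumps(tracking, indent=2), encoding="utf-8")


def process_videos(video_names, process_one, tracking_path):
    """Process videos in order, persist status after each, stop on the first failure."""
    tracking = {"completed": [], "failed": None, "pending": list(video_names)}
    save_tracking(tracking, tracking_path)
    for name in video_names:
        tracking["pending"].remove(name)
        try:
            process_one(name)
        except Exception as error:
            # Record the failure and stop; later videos stay pending.
            tracking["failed"] = {"video": name, "error": str(error)}
            save_tracking(tracking, tracking_path)
            return tracking
        tracking["completed"].append(name)
        save_tracking(tracking, tracking_path)
    return tracking


def fake_processor(name):
    """Stand-in for the real per-video analysis; fails on one hypothetical video."""
    if name == "video_b":
        raise RuntimeError("simulated OCR failure")


with tempfile.TemporaryDirectory() as tmp:
    tracking_file = Path(tmp) / "runtime_tracking.json"
    result = process_videos(["video_a", "video_b", "video_c"], fake_processor, tracking_file)
    persisted = json.loads(tracking_file.read_text(encoding="utf-8"))
```

In this run `video_a` completes, `video_b` is recorded as failed, and `video_c` stays pending, so a rerun can resume from a known state instead of guessing which outputs are trustworthy.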
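The remote-strong process and current campaign sum-up are documented in the Remote High-Quality TwinCAT Video Pipeline and Remote High-Quality TwinCAT Video Campaign Sum-Up references listed below.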
The remote node now uses its own dependency file:
python -m pip install -r scripts/tooling/lan_ai/requirements-lan-ai-node.txt
For local-only validation on the current workstation, keep the remote environment variables untouched and add:
[System.Environment]::SetEnvironmentVariable("LM_STUDIO_LOCAL_URL", "http://127.0.0.1:1234", "User")
Then run the workflow with explicit local overrides, for example:
python -B scripts/tooling/video_guides/extract_video_guide_knowledge.py --video-filter "Machine_Learning_2" --limit-videos 1 --transcript-provider lan --cleanup-provider lmstudio --report-provider lmstudio --ocr-provider local --transcript-model tiny --cleanup-model "nvidia/nemotron-3-nano-4b" --report-model "nvidia/nemotron-3-nano-4b" --lan-ai-base-url "http://127.0.0.1:8765" --lm-studio-base-url "$env:LM_STUDIO_LOCAL_URL" --force
If you are opening the repository for the first time, use this reading order:
- Project Usage Guide: Main runnable-workflow reference.
- Documentation Index: Entry point for guides, reports, and technical notes.
- LAN AI Node Server Setup Guide: Full Windows-first setup for the remote LM Studio and LAN AI node workstation.
- Model guides under doc/guide/: Best place to understand the implemented model families at a conceptual level.
- Analysis reports under doc/reports/analysis/: Useful when you want deeper training or model-family interpretation.
Recommended guide entry points:
- Transmission Error Foundations Bundle
- Neural Network Foundations
- Training, Validation, And Testing
- TE Model Curriculum
- FeedForward Network
- Harmonic Regression
- Periodic Feature Network
- Residual Harmonic Network
- feedforward: Point-wise MLP baseline for TE regression.
- harmonic_regression: Structured harmonic baseline with explicit periodic bias.
- periodic_mlp: Hybrid model combining periodic features with neural regression.
- residual_harmonic_mlp: Structured-plus-residual decomposition for TE prediction.
- tree: Tabular baselines for honest comparison against neural approaches.
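The structured families above share one idea: describe the TE curve with a few sine and cosine terms, then let a learned model handle what remains. The sketch below illustrates that idea on a synthetic curve using only the standard library. It is not the repository's implementation. The coefficient estimate is a discrete projection, which matches the least-squares fit only under the stated uniform-sampling assumption, and the curve, harmonic orders, and function names are invented for the example.

```python
import math


def fit_harmonics(angles_rad, values, harmonic_orders):
    """Estimate the mean plus cosine and sine coefficients by discrete projection.

    Equals the least-squares fit only when the angles sample one full
    period uniformly and every order is below half the sample count.
    """
    count = len(values)
    mean = sum(values) / count
    coefficients = {}
    for order in harmonic_orders:
        cos_part = 2.0 / count * sum(v * math.cos(order * a) for a, v in zip(angles_rad, values))
        sin_part = 2.0 / count * sum(v * math.sin(order * a) for a, v in zip(angles_rad, values))
        coefficients[order] = (cos_part, sin_part)
    return mean, coefficients


def predict_harmonics(angle_rad, mean, coefficients):
    """Evaluate the fitted harmonic series at one angle."""
    total = mean
    for order, (cos_part, sin_part) in coefficients.items():
        total += cos_part * math.cos(order * angle_rad) + sin_part * math.sin(order * angle_rad)
    return total


# Synthetic TE-like curve (arbitrary units): offset, 1st and 3rd harmonics, small 7th.
sample_count = 360
angles = [2.0 * math.pi * n / sample_count for n in range(sample_count)]
curve = [
    0.5 + 2.0 * math.sin(a) + 0.8 * math.cos(3 * a) + 0.05 * math.sin(7 * a)
    for a in angles
]

mean, coeffs = fit_harmonics(angles, curve, harmonic_orders=[1, 3])
# The residual is the part a residual-harmonic model would leave to a neural network.
residuals = [y - predict_harmonics(a, mean, coeffs) for a, y in zip(angles, curve)]
max_residual = max(abs(r) for r in residuals)
```

Here the fit recovers the offset and the two modeled harmonics, and the residual contains only the unmodeled seventh harmonic. For measured data with irregular sampling, a general least-squares solver over the same sine and cosine features would be the safer choice.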
The repository separates artifacts by workflow type instead of mixing everything into one flat output root.
Important locations:
- output/training_runs/
- output/validation_checks/
- output/smoke_tests/
- output/training_campaigns/
- output/registries/families/
- output/registries/program/
This keeps run identity, campaign outcomes, and best-result tracking explicit and inspectable.
The repository keeps its canonical human-authored documentation under doc/.
Use the Documentation Index and Project Usage Guide from the reading order above as entry points instead of treating README.md as an internal registry.
The repository-owned Sphinx portal under site/ is now also prepared for
publication through the GitHub Pages workflow in:
.github/workflows/publish-sphinx-pages.yml
The live public portal is available at:
https://xilab-robotics.github.io/Physics-Informed-Neural-Networks/
For the current TwinCAT/TestRig video-analysis stack, the key references are:
- Remote High-Quality TwinCAT Video Pipeline
- Remote High-Quality TwinCAT Video Campaign Sum-Up
- TwinCAT Video Guides Reference
Recent repository-structure decisions are tracked in:
- 2026-04-02-14-40-15_skill_frontmatter_bom_compatibility_fix.md
- 2026-04-02-14-24-24_readme_landing_page_and_registry_separation_rule.md
- 2026-04-02-13-15-32_readme_tooling_lan_ai_documentation_reorganization.md
- 2026-04-02-13-04-40_move_closed_video_rerun_tracking_into_analysis_bundle.md
- 2026-04-02-12-42-24_lan_ai_node_ocr_500_regression_check_and_paddleocr_compatibility_fix.md
The near-term direction of the repository is to strengthen structured TE baselines, keep the training/reporting workflow reliable, and progressively move toward richer hybrid and eventually physics-informed models once the formulation is technically justified.