LLM Systems Engineer — Production Inference · Multi-Agent Orchestration · Edge Deployment
I build production LLM systems end to end, from quantized models running on Jetson edge hardware to multi-agent cloud deployments with tool use, permission gating, and audit trails. I care about systems that are secure, measurable, and actually useful.
Dallas-Fort Worth, TX
Core stack: Python, Rust, PyTorch, CUDA, Docker, LiteLLM
Security-first open-source coding agent. Hand-rolled async ReAct loop with a 4-tier deny-first permission engine, SHA-256 hash-chained audit trail, and 200+ LLM providers via LiteLLM.
- 30+ built-in tools with JSON Schema validation, MCP server/client
- Parallel + speculative tool dispatch, cost budget enforcement
- Self-evolution via LLM-guided mutations, multi-language verify gate
- SWE-bench Lite: 52.2% oracle best-of-5
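The hash-chained audit trail works by linking each log entry to the SHA-256 digest of its predecessor, so any retroactive edit invalidates every later hash. A minimal sketch of the idea (function names and record layout are illustrative, not the project's actual API):

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event, chaining it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    payload = json.dumps({"event": event, "prev_hash": prev_hash}, sort_keys=True)
    log.append({
        "event": event,
        "prev_hash": prev_hash,
        "hash": hashlib.sha256(payload.encode()).hexdigest(),
    })


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; a tampered entry breaks the chain from that point on."""
    prev_hash = GENESIS
    for record in log:
        payload = json.dumps({"event": record["event"], "prev_hash": prev_hash},
                             sort_keys=True)
        if record["prev_hash"] != prev_hash or \
           record["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev_hash = record["hash"]
    return True
```

Canonical JSON serialization (`sort_keys=True`) matters here: without it, the same event could hash differently across runs and false-positive as tampering.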
Autonomous multi-agent personal intelligence system running on NVIDIA Jetson hardware. Fully on-device inference — zero cloud dependencies, privacy-preserving by design.
Multi-agent algorithmic trading pipeline with DeepSeek R1 reasoning at every stage.
- 4-agent pipeline: Technical Analysis → Chief Strategist → Risk Manager → Execution
- Kelly Criterion position sizing, Monte Carlo risk simulation
- Real-time WebSocket market data, paper trading integration
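Kelly Criterion sizing reduces to a one-line formula: for win probability p and win/loss payoff ratio b, the optimal bet fraction is f* = p − (1 − p)/b. A minimal sketch, clamped to [0, 1] since a negative Kelly fraction means "don't take the trade" (the function name is illustrative):

```python
def kelly_fraction(win_prob: float, win_loss_ratio: float) -> float:
    """Classic Kelly: f* = p - (1 - p) / b, clamped to [0, 1]."""
    f = win_prob - (1 - win_prob) / win_loss_ratio
    return max(0.0, min(f, 1.0))
```

For example, a 60% win rate with a 2:1 payoff gives f* = 0.6 − 0.4/2 = 0.4, i.e. 40% of bankroll. In practice most systems trade a fraction of Kelly (half-Kelly or less) to dampen variance.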
Qwen3.5-4B fine-tuned with ORPO for biblical question-answering.
- Hybrid RAG: dense embeddings + keyword search
- Constitutional AI self-critique guardrails for theological accuracy
- Voice pipeline: speech-to-text → LLM → text-to-speech
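Hybrid RAG needs a way to merge the dense-embedding ranking with the keyword ranking. One common, score-free approach is reciprocal rank fusion (RRF); a minimal sketch, assuming each retriever returns an ordered list of document IDs (this is a standard fusion technique, not necessarily the project's exact method):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists: each doc scores sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

Because RRF only consumes ranks, it sidesteps the problem that cosine similarities and BM25 scores live on incomparable scales.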
Comprehensive GPU diagnostic toolkit modeled on NVIDIA DCGM architecture.
- Automated stress testing, memory validation, ECC detection
- Health monitoring for GPU server fleets
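The core of memory validation is a pattern test: write a known bit pattern across a buffer, read it back, and report every offset that miscompares. A minimal CPU-side sketch of the walking-ones variant over a plain bytearray (the real toolkit targets GPU memory; names here are illustrative):

```python
def pattern_test(buf: bytearray, pattern: int) -> list[int]:
    """Fill the buffer with one byte pattern, then return miscomparing offsets."""
    for i in range(len(buf)):
        buf[i] = pattern
    return [i for i, b in enumerate(buf) if b != pattern]


def walking_ones(buf: bytearray) -> dict[int, list[int]]:
    """Run the pattern test with a single 1-bit walking across each byte."""
    return {bit: pattern_test(buf, 1 << bit) for bit in range(8)}
```

Walking a single set bit through every position catches stuck-at faults on individual data lines that an all-zeros or all-ones fill would miss.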
Production-grade ML training infrastructure for single-GPU homelabs.
- Unsloth fp8 quantization, torch.compile graph optimization
- DeepSpeed ZeRO stages, vLLM + lm-eval harness
- Multi-seed reporting for statistically sound results
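Multi-seed reporting boils down to running the same config under several random seeds and publishing mean ± standard deviation instead of a single cherry-picked number. A minimal sketch (the helper name is illustrative):

```python
import statistics


def summarize_runs(scores: list[float]) -> str:
    """Report mean +/- sample stdev across seeds, e.g. '0.800 ± 0.010 (n=3)'."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return f"{mean:.3f} ± {std:.3f} (n={len(scores)})"
```

A reported spread makes it obvious when a "gain" between two training configs is smaller than the seed-to-seed noise.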
SQL + Python ETL pipeline for semiconductor quality analysis.
- Supplier performance scoring with trend detection
- Defect Pareto distributions, yield rate dashboards
- Automated alerting on quality threshold breaches
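A defect Pareto analysis ranks defect types by frequency and isolates the "vital few" categories that account for most failures. A minimal sketch of that cumulative cutoff, assuming defects arrive as a flat list of category labels (the function name and 80% default are illustrative):

```python
from collections import Counter


def defect_pareto(defects: list[str], cutoff: float = 0.8) -> list[str]:
    """Smallest set of defect types whose counts cover `cutoff` of all defects."""
    counts = Counter(defects)
    total = sum(counts.values())
    covered, vital_few = 0, []
    for defect, n in counts.most_common():
        vital_few.append(defect)
        covered += n
        if covered / total >= cutoff:
            break
    return vital_few
```

In a dashboard this list drives where engineering attention goes first: fixing the top two or three categories typically removes the bulk of the yield loss.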
Multi-model ML pipeline predicting tire wear for Tesla vehicles. Random Forest, XGBoost, Neural Network, and Ensemble models with Claude AI analysis for tire longevity insights.
- Simulated driving data with vehicle-specific tire degradation modeling
- GridSearch-tuned Random Forest, XGBoost, and TensorFlow/Keras neural network
- Ensemble averaging across all models for robust predictions
- Claude AI integration for natural language tire wear analysis
Git-backed knowledge wiki — Karpathy's LLM Wiki pattern with LangGraph ingestion pipelines for structured and unstructured content. Full diff history.
📈 Contribution Graph
| Area | Technologies |
|---|---|
| LLMs & Agents | LiteLLM, Claude/GPT/Gemini APIs, Ollama, llama.cpp, RAG, prompt engineering |
| ML Infrastructure | PyTorch, Unsloth, DeepSpeed, vLLM, lm-eval, torch.compile, MLflow |
| Systems | Python, Rust, TypeScript, CUDA, Docker, GitHub Actions |
| Edge / Hardware | NVIDIA Jetson (Orin, Nano), RTX 5070 Ti, multi-GPU inference |
| Data | PostgreSQL, SQL, pandas, SQLAlchemy, ETL pipelines |


