AI researcher and systems engineer with 10+ years shipping production systems at scale. Former Co-Founder & CTO at QWERKY AI, where I distilled 70B-parameter LLMs into 3B-8B hybrid models on 24 H200 GPUs with a pending patent on novel attention architectures. Currently pursuing my MS in Computer Science at Georgia Tech (BS CS, Summa Cum Laude, University of South Carolina). I've led teams of 20+ engineers and shipped 15+ production applications across AI, blockchain, and distributed systems.
- LLM Architecture Research -- Custom CUDA kernels for novel attention mechanisms (pending patent)
- State Space Models -- Contributed the Mamba SSM architecture to Modular's MAX framework in Mojo
- QDistill -- 70B→3B-8B hybrid distillation achieving 4x throughput and 1M token context lengths
- Open Source -- Selective scan, causal conv1d, and RMSNorm kernels in the Modular ecosystem
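For context on the kernels listed above: RMSNorm normalizes activations by their root-mean-square instead of centering and scaling by mean and variance, which is cheaper to fuse into a single GPU kernel. A minimal NumPy sketch of the math (the actual contributions are fused kernels in the Modular ecosystem; this is just an illustrative reference, and the function name is mine):

```python
import numpy as np

def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Scale each row of x by the reciprocal root-mean-square of its last axis."""
    # Mean of squares over the hidden dimension, kept for broadcasting.
    ms = np.mean(x * x, axis=-1, keepdims=True)
    # eps guards against division by zero; weight is a learned per-channel gain.
    return x / np.sqrt(ms + eps) * weight

x = np.random.randn(2, 8).astype(np.float32)
y = rms_norm(x, np.ones(8, dtype=np.float32))
```

With a unit weight, each normalized row has a mean square of ~1, which is the invariant a fused kernel must preserve while reading the input only once.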
Skills: Languages · AI / ML · Infrastructure

Projects: Modular MAX Framework · Pulley · QWERKY AI · key-gen
- Bringing Blazing Fast State Space Models to the Modular MAX Framework -- Feb 2026
- Mother May AI: An Opinion on Geoffrey Hinton's Mother AI -- Sep 2025
- Attention: The Breakthroughs and the Bottlenecks -- Jun 2025
- Incidental Non-Determinism: When AI Surprises You (and Why) -- May 2025
Read more on the QWERKY AI blog →