Production-grade, test-driven encoder-only Transformer built from scratch with clean architecture and modular design.
nlp deep-learning transformers pytorch clean-architecture neural-networks educational from-scratch modular-design test-driven self-attention multi-head-attention transformer-architecture encoder-only
Updated Apr 21, 2026 - Python