MSc Computer Science & AI candidate at Leiden University, building decision-grade AI systems where rigorous research meets reliable software.
I work across machine learning, reinforcement learning, multi-objective optimisation, LLM evaluation, and data engineering. My focus is practical: building reproducible systems that can be measured, stress-tested, and improved rather than prototypes that merely look clever in a notebook.
Based in Leiden, Netherlands. Open to AI/ML Engineering, Applied AI Research, Data & AI Engineering, and graduate opportunities.
Multi-agent systems · Planning · Reinforcement learning
A Capture-the-Flag agent that combines MCTS-inspired offensive planning with rule-based defensive behaviour under partial observability. Built to explore the trade-off between long-horizon search, adversarial reasoning, and robust real-time decisions.
Python · MCTS · Multi-Agent Systems · Game AI · Search
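For readers unfamiliar with the search side, here is a minimal sketch of UCB1 child selection, the standard rule many MCTS variants use to balance exploring untried moves against exploiting promising ones. It is an illustration only, not the agent's actual code: the action names, the statistics, and the exploration constant `c` are all hypothetical, and an "MCTS-inspired" planner may well use a different selection rule.

```python
import math

def ucb1(total_value, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: mean value plus an exploration bonus that shrinks with visits."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, parent_visits):
    """Pick the action with the highest UCB1 score.

    `children` maps an action name to (total_value, visits); hypothetical layout.
    """
    return max(children, key=lambda name: ucb1(*children[name], parent_visits))

# Hypothetical statistics for three offensive options.
children = {"advance": (6.0, 10), "flank": (3.0, 4), "retreat": (0.0, 0)}
first_choice = select_child(children, parent_visits=14)

# Once every option has been visited, the score trades off mean value and uncertainty.
visited = {"advance": (6.0, 10), "flank": (3.0, 4)}
second_choice = select_child(visited, parent_visits=14)
```

With these numbers the unvisited action is chosen first; among the visited ones, the less-explored action with the higher mean value wins.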
Multi-objective optimisation · Logistics · Decision support
An NSGA-II container-loading optimiser that balances unloading order, vessel stability, slot utilisation, and stacking constraints. Produces Pareto-efficient solutions rather than pretending one operational objective matters more than the rest.
Python · NSGA-II · Optimisation · Analytics · Logistics
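To make "Pareto-efficient" concrete, the sketch below filters a set of candidate loading plans down to the non-dominated ones. It covers only the dominance check; full NSGA-II also ranks solutions into successive fronts, applies crowding distance, and evolves the population. The objective tuple (rehandles, instability, unused slot fraction, all minimised) and the numbers are hypothetical, not taken from the project.

```python
from typing import List, Sequence, Tuple

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical plans: (rehandles, instability score, unused slot fraction).
candidates = [
    (3, 0.2, 0.10),
    (2, 0.5, 0.15),
    (4, 0.3, 0.20),  # worse than the first plan on every objective
    (2, 0.5, 0.30),  # ties the second plan twice, loses on utilisation
]
front = pareto_front(candidates)
```

The first two plans survive because each beats the other on at least one objective, which is exactly the trade-off a planner would then choose between.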
LLM evaluation · Quality assurance · Streamlit
A Streamlit evaluation workbench for reviewing Q&A model outputs with editable annotations, issue categories, filtering, and quality metrics. Designed to make model-evaluation feedback structured, inspectable, and useful for iteration.
Python · Streamlit · LLM Evaluation · Data Quality · Human Feedback
Languages: Python, SQL, C++, Java
Machine Learning: PyTorch, TensorFlow, Keras, Scikit-learn, XGBoost, LightGBM
AI & Research: Reinforcement Learning, NLP, LLMs, RAG, Optimisation, Explainable AI, Secure ML
Data & Engineering: Pandas, NumPy, PySpark, Spark, Kafka, ETL/ELT, REST APIs, Docker
Cloud & Tools: AWS, Azure, GCP, Git, GitHub Actions, Streamlit, Flask
Quantum: Qiskit, Cirq, Quantum Algorithms
- More rigorous AI evaluation workflows for LLM and RAG systems: failure taxonomy, human feedback loops, reproducible benchmarks, and useful quality metrics.
- More production-minded AI/data applications: validation, observability, testing, clean interfaces, and deployment-ready project structure.
- Deeper capability in reinforcement learning, multi-objective optimisation, secure AI, and quantum computing, with an emphasis on problems where these methods offer a real advantage rather than decorative complexity.
- Start with the decision or user problem, not the model.
- Measure baselines before claiming improvement.
- Treat reproducibility, testing, and documentation as features.
- Prefer simple systems that can be inspected over complex systems nobody can debug.
- Build for measurable operational value.
Applied AI · AI Engineering · Reinforcement Learning · LLM Evaluation · Optimisation · Decision Intelligence · Data Engineering · Secure & Responsible AI · Quantum Computing
Building AI systems that survive contact with real constraints.

