Exploratory RL Agent
Reinforcement learning research investigating autonomous exploration in complex 3D environments. Custom transformer architectures with attention mechanisms and rigorous comparative analysis.
I'm an AI engineer with a research background in deep reinforcement learning. My Master's thesis at Stellenbosch was on environment discovery in unknown 3D worlds: RL agents that learn how to explore on their own.
A few hands-on projects: a complete DQN in pure C with no dependencies, diffusion models I fine-tuned from a base checkpoint, and smaller training experiments where I wanted to see the internals up close.
What I'm drawn to is frontier research and engineering at AI labs, in areas like model training, alignment, agents, and RL. I want to build AI that's effective, safe, and worth scaling, on problems that have a real shot at mattering.
A complete Deep Q-Network reinforcement learning algorithm built from scratch in pure C. CartPole environment, neural networks, matrix ops, and visualisation — no external dependencies.
Lead the agentic finance automation stack. Build production agents with structured tooling and knowledge bases to automate finance workflows like reconciliations. Also cover the backend architecture, security design, and CI/CD for these agents.
Thesis: Maximising Environment Discovery in Expansive 3D Worlds. Developed reinforcement learning systems and implemented multiple neural network architectures including custom transformer models with attention mechanisms. Supervisor: Prof. Herman Engelbrecht.
Tutored students through tutorials and practicals for the university's computer systems and computer programming modules.
Bootstrapped the front end of an Enterprise Resource Planning system. Worked with an international team on modularity and reusability, and onboarded new recruits.
Honours Thesis: Parallelising Inference in Probabilistic Graphical Models. Achieved a 2.21x speed-up on an 8-core system through an efficient parallel implementation.
Always open to interesting conversations around research, engineering, and AI systems.