AI Engineer · Agent-First Development · Harness Engineering
VinUniversity · Vietnam 🇻🇳
I turn fresh research papers into runnable, verifiable demos.
A series I call Build with Paper — read the paper, ship the demo, prove it runs.
- Build with Paper — I pick a recent arXiv paper, reproduce its core idea, and ship a demo you can actually run and check.
- Focus — LLM routing & deliberation, agentic evaluation (honest oracles), RL post-training (GRPO / self-play), and data engines for Vietnamese.
- Harness engineering — the scaffolding that makes agents reliable, cheap, and verifiable.
- Local-first — iterate on a Mac (MLX / Metal); rent GPUs (H100 / H200) only when weights need to move.
Routing & deliberation
- super-agent — a unified cost-aware router: model × reasoning-depth on one Thompson bandit, drift-aware memory, and a GRPO-internalized Qwen3-4B depth policy.
- System-III-Router — learned deliberation routing from Critique of Agent Model (arXiv:2606.23991); a bandit over reasoning depths (direct → cot → plan+verify), live on Vi-GSM8K.
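For readers new to bandit routing, here is a minimal sketch of Thompson sampling over the three reasoning depths named above. It is an illustration under stated assumptions, not the code in either repo: the `DepthBandit` class, the `simulate` helper, and the per-depth solve rates are all hypothetical, and the reward is a plain solved/unsolved signal with no cost term or drift handling.

```python
import random


class DepthBandit:
    """Beta-Bernoulli Thompson sampling over reasoning depths (illustrative)."""

    def __init__(self, arms, seed=0):
        self.rng = random.Random(seed)
        # One Beta(1, 1) prior per arm, stored as [alpha, beta].
        self.posterior = {arm: [1.0, 1.0] for arm in arms}

    def choose(self):
        # Draw one plausible solve rate per arm and route to the best draw.
        samples = {
            arm: self.rng.betavariate(alpha, beta)
            for arm, (alpha, beta) in self.posterior.items()
        }
        return max(samples, key=samples.get)

    def update(self, arm, solved):
        # A solved task increments alpha, an unsolved one increments beta.
        self.posterior[arm][0 if solved else 1] += 1.0


def simulate(steps=2000, seed=0):
    # Hypothetical solve rates per depth, chosen only for this demo.
    true_rate = {"direct": 0.55, "cot": 0.75, "plan+verify": 0.80}
    bandit = DepthBandit(true_rate, seed=seed)
    env = random.Random(seed + 1)
    pulls = {arm: 0 for arm in true_rate}
    for _ in range(steps):
        arm = bandit.choose()
        bandit.update(arm, env.random() < true_rate[arm])
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(simulate())
```

Running the script prints how often each depth was chosen. A cost-aware variant would subtract a per-depth cost from the reward before updating, which this sketch leaves out.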
Honest evaluation
- rtl-gauntlet — a two-tier honest-evaluation harness for agentic RTL design, backed by a formal oracle.
- agent-memory-lab — a runnable lab measuring invalidation & staleness in agent memory (arXiv:2606.24775).
Data & post-training
- vi-gsm8k-agentic — Vietnamese GSM8K via agentic self-instruct; beats machine-translated data (Qwen3-4B: 81.0% vs 76.5%). → dataset on Hugging Face
- spiced-mini — tiny-scale, runnable demos of SPICED self-play (arXiv:2606.19370) on a Mac M5.
Multi-objective & control
- svh-mol — first public repro of Annealed Stein Variational Hypernetworks (arXiv:2506.06715): one hypernetwork traces the whole Pareto front, plus GRPO-learned annealing and an LLM multi-objective-alignment bridge. Multi-seed verified, $0 on a Mac.
Python · PyTorch · Transformers · TRL / GRPO · vLLM · MLX · Hugging Face · Docker
Agents: Claude Code · custom harnesses · Thompson-bandit routers
GitHub · vuongsky55.cv@gmail.com
Read the paper. Ship the demo. Prove it runs.



