From Tokens to Agents
Rebuilding the Modern LLM Stack Through 34 Engineering Projects
A systematic journey through tokenization, embeddings, attention, transformers, training, inference, long-context systems, MoE architectures, post-training, serving, agents, multimodal AI, interpretability, and complete LLM systems.
Interactive Systems
Live visualizations of core LLM components. These are interactive demonstrations of the concepts explored in each project, not screenshots.
Tokenization Pipeline
Input string as continuous characters
Attention Heatmap
Causal self-attention pattern visualization
Transformer Architecture
Interactive decoder block diagram
Multi-Head Attention
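Computes attention scores between each position and the positions it may attend to (in a causal decoder, itself and all earlier positions). Multiple heads run in parallel and can specialize in different aspects, such as syntax, semantics, or position.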
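For readers who prefer code to diagrams, here is a minimal NumPy sketch of causal multi-head self-attention. It is an illustration under stated assumptions, not the code behind the visualizations: the function name, the weight matrices (w_q, w_k, w_v, w_o), and the toy sizes are all hypothetical, and biases, dropout, and batching are left out.

```python
import numpy as np

def causal_multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Toy causal multi-head self-attention for one sequence.

    x: (seq_len, d_model); each weight matrix: (d_model, d_model).
    Returns the output (seq_len, d_model) and the attention weights
    (num_heads, seq_len, seq_len).
    """
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0, "d_model must divide evenly across heads"
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Scaled dot-product scores per head: (num_heads, seq_len, seq_len)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)

    # Causal mask: position i must not see positions j > i
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Numerically stable softmax over the key dimension
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Weighted sum of values, then merge heads back to (seq_len, d_model)
    merged = (weights @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o, weights

# Hypothetical toy sizes, random weights (assumptions for illustration only)
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 4, 8, 2
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

out, weights = causal_multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(out.shape, weights.shape)  # (4, 8) (2, 4, 4)
```

Each slice weights[h] is the kind of lower-triangular matrix the attention heatmap displays: every row sums to 1, and entries above the diagonal are exactly zero because future positions are masked out.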
The LLM Engineering Roadmap
Featured Engineering Case Studies
Deep dives into the most impactful projects, each representing a critical component of the modern LLM stack.
Research Interests
Areas I am actively exploring and contributing to through implementation, experimentation, and open-source work.
LLM Architecture
Designing more efficient, capable, and interpretable language model architectures. Exploring alternatives to the standard transformer stack including state-space models, linear attention, and mixture-of-experts systems.
Key Papers
Efficient Inference
Optimizing LLM inference through algorithmic improvements (FlashAttention, speculative decoding), system optimizations (continuous batching, KV cache management), and hardware-aware design.
Key Papers
Model Compression
Reducing model size and inference cost through quantization, pruning, knowledge distillation, and architecture search. The focus is on maintaining quality while achieving substantial speedups.
Key Papers
Long Context Systems
Extending transformer context windows to millions of tokens through architectural innovations, memory systems, and position encoding advances. Applications in document analysis, code understanding, and multi-turn conversation.
Key Papers
Agentic AI
Building autonomous systems that can plan, reason, use tools, and interact with environments. Focus on reliability, safety, and capability in multi-step reasoning tasks.
Key Papers
Multimodal Models
Extending language models to understand and generate across vision, audio, and other modalities. Focus on efficient alignment and unified representation learning.
Key Papers
Interpretability
Understanding the internal mechanisms of language models through mechanistic interpretability, feature visualization, and circuit tracing. Goal: making AI systems understandable and auditable.
Key Papers
Open Source AI
Contributing to open, reproducible, and accessible AI research. Building tools, datasets, and models that democratize access to state-of-the-art AI capabilities.
Key Papers