Started Working on My First Language Model
Today marks the beginning of a journey I’ve been dreaming about for a long time: I’ve officially started working on my first language model from scratch.
Rather than jumping straight into large-scale architectures like transformers, I’m starting at the very core: building and understanding the foundations step by step. My plan is to begin with simple N-gram models, then progress through:
- Tokenization & preprocessing
- Probability-based language generation
- Markov chains
- And eventually deeper NLP concepts
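To make the starting point concrete, here is a minimal sketch of the kind of bigram (N=2) model the first steps describe, using only plain Python; the function names and the toy corpus are my own, for illustration:

```python
import random
from collections import defaultdict

def tokenize(text):
    # Naive whitespace tokenization; real preprocessing would also
    # handle punctuation, casing rules, and special tokens.
    return text.lower().split()

def build_bigrams(tokens):
    # Map each word to the list of words observed right after it.
    # Repeats are kept so that more frequent followers are sampled
    # more often, which is the probability-based part.
    model = defaultdict(list)
    for current, nxt in zip(tokens, tokens[1:]):
        model[current].append(nxt)
    return model

def generate(model, start, length=8, seed=None):
    # Walk the chain: pick a random observed follower at each step.
    # This is exactly a Markov chain over words.
    rng = random.Random(seed)
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = model.get(word)
        if not followers:
            break  # dead end: no observed continuation
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)

corpus = "the cat sat on the mat and the cat slept"
model = build_bigrams(tokenize(corpus))
print(generate(model, "the", seed=0))
```

With a corpus this tiny the output is mostly nonsense, which is the expected (and honest) first result; the interesting part is watching it improve as the corpus and the value of N grow.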
Why? Because I don’t just want to use AI models. I want to understand how they work, what makes them "think," and how language itself is mathematically modeled. I want to build something from the ground up and learn every piece along the way.
I’ll be documenting my progress, struggles, and breakthroughs here, sharing both code snippets and the thought process behind them. This isn’t just about building a model; it’s about creating a learning journey in public.
Tech stack: Python, NumPy, and whatever else I need along the way.
Next step: Finish my basic N-gram model and generate my first sentences (however broken they may be!).
Stay tuned; this is just the start.