HeadlinesBriefing.com

Hands-on LLM Workshop From Scratch

Hacker News

A new workshop, angelos-p/llm-from-scratch, offers developers the chance to build their own GPT model from scratch, inspired by Andrej Karpathy's nanoGPT. The project scales down to a ~10M-parameter model that trains on a laptop in under an hour, making it practical for a single workshop session without requiring expensive hardware or prior ML expertise.

The workshop has participants build every component themselves: the tokenizer, the transformer architecture, the training loop, and text generation. It uses character-level tokenization, which is well suited to small datasets such as the Shakespeare corpus, and ships three model configurations ranging from 0.5M to 10M parameters. The code automatically detects available hardware, running on Apple Silicon or NVIDIA GPUs and falling back to the CPU otherwise.
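Character-level tokenization, as described above, needs no external vocabulary: the token set is simply every distinct character in the training text. The sketch below illustrates the idea; the names (`stoi`, `itos`, `encode`, `decode`) follow common nanoGPT-style conventions and are illustrative, not necessarily the repo's actual API.

```python
# Minimal character-level tokenizer sketch (illustrative, not the repo's code).
text = "To be, or not to be"

# Vocabulary = every unique character in the corpus, sorted for determinism.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> integer id
itos = {i: ch for ch, i in stoi.items()}       # integer id -> char

def encode(s: str) -> list[int]:
    """Map a string to a list of token ids."""
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    """Map a list of token ids back to a string."""
    return "".join(itos[i] for i in ids)

# Round-tripping is lossless, since every character maps to exactly one id.
assert decode(encode(text)) == text
```

Because the vocabulary is tiny (tens of symbols rather than tens of thousands of subwords), the embedding table stays small, which is part of what keeps the 0.5M–10M parameter configurations laptop-friendly.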

Participants finish with a working GPT model capable of generating Shakespeare-like text, having written the entire pipeline themselves. The workshop works through each component both conceptually and in code, ending with a competition to train the best "AI poet" using the techniques learned.