A lightweight implementation of the GPT-2 decoder-only transformer architecture, based on the makemore series by Andrej Karpathy.
The main file is gpt.py.
The model was trained on Google Colab; the checkpoint file is model.pt.
gpt.py contains a lot of comments. It is purely for educational purposes and is not intended to be production-ready.
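
For readers new to the term, "decoder-only" means that each token may attend only to itself and to earlier tokens, never to later ones. The sketch below illustrates that causal masking step in plain Python. It is not taken from gpt.py: the function name `causal_attention_weights` and the toy score matrix are assumptions made for illustration, and a real implementation would normally do this with tensor operations.

```python
import math


def causal_attention_weights(scores):
    """Apply a causal mask and a row-wise softmax to a square score matrix.

    scores[i][j] is the raw attention score of query position i for key
    position j. (Hypothetical helper for illustration, not part of gpt.py.)
    """
    n = len(scores)
    weights = []
    for i in range(n):
        # Position i may only look at positions 0..i (itself and the past).
        visible = scores[i][: i + 1]
        # Subtract the max for numerical stability before exponentiating.
        m = max(visible)
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        # Future positions get weight 0, the same effect as masking with -inf.
        weights.append([e / total for e in exps] + [0.0] * (n - i - 1))
    return weights


if __name__ == "__main__":
    # Toy example: equal scores everywhere, so each row averages over its past.
    toy_scores = [[0.0] * 4 for _ in range(4)]
    for row in causal_attention_weights(toy_scores):
        print([round(w, 2) for w in row])
```

Each printed row sums to 1 and is zero to the right of the diagonal, which is why such a model can be trained to predict the next token without seeing it.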