I have implemented a simple GPT (decoder-only transformer) model using PyTorch and trained it on a toy dataset of the GITA. The model is trained to predict the next word in a sentence given the previous words.
- `toygpt.py`: Contains the implementation of the GPT model and the training loop (a rough sketch of such a model is shown after this list)
- `gita.txt`: Contains the toy GITA dataset
- `toygpt_nb.ipynb`: Jupyter notebook containing the code to train and evaluate the model (I trained it on Google Colab, as I am GPU poor lol)
- `toygpt_gita_output.txt`: Contains the text generated by the model
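For reference, a minimal decoder-only model could look roughly like the sketch below. This is only an illustrative sketch, not the exact code in `toygpt.py`: the class name `ToyGPT`, the use of `nn.TransformerEncoderLayer` with a causal mask, and the embedding/head sizes are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyGPT(nn.Module):
    """Minimal decoder-only transformer: token + position embeddings,
    a stack of causally masked self-attention blocks, and an LM head.
    (Illustrative sketch; toygpt.py may be structured differently.)"""
    def __init__(self, vocab_size, block_size=32, n_embd=64, n_head=4, n_layer=4):
        super().__init__()
        self.block_size = block_size
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)
        self.blocks = nn.ModuleList([
            nn.TransformerEncoderLayer(n_embd, n_head, 4 * n_embd,
                                       batch_first=True, norm_first=True)
            for _ in range(n_layer)
        ])
        self.ln_f = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # causal mask so each position only attends to earlier positions
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(idx.device)
        for block in self.blocks:
            x = block(x, src_mask=mask)
        logits = self.head(self.ln_f(x))
        loss = None
        if targets is not None:
            loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
        return logits, loss

    @torch.no_grad()
    def generate(self, idx, max_new_tokens):
        # autoregressive sampling: keep feeding the last block_size tokens back in
        for _ in range(max_new_tokens):
            idx_cond = idx[:, -self.block_size:]
            logits, _ = self(idx_cond)
            probs = F.softmax(logits[:, -1, :], dim=-1)
            idx_next = torch.multinomial(probs, num_samples=1)
            idx = torch.cat([idx, idx_next], dim=1)
        return idx
```

Calling `generate` on a trained model with a short prompt of token ids is what produces the kind of output saved in `toygpt_gita_output.txt`.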
- I have trained a decoder-only transformer for this task.
- The model is trained on a toy dataset of the GITA.
- The model is trained for 10,000 epochs with a batch size of 16 and a block size of 32 (a rough training-loop sketch follows this list).
- The model is able to generate coherent text that resembles the training data, but it is not able to generate meaningful text beyond the training data.
- It is a toy model and is not trained on a large dataset.
- The model can be further improved by training on a larger dataset and using a more sophisticated language modeling objective.
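To make the bullet points above concrete, here is a rough sketch of a character-level data pipeline and training loop using the stated batch size, block size, and number of training steps. The `get_batch` helper, the AdamW learning rate, and the character-level encoding are assumptions for illustration and may differ from what `toygpt.py` actually does.

```python
import torch

# Hypothetical character-level pipeline over the GITA text.
with open("gita.txt", "r", encoding="utf-8") as f:
    text = f.read()
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
data = torch.tensor([stoi[ch] for ch in text], dtype=torch.long)

batch_size, block_size = 16, 32
max_iters = 10_000  # the 10,000 "epochs" above, treated here as training steps

def get_batch():
    # sample random windows of block_size tokens; targets are shifted by one
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    return x, y

model = ToyGPT(vocab_size=len(chars), block_size=block_size)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(max_iters):
    xb, yb = get_batch()
    _, loss = model(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
```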
Here is an image that shows the input (the text the model is trained on) and the output (the text generated by the model):