Implemented Architectures
- GPT2
- BERT
- Llama
This repository is structured as follows:
- Data folder: small text files for simple training runs
- GPT2 folder: training script, model, and text sampler
- BERT folder: BERT model and BERT modules
- Llama folder: Llama model with GQA (grouped-query attention) and RoPE (rotary position embedding) modules
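As a taste of what the Llama modules cover, here is a minimal, hypothetical sketch of rotary position embeddings (RoPE) in NumPy. It is not the repository's actual implementation; it uses the "rotate-half" convention, where channel `i` is paired with channel `i + dim/2` and each pair is rotated by a position-dependent angle.

```python
import numpy as np

def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    Each channel pair (i, i + dim/2) is rotated by an angle that depends on
    the token position and the pair index, encoding position information
    directly into the query/key vectors.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # One frequency per channel pair, decaying geometrically with pair index.
    freqs = base ** (-np.arange(half) / half)      # (half,)
    angles = np.outer(np.arange(seq_len), freqs)   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2D rotation of each (x1, x2) channel pair.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

# Rotations preserve norms, so RoPE changes directions, not magnitudes;
# position 0 gets angle 0 and is left unchanged.
x = np.random.default_rng(0).standard_normal((4, 8))
out = rope(x)
print(np.allclose(np.linalg.norm(out, axis=-1), np.linalg.norm(x, axis=-1)))
```

Because relative angles between positions are what survive in the attention dot product, RoPE encodes relative position without a separate embedding table.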
Goals
- Implement and experiment with different model architectures.
- Develop foundational components for future research in transformers.
- Provide clean and modular code.
References
- GPT-2 paper (https://paperswithcode.com/paper/language-models-are-unsupervised-multitask)
- BERT paper (https://arxiv.org/abs/1810.04805)
- Llama 2 paper (https://arxiv.org/abs/2307.09288)
Future Work
- Implement more LLM architectures.
Feel free to contribute to this repository or suggest improvements.