YaRN

YaRN: Efficient Context Window Extension of Large Language Models

This repo contains the code and data for the YaRN context window extension method.

The paper is awaiting its arXiv announcement; a citation will be added here once it is available.

Models

We publish 7B and 13B variants of LLaMA 2 fine-tuned with YaRN for 64K and 128K context windows. They are available under the LLaMA 2 license on 🤗 Hugging Face.

Size	Context	Link
7B	64K	NousResearch/Yarn-Llama-2-7b-64k
7B	128K	NousResearch/Yarn-Llama-2-7b-128k
13B	64K	NousResearch/Yarn-Llama-2-13b-64k
13B	128K	NousResearch/Yarn-Llama-2-13b-128k
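
As a minimal sketch of how one of these checkpoints might be selected and loaded, the snippet below maps a size and context length from the table to its Hugging Face repository name and loads it with the transformers library. The helper names (yarn_model_id, context_tokens, extension_factor, load_yarn_model) are hypothetical and not part of this repo. Passing trust_remote_code=True is an assumption that the checkpoints ship custom modeling code, and the 4096-token base window is that of the original LLaMA 2.

```python
# Hypothetical helpers (not part of this repo) for picking a published checkpoint.
LLAMA2_BASE_CONTEXT = 4096  # original LLaMA 2 context window, in tokens

# Size/context pairs listed in the table above.
PUBLISHED = {("7b", "64k"), ("7b", "128k"), ("13b", "64k"), ("13b", "128k")}


def yarn_model_id(size: str, context: str) -> str:
    """Return the Hugging Face repo name for a size/context pair from the table."""
    key = (size.lower(), context.lower())
    if key not in PUBLISHED:
        raise ValueError(f"no published checkpoint for {size} at {context}")
    return f"NousResearch/Yarn-Llama-2-{key[0]}-{key[1]}"


def context_tokens(context: str) -> int:
    """Convert a label such as '64K' to a token count (1K = 1024 tokens)."""
    return int(context.lower().rstrip("k")) * 1024


def extension_factor(context: str) -> int:
    """Ratio of the extended context window to the LLaMA 2 base window."""
    return context_tokens(context) // LLAMA2_BASE_CONTEXT


def load_yarn_model(size: str, context: str):
    """Download and load a checkpoint; needs the transformers and torch packages."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # lazy import

    model_id = yarn_model_id(size, context)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Assumption: the checkpoints ship custom modeling code, hence trust_remote_code.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", trust_remote_code=True
    )
    return tokenizer, model
```

Calling load_yarn_model("7B", "64K") would download the full weights, so it is best tried on a machine with enough disk space and memory. The other helpers are pure functions and can be used on their own, for example to check that the 64K models extend the base window by a factor of 16.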

Reproduction

We strongly believe in open science, and thus publish all code and data needed to reproduce the results in our paper. To reproduce them, clone the repository and perform a local installation.

git clone https://github.com/jquesnelle/yarn
cd yarn
pip install -e .

Training

To train the models, first run accelerate config and enable DeepSpeed acceleration. The configuration file used for training was deepspeed/zero3.json. Then run the training script.

./train.sh
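
For orientation, a DeepSpeed ZeRO stage 3 configuration partitions optimizer state, gradients and parameters across GPUs. The sketch below builds a minimal configuration of that kind and writes it out as JSON. It is an illustrative assumption, not the contents of deepspeed/zero3.json: the function names are hypothetical, and the bf16 setting, batch size and accumulation steps are placeholder values.

```python
import json


def minimal_zero3_config(micro_batch_size: int = 1, grad_accum_steps: int = 1) -> dict:
    """Build an illustrative ZeRO stage 3 config (not the repo's actual file)."""
    return {
        "zero_optimization": {
            "stage": 3,  # partition optimizer state, gradients and parameters
            "overlap_comm": True,
            "contiguous_gradients": True,
            # Gather full 16-bit weights when saving so the checkpoint is usable
            # without the DeepSpeed partition layout.
            "stage3_gather_16bit_weights_on_model_save": True,
        },
        "bf16": {"enabled": True},  # placeholder: assumes bf16-capable hardware
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
    }


def write_config(path: str, config: dict) -> None:
    """Serialize the config dictionary to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
```

To reproduce the paper's results, use the configuration file shipped in the repository rather than this sketch.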

The tokenized training data is available on Hugging Face and was derived from the pg19 dataset.

Evaluation

To reproduce the evaluations, install lm-evaluation-harness with pip install git+https://github.com/EleutherAI/lm-evaluation-harness and then run the two provided scripts.

./eval.sh
./eval-harness.sh

