-
Understand and distinguish these concepts (a toy contrast of the first two is sketched after this list):
- Sparsification
- Pruning
- Quantization
- Distillation
- MoEfication
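
To make the pruning-vs-quantization distinction concrete, here is a minimal, purely illustrative sketch (not a deliverable): magnitude pruning zeroes out small weights while keeping full precision, whereas quantization keeps every weight but stores it at lower precision. All numbers and thresholds here are arbitrary toy choices.

```python
# Toy illustration only: contrast pruning (zero out small weights, keep
# precision) with quantization (keep all weights, reduce precision).
import torch

torch.manual_seed(0)
w = torch.randn(4, 4)

# Magnitude pruning / sparsification: zero roughly the smallest 50% of weights.
threshold = w.abs().median()
w_pruned = torch.where(w.abs() >= threshold, w, torch.zeros_like(w))

# Quantization: map each float32 weight to an int8 code plus a shared scale.
scale = w.abs().max() / 127
w_quantized = torch.round(w / scale).to(torch.int8)

print("pruned fraction of zeros:", (w_pruned == 0).float().mean().item())
print("quantized dtype:", w_quantized.dtype)
```

Distillation (training a smaller student model to mimic a larger teacher) and MoEfication (splitting feed-forward layers into sparsely activated experts) operate at the training or architecture level rather than on individual weight tensors, so they are not shown here.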
-
Choose your models. Pick 3 models, 1 from each category; each pick should have more than 1B parameters before pruning (a parameter-counting sketch follows the list below).
- Encoder-only
- Decoder-only
- Encoder-Decoder
You can find info about model sizes at https://openbmb.github.io/BMList/list/. You may use Hugging Face or any other model hub you see fit.
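
A minimal sketch for sanity-checking parameter counts before committing to a pick, assuming the `transformers` library; the model names below are placeholder examples only, so verify that each of your actual picks exceeds 1B parameters.

```python
# Placeholder model names only; substitute your own picks and verify
# that each exceeds 1B parameters.
from transformers import AutoModel

candidates = {
    "encoder-only": "facebook/xlm-roberta-xl",  # example
    "decoder-only": "gpt2-xl",                  # example
    "encoder-decoder": "t5-3b",                 # example
}

for category, name in candidates.items():
    model = AutoModel.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{category}: {name} ~ {n_params / 1e9:.2f}B parameters")
```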
-
Devise approaches to assess the sparsity structure of your chosen models and answer these questions (a measurement sketch follows this list):
- what fraction of parameters is >> 0? overall? by layer?
- how does this vary by layer?
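
One possible way to measure this, assuming PyTorch and `transformers`; the model name and the near-zero tolerance `eps` are assumptions you would replace with (and justify as) your own choices.

```python
# Count the fraction of effectively nonzero weights, per parameter tensor
# and overall. The model name and `eps` are placeholders.
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("gpt2")  # placeholder
eps = 1e-6  # treat |w| <= eps as effectively zero

total = nonzero = 0
for name, param in model.named_parameters():
    p_nonzero = (param.detach().abs() > eps).sum().item()
    print(f"{name}: {p_nonzero / param.numel():.4f} fraction nonzero")
    total += param.numel()
    nonzero += p_nonzero

print(f"overall: {nonzero / total:.4f} fraction nonzero")
```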
-
Produce sparsified versions of your models at 10%, 50%, 90%, 95%, and 99% sparsity, either by coding your own methods or by using the existing tools provided below. Explain the nature of your methods, regardless of whether you code them yourselves.
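
As one example of such a method, the sketch below applies unstructured global magnitude (L1) pruning with `torch.nn.utils.prune`; the model name is a placeholder and this is only one of several valid approaches.

```python
# One possible method: unstructured global magnitude (L1) pruning of Linear
# layers with torch.nn.utils.prune. The model name is a placeholder.
import copy
import torch.nn as nn
import torch.nn.utils.prune as prune
from transformers import AutoModel

base = AutoModel.from_pretrained("bert-base-uncased")  # placeholder

for sparsity in [0.10, 0.50, 0.90, 0.95, 0.99]:
    model = copy.deepcopy(base)
    params_to_prune = [
        (m, "weight") for m in model.modules() if isinstance(m, nn.Linear)
    ]
    # Zero the `sparsity` fraction of Linear weights with the smallest
    # magnitude, ranked globally across all selected layers.
    prune.global_unstructured(
        params_to_prune, pruning_method=prune.L1Unstructured, amount=sparsity
    )
    # Bake the masks into the weights so the model can be saved normally.
    for module, name in params_to_prune:
        prune.remove(module, name)
    model.save_pretrained(f"sparsified-{int(sparsity * 100)}")
```

Note that this prunes only `nn.Linear` weights; depending on the architecture, you may also want to include (or deliberately exclude) embeddings and other weight matrices, and you should explain that choice in your report.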
-
Find 2 common benchmarks used by your models by reviewing their publications.
Set them up and obtain baseline results for the original models.
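
As an illustration of obtaining one baseline number, the sketch below computes a rough WikiText-2 perplexity over non-overlapping windows; the model name is a placeholder, and your actual benchmarks should be the ones used in the models' publications.

```python
# Rough perplexity baseline on WikiText-2 (illustration only; replace with
# the benchmarks reported in your models' papers).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; substitute your decoder-only pick
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# Concatenate the test split into one long string and tokenize it once.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

# Average negative log-likelihood over non-overlapping windows.
window = 1024
nlls = []
for begin in range(0, input_ids.size(1), window):
    chunk = input_ids[:, begin : begin + window]
    if chunk.size(1) < 2:
        continue
    with torch.no_grad():
        nlls.append(model(chunk, labels=chunk).loss)

print("perplexity:", torch.exp(torch.stack(nlls).mean()).item())
```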
Compare performance of your sparsified versions with the baselines. Include plots and explanations.
-
Compare model size and runtime for the sparsified models against the originals. Include plots and explanations.
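
A sketch of the size and runtime comparison, assuming the checkpoints were saved with `save_pretrained`; the checkpoint paths, tokenizer name, and prompt are all placeholders.

```python
# Compare on-disk size and wall-clock forward-pass time across checkpoints.
# Paths and the prompt are placeholders.
import os
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def dir_size_mb(path):
    """Total size of all files under `path`, in MB."""
    return sum(
        os.path.getsize(os.path.join(root, f))
        for root, _, files in os.walk(path)
        for f in files
    ) / 1e6

def mean_latency_ms(model, inputs, n_runs=20):
    """Average forward-pass time over `n_runs` runs, in milliseconds."""
    with torch.no_grad():
        model(**inputs)  # warm-up
        start = time.perf_counter()
        for _ in range(n_runs):
            model(**inputs)
    return (time.perf_counter() - start) / n_runs * 1000

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder base model
inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")

for path in ["baseline-checkpoint", "sparsified-90"]:  # placeholder checkpoint dirs
    model = AutoModelForCausalLM.from_pretrained(path).eval()
    print(f"{path}: {dir_size_mb(path):.0f} MB, {mean_latency_ms(model, inputs):.1f} ms/forward")
```

Keep in mind that unstructured pruning stores zeros in a dense checkpoint, so on-disk size and dense-matmul runtime may barely change unless you convert to a sparse storage format or use structured sparsity; this is worth discussing alongside your plots.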
-
Explain the challenges of sparsification on LLMs.
- Due: Nov 9th, 12 PM CST
- Fork your public GitHub repository and change the repo name to llm-sparsification-<cnetid>
- we will look out for the following files:
  - report.md
  - src/*
  - requirements.txt for pip, or environment.yml for conda
  - any Jupyter notebooks