See: https://bonany.cc/attentionlego/
Implementation details are in the arXiv paper arXiv:2401.11459, cited below.
@misc{cong2024attentionlego,
  title={AttentionLego: An Open-Source Building Block For Spatially-Scalable Large Language Model Accelerator With Processing-In-Memory Technology},
  author={Rongqing Cong and Wenyang He and Mingxuan Li and Bangning Luo and Zebin Yang and Yuchao Yang and Ru Huang and Bonan Yan},
  year={2024},
  eprint={2401.11459},
  archivePrefix={arXiv},
  primaryClass={cs.AR}
}