GLUE benchmark code based on bert4keras
Updated Jan 30, 2022 - Python
8000
Effective Knowledge Distillation Generalization for Language Models
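This project explores and implements efficient sequence-modeling methods that go beyond the traditional Transformer architecture, focusing on newer architectures such as the state space model (SSM) Mamba and linear attention mechanisms. Built on the PyTorch framework, it designs and implements from scratch a complete pipeline for model training, evaluation, logging, and visualization, and adapts that pipeline to the GLUE Benchmark and LRA.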