GitHub - paperwave/nGPT-pytorch: Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI

nGPT (normalized GPT) - Pytorch

Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI. The question is whether there is any loss of expressivity they swept under the rug, but I'll take it with good faith.

Citations

@inproceedings{Loshchilov2024nGPTNT,
    title   = {nGPT: Normalized Transformer with Representation Learning on the Hypersphere},
    author  = {Ilya Loshchilov and Cheng-Ping Hsieh and Simeng Sun and Boris Ginsburg},
    year    = {2024},
    url     = {https://api.semanticscholar.org/CorpusID:273026160}
}

@article{Luo2017CosineNU,
    title     = {Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks},
    author    = {Chunjie Luo and Jianfeng Zhan and Lei Wang and Qiang Yang},
    journal   = {ArXiv},
    year      = {2017},
    volume    = {abs/1702.05870},
    url       = {https://api.semanticscholar.org/CorpusID:1505432}
}

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.github/workflows		.github/workflows
nGPT_pytorch		nGPT_pytorch
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

nGPT (normalized GPT) - Pytorch

Citations

About

Uh oh!

Releases

Packages

Languages

License

paperwave/nGPT-pytorch

Folders and files

Latest commit

History

Repository files navigation

nGPT (normalized GPT) - Pytorch

Citations

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages