Tags: entn-at/transformer-deploy
Add QAT support to more models (ELS-RD#29)

* first version of QDQ monkey patching
* add Albert, Electra and Distilbert QAT support
* add QDQDeberta V1
* fix distilbert
* add ast patch add quant onnx export
* simplify quantization process
* fix qdq deberta
* quantization refactoring
* add documentation add quantization tests add deberta v2
* add quant of layernorm refactor ast modif add tests
* add operator name in quantizer name update notebook
* update notebook
* update notebook

Support GPU INT-8 quantization (ELS-RD#15)

* support quantization fix some stupid bugs use opset 13 (onnx)
* add quantization demo
* add dependency
* qdqroberta
* update quantization notebook
* update quantization notebook
* update quantization notebook
* bump VERSION
* delete old script
* cleaning
* fix ORT to 1.9.0, 1.10.0 seems to be bugged
* modify text
* update tuto
* update tuto
* update tuto

Switch from PoC to library (ELS-RD#3)

* first python tensorrt commit
* better tensorrt code
* fix typing
* change imports
* better typing
* fix fp16 issues
* improve benchmarks
* fix memory issue with Pytorch
* generate configurations
* fix tests
* fix code
* remove old file
* fix tokenizer
* move logic in folder creation
* fix issue tokenizer + tensorrt
* update code
* delete
* reformat code
* small refactoring
* add calibrator
* fix multi profile bug
* switch code to python package
* add colored logs
* update README.md
* action
* fix format
* fix github action
* removed pycuda
* move dependency
* add docker github action
* fix
* fix paths
* update README.md add new demo README.md fix demo scripts update Docker image version
* fix format
* update documentation
* add licence
* add badge
* add emoji
* add image
* reformat image
* add measures
* update README.md
* update README.md
* update README.md
* update documentation
* update documentation