The TFLite format supports FP16 quantization, which is known to reduce application size. Unlike INT8 and INT16 quantization, FP16 does not require calibration.
However, Qualcomm's documentation does not provide an example of how to optimize a model for FP16.
Could you suggest how to convert a PyTorch model to an FP16 TFLite model? Is the path pytorch -> onnx -> fp16 onnx -> tflite correct, or is there another way?
Or is FP16 quantization simply not recommended?
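For context, one plausible route is to keep the model in FP32 through the ONNX and SavedModel stages and apply FP16 post-training quantization only at the TFLite conversion step, since the TFLite converter handles the FP16 cast itself. The sketch below assumes the `onnx2tf` package for the ONNX-to-SavedModel step and uses a placeholder torchvision model and file paths; none of this comes from Qualcomm's documentation.

```python
# Sketch: PyTorch -> ONNX -> TensorFlow SavedModel -> FP16 TFLite.
# The onnx2tf package and the model/paths below are assumptions for illustration.
import torch
import torchvision
import onnx2tf
import tensorflow as tf

# 1. Export the PyTorch model to ONNX (staying in FP32 here is fine).
model = torchvision.models.mobilenet_v2(weights="DEFAULT").eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "model.onnx", opset_version=13)

# 2. Convert the ONNX model to a TensorFlow SavedModel.
onnx2tf.convert(input_onnx_file_path="model.onnx",
                output_folder_path="saved_model")

# 3. Apply FP16 post-training quantization during TFLite conversion;
#    unlike INT8/INT16, no calibration dataset is needed.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
with open("model_fp16.tflite", "wb") as f:
    f.write(converter.convert())
```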
mestrona-3 added the question label on Feb 26, 2025.