
Releases: quic/ai-hub-models

v0.30.2

05 Jun 16:17
  • Add option --fetch-static-assets to export scripts.
    • This fetches static assets from Hugging Face (pinned to the qai-hub-models version) instead of compiling through AI Hub. Example:
      python -m qai_hub_models.models.ddrnet23_slim.export --fetch-static-assets --output-dir ddrnet
      
  • Fix a bug that caused --eval-mode on-device to fail.
  • Changes in DDRnet23-Slim (ddrnet23_slim):
    • Add CityScapes evaluation
    • Change input shape to 1024x2048 (now consistent with model card)
  • Add Carvana evaluation dataset, used by:
    • Unet-Segmentation (unet_segmentation)
  • New variants added:
    • VIT (vit) / w8a16 / ORT
    • Segformer_Base (segformer_base) / w8a16 / ORT
  • Models that were temporarily removed and have been re-instated:
    • Riffusion (riffusion)
    • ControlNet (controlnet)
  • The following variants have been removed due to severe numerical accuracy issues:
    • Beit (beit) / w8a16 / ORT
    • Depth-Anything (depth_anything) / w8a16 / ORT
    • Depth-Anything-V2 (depth_anything_v2) / w8a16 / ORT
    • EfficientNet-B4 (efficientnet_b4) / w8a16 / ORT
    • MobileNet-v3-Small (mobilenet_v3_small) / w8a16 / ORT
    • EfficientViT-l2-seg (efficientvit_l2_seg) / float / ORT
    • Posenet-Mobilenet (posenet_mobilenet) / w8a8 / QNN
    • Posenet-Mobilenet (posenet_mobilenet) / w8a8 / TFLite

v0.29.1

20 May 20:56
  • Critical fixes to the export scripts for several LLMs / Gen AI models:
    • Llama 3.0
    • Llama 3.1
    • Llama 3.2
    • Baichuan 2 7B
    • Mistral 7B v0.3
    • Qwen 2 7B
    • Riffusion
    • ControlNet

v0.29

19 May 21:17
  • Llama 3 changes:
    • As part of a refactor, these models were re-quantized; outputs may differ slightly, but quality is not degraded.
    • Model names changed from _chat to _instruct (these models have always been the "Instruct" versions, so the names now reflect that correctly):
      • llama_v3_8b_chat -> llama_v3_8b_instruct
      • llama_v3_1_8b_chat -> llama_v3_1_8b_instruct
      • llama_v3_2_3b_chat -> llama_v3_2_3b_instruct
  • Whisper small enabled on ONNX

v0.28.2

09 May 19:41
  • Add streaming support for the Whisper demo.
  • Add an ONNX session wrapper that makes the session interoperable with PyTorch-based pipelines.
  • Fix an issue in the Llama script where profiling would fail.
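The ONNX session wrapper mentioned above can be pictured as the pattern below. This is a minimal illustrative sketch, not the repo's actual API: the class name `TorchStyleSession` is hypothetical, and `session` is assumed to follow onnxruntime's `InferenceSession` interface (`get_inputs()` and `run()`).

```python
import numpy as np

class TorchStyleSession:
    """Hypothetical sketch: wrap an ONNX Runtime-style session so it can be
    called like a PyTorch module with positional tensor/array inputs."""

    def __init__(self, session):
        self.session = session
        # Cache input names once; onnxruntime exposes them via get_inputs().
        self.input_names = [inp.name for inp in session.get_inputs()]

    def __call__(self, *args):
        # Convert each positional input to a numpy array and feed it by name.
        feeds = {name: np.asarray(a) for name, a in zip(self.input_names, args)}
        outputs = self.session.run(None, feeds)
        # Mirror PyTorch's convention: a single output is returned bare.
        return outputs[0] if len(outputs) == 1 else tuple(outputs)
```

A wrapper like this lets a PyTorch-based demo pipeline swap an ONNX session in wherever it previously called a `torch.nn.Module`.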

v0.28.1

08 May 18:08

New additions:
- Added quantization + evaluation pipelines for the following models:
  - yolov8_seg
  - yolov11_seg
  - segformer_base
  - mobile_vit
  - mask2former
  - conditional/deformable detr
- New model: FOMM (fomm)
- Allow specifying qnn_context_binary as a target runtime in export
- Fixed the export scripts for 5 LLMs:
  - baichuan2_7b
  - controlnet
  - mistral_7b_instruct_v0_3
  - qwen2_7b_instruct
  - riffusion
- The website now shows quantized and unquantized performance info for a single model on the same page

Bug fixes and improvements:
- Removed the aimet-torch dependency entirely; all models now use a quantize job or aimet-onnx. As a result, no models are constrained to Linux.
- Allow variable input size for yolov7
- Updated performance numbers for all models
- Moved several quantized models to w8a16 due to accuracy issues with w8a8
- Fixed the Whisper demo to run locally on X Elite
- Deleted quantized model folders. Use --quantize in the un-quantized model's export script to create a quantized model.

v0.27.1

25 Apr 15:09

Updated YOLOv7 export to support varying the image input shape.

v0.27

09 Apr 21:42

Models

  • Added MobileSAM
  • Removed YoloNAS and OpenPose

Llama Python Demo

  • Added GPU Support
  • Allow loading prompt from a file
  • Allow passing "raw" prompt to model instead of prepending / appending prompt helper tags
  • Improved top-k/top-p sampling to prevent poor generation results.
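For context on the sampling change above, combined top-k/top-p (nucleus) sampling typically works as sketched below. This is a generic illustration, not the demo's actual code; the function name and defaults are assumptions.

```python
import numpy as np

def sample_top_k_top_p(logits, k=50, p=0.9, rng=None):
    """Illustrative top-k/top-p sampling (not the repo's implementation).

    First restrict to the k highest-logit tokens, then keep the smallest
    prefix of those whose cumulative probability reaches p, and sample
    from the renormalized distribution over that prefix.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)
    # Indices of the k largest logits, in descending order.
    top_k_idx = np.argsort(logits)[::-1][:k]
    # Softmax over the kept logits (shifted for numerical stability).
    shifted = logits[top_k_idx] - logits[top_k_idx].max()
    probs = np.exp(shifted)
    probs /= probs.sum()
    # probs is already descending, so take the shortest prefix covering p.
    cum = np.cumsum(probs)
    cutoff = min(int(np.searchsorted(cum, p)) + 1, len(probs))
    keep_probs = probs[:cutoff] / probs[:cutoff].sum()
    choice = rng.choice(cutoff, p=keep_probs)
    return int(top_k_idx[choice])
```

Without the top-p cut, low-probability tail tokens in the top-k set can still be drawn, which is one common source of the "poor results" this release guards against.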

v0.26.1

09 Apr 21:49

New Features:

  • Allow customizing number of default calibration samples for quantize job.
  • CLIP has been rewritten as a single model
  • All YOLO class outputs export as int8

Deprecation Note:
Most model folders with the suffix _quantized are being deprecated. Use the equivalent un-quantized model folder instead.
When running export.py or evaluate.py, use the --quantize or --precision flag to choose the desired precision. Each model's README outlines which precisions are supported if the model can be quantized.
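Concretely, the migration described above looks roughly like this. The model name and the precision value are illustrative, and the exact argument form of --quantize/--precision may differ; check the model's README for the precisions it supports.

```shell
# Before (deprecated): the quantized model lived in its own _quantized folder.
# python -m qai_hub_models.models.resnet50_quantized.export

# After: run the un-quantized model folder's script and request a precision
# (flag usage is an assumption based on these notes; see the model README).
python -m qai_hub_models.models.resnet50.export --precision w8a8
```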

0.25.5

26 Mar 21:27

Fixes

  • Important fix for export scripts when using qai-hub 0.26.0.

0.25.2

14 Mar 02:38

New models:

  • BGNet (bgnet)
  • BiseNet (bisetnet)
  • DeformableDETR (deformable_detr)
  • NASNet (nasnet)
  • Nomic-Embed-Text (nomic_embed_text)
  • PidNet (pidnet)
  • ResNet-2Plus1D-Quantized (resnet_2plus1d_quantized)
  • ResNet-3D-Quantized (resnet_3d_quantized)
  • ResNet-Mixed-Convolution-Quantized (resnet_mixed_quantized)
  • RTMDet (rtmdet)
  • Video-MAE (video_mae)
  • Video-MAE-Quantized (video_mae_quantized)
  • YamNet (yamnet)
  • Yolo-X (yolox)

Reinstated models:

  • ConvNext-Tiny-W8A16-Quantized (convnext_tiny_w8a16_quantized)
  • Midas-Quantized (midas_quantized)
  • Simple-Bev (simple_bev_cam)
  • Stable-Diffusion-v1.5 (stable_diffusion_v1_5_w8a16_quantized)
  • Whisper-Medium-En (whisper_medium_en)

Performance numbers:

  • Updated performance numbers, including an upgrade to QAIRT 2.32.

Bug fixes

  • Llama3-TAIDE-LX-8B-Chat-Alpha1 export bug fix.
  • Various minor bug fixes.