[Face attributes Model] Output of sunglasses probability is always 1 · Issue #179 · quic/ai-hub-models · GitHub

[Face attributes Model] Output of sunglasses probability is always 1 #179

Open
chouxscream opened this issue Mar 17, 2025 · 4 comments
Labels
question Please ask any questions on Slack. This issue will be closed once responded to.

Comments

@chouxscream
chouxscream commented Mar 17, 2025

Hello, I have a question about the face attributes model output.

While the other outputs seem correct after applying the softmax function, the sunglasses output is always equal to or very close to 1.0.
I have not applied softmax to the sunglasses output, as pointed out in #137.
Whether or not the face in the input image is wearing sunglasses, the output does not reflect it correctly.
Am I missing something, such as preprocessing the input data? Below is the command I used.

python -m qai_hub_models.models.face_attrib_net.demo --image {image_path}

Thank you in advance.

mestrona-3 added the question label Mar 17, 2025
@mestrona-3

Hi @chouxscream, thanks for the question! I've asked the team that worked on this model and they have some follow-up questions:

Yes, you are handling the sunglasses result correctly, and there's no need to perform softmax. Though we'll need more information to pinpoint the exact problem. If you could provide some failure cases, it would be easier for us to identify the root cause. Are you using RGB data?

Regarding preprocessing, for optimal results we apply an affine transformation to the images and crop the faces according to the following criteria: the centers of the eyes and the mouth should be well aligned horizontally, and the distances from the top of the image to the eyes, from the eyes to the mouth, and from the mouth to the bottom should each equal 1/3 of the image height.
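The alignment criteria above can be illustrated with a similarity transform (rotation, uniform scale, and translation, which is a special case of an affine transform). This is only a minimal sketch and not the team's actual preprocessing. The function names `alignment_matrix` and `apply_matrix` are hypothetical, the landmarks are assumed to come from whatever detector is in use, and reading "aligned horizontally" as "the eye midpoint and the mouth center both sit on the vertical center line" is an assumption. The default size of 128 follows the input size mentioned later in this thread.

```python
def alignment_matrix(eye_left, eye_right, mouth, size=128):
    """Return a 2x3 matrix mapping source pixels to the aligned crop.

    Assumed target layout (an interpretation of the criteria above):
    eye midpoint at (size/2, size/3), mouth center at (size/2, 2*size/3).
    Points are (x, y) tuples in source-image pixel coordinates.
    """
    # Represent 2D points as complex numbers: a similarity transform
    # is then simply z -> a * z + b.
    eye_mid = complex((eye_left[0] + eye_right[0]) / 2,
                      (eye_left[1] + eye_right[1]) / 2)
    mouth_c = complex(mouth[0], mouth[1])
    if eye_mid == mouth_c:
        raise ValueError("eye midpoint and mouth center must differ")

    dst_eye = complex(size / 2, size / 3)
    dst_mouth = complex(size / 2, 2 * size / 3)

    # Two point correspondences determine a and b exactly.
    a = (dst_mouth - dst_eye) / (mouth_c - eye_mid)
    b = dst_eye - a * eye_mid

    # Expand the complex form into a real 2x3 matrix.
    return [[a.real, -a.imag, b.real],
            [a.imag, a.real, b.imag]]


def apply_matrix(matrix, point):
    """Map one (x, y) point through a 2x3 matrix."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


if __name__ == "__main__":
    # Made-up landmark coordinates, for illustration only.
    m = alignment_matrix((40, 50), (80, 50), (60, 90))
    print("eye midpoint ->", apply_matrix(m, (60, 50)))
    print("mouth center ->", apply_matrix(m, (60, 90)))
```

If OpenCV is available, a matrix of this shape, converted to a float NumPy array, is the form that `cv2.warpAffine(image, matrix, (size, size))` accepts for producing the aligned crop.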

@chouxscream
Author

Yes, I used RGB data.

This is the output using the sample face image from Qualcomm:
Image
"sunglasses": [
0.9999938011169434,
6.184478934301296e-06
]

This is the output using a face image with sunglasses downloaded from the internet:
Image
"sunglasses": [
0.99959796667099,
0.0004019975021947175
]

I used the SCRFD detector to crop the face from the input image. Should I use a different face detection method before running the attribute model?
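As a quick sanity check on the two reported "sunglasses" outputs, the sketch below confirms that each pair already sums to approximately 1, which is consistent with the advice not to apply softmax to this head. It also shows what a second softmax would do to the values: both pairs would collapse to roughly [0.73, 0.27]. The `softmax` helper is written here for illustration, and the thread does not state which index means "sunglasses present", so the code makes no claim about that.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Values copied from the outputs reported in this comment.
outputs = {
    "sample face image": [0.9999938011169434, 6.184478934301296e-06],
    "sunglasses image": [0.99959796667099, 0.0004019975021947175],
}

for name, pair in outputs.items():
    # A pair summing to ~1 already behaves like a probability distribution.
    print(name, "sum =", sum(pair))
    # Applying softmax again would flatten the distribution.
    print(name, "softmax applied again =", softmax(pair))
```

Since both images give nearly identical distributions, the post-processing looks unlikely to be the cause, which leaves the input (alignment, color space, value range) as the more plausible place to look.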

@mestrona-3

Hi chouxscream, thank you for the additional details. You're right, it does appear that the results are incorrect and differ significantly from our previous runs, both online and offline. We'll need more information to dig into the root cause.

Could you provide us with the following:

  • which device are you deploying to
  • can you provide some unit test data with groundtruth for us to investigate the model use with proper preprocessing/post processing

In the meantime, we'd recommend trying the quantized version as the accuracy should improve, depending on the platform you're deploying the model on. We also wanted to share that we'll soon be updating a new version trained using RGB instead of NIR data, so you should expect better accuracy in the near future release.

@chouxscream
Author

> which device are you deploying to

I plan to export the ONNX model to another NPU device. Since quantization is applied separately there, I did not use the quantized version from Qualcomm.

> We also wanted to share that we'll soon be updating a new version trained using RGB instead of NIR data,

So should I use NIR image input instead of RGB?

> can you provide some unit test data with groundtruth for us to investigate the model use with proper preprocessing/post processing

For preprocessing, I crop the face bounding box from the scrfd-10g detector and resize it to (128, 128).
No postprocessing is applied after inference.
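One way to see whether a plain bounding-box crop like this matches the layout described earlier in the thread (eyes at 1/3 of the height, mouth at 2/3) is to measure where the landmarks fall inside the box. The sketch below is only a diagnostic aid under stated assumptions: `layout_fractions` is a hypothetical helper, the landmark coordinates are assumed to come from the detector's keypoints, and the horizontal target of 0.5 is an interpretation of "aligned horizontally" rather than something the maintainers stated.

```python
def layout_fractions(bbox, eye_left, eye_right, mouth):
    """Report where the eyes and mouth sit inside a face crop.

    bbox is (x1, y1, x2, y2); landmarks are (x, y) in the same image
    coordinates. Returned values are fractions of the crop width/height.
    """
    x1, y1, x2, y2 = bbox
    width, height = x2 - x1, y2 - y1
    if width <= 0 or height <= 0:
        raise ValueError("bbox must have positive width and height")

    eye_x = (eye_left[0] + eye_right[0]) / 2
    eye_y = (eye_left[1] + eye_right[1]) / 2
    return {
        "eyes_from_top": (eye_y - y1) / height,      # criterion: 1/3
        "mouth_from_top": (mouth[1] - y1) / height,  # criterion: 2/3
        "eye_center_x": (eye_x - x1) / width,        # assumed target: 0.5
        "mouth_center_x": (mouth[0] - x1) / width,   # assumed target: 0.5
    }


if __name__ == "__main__":
    # Made-up detection, for illustration only.
    fractions = layout_fractions((10, 20, 130, 170), (50, 80), (95, 78), (72, 130))
    for key, value in fractions.items():
        print(f"{key}: {value:.3f}")
```

Large deviations from these fractions would suggest that the crop needs the affine alignment step before resizing, rather than a different detector.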
