Evaluation metrics(FID) #124
Comments
Hi @Zhu1116, may I check the inference code you used? Thanks! |
for sent_key, sent_info in tqdm(content):
    # COCO image ids are zero-padded to 12 digits in the file names
    imgid = str(sent_key).zfill(12)
    img_prompt_path = os.path.join(condition_image_folder_path, f"{imgid}.jpg")
    image = Image.open(img_prompt_path).convert("RGB")
    # Center-crop to a square, then resize to 512x512
    w, h, min_dim = image.size + (min(image.size),)
    image = image.crop(
        ((w - min_dim) // 2, (h - min_dim) // 2, (w + min_dim) // 2, (h + min_dim) // 2)
    ).resize((512, 512))
    condition = Condition("canny", image)
    result_img = generate(
        pipe,
        prompt=sent_info[0],
        conditions=[condition],
    ).images[0]
    # Save the generated image, the cropped reference image, and the canny map
    result_img.save(os.path.join(save_dir_gen, f"{imgid}.jpg"))
    image.save(os.path.join(save_dir_resized, f"{imgid}.jpg"))
    condition.condition.save(os.path.join(save_dir_canny, f"{imgid}.jpg"))
Like this. The prompts and images are from the COCO val dataset. |
Almost entirely copied from examples/spatial.ipynb |
Hi @Zhu1116, the inference code looks good. The only difference from our implementation is the extra central cropping, but I don't think that would cause such a performance drop.
|
Oh, so I should use the original COCO images? I used the central-cropped 512x512 images. |
No. Actually you should use the central cropped images. |
Could you please share your email for further discussion? |
OK, OK. Thank you very much for your careful reply. |
Sure, I'd be more than happy to. 246939556@qq.com |
Sorry, that email was wrong. 2469395556@qq.com |
Great work! I have a question. While replicating the paper, I generated 1000 images under Canny conditional control. I then used the following code to calculate the FID metric and got a result of around 70, which is quite different from the result of more than twenty reported in the paper. What could be the reason for this?
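For readers comparing FID numbers, here is a minimal sketch of the distance itself, since the value depends on exactly which features and how many samples go into it. It is not the code used by the poster or by the paper's evaluation. It assumes features have already been extracted for both image sets (standard FID uses Inception-v3 pooling features); `frechet_distance` is a hypothetical helper name, and the random arrays are placeholders for real features.

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two (N, D) feature arrays."""
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)

    # Tr(sqrt(cov_r @ cov_g)) is the sum of the square roots of the eigenvalues
    # of the product. For PSD covariances these are real and non-negative, so
    # small imaginary or negative parts are treated as numerical noise.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)


if __name__ == "__main__":
    # Placeholder features: 200 samples, 8 dimensions (assumption for the demo;
    # real Inception-v3 features are 2048-dimensional).
    rng = np.random.default_rng(0)
    real = rng.normal(size=(200, 8))
    gen = real + 1.0  # same covariance, shifted mean
    print(frechet_distance(real, real))
    print(frechet_distance(real, gen))
```

Because the covariances are estimated from the samples, the result changes with the number of images, so two FID values are only comparable when they use the same feature extractor, reference set, preprocessing, and sample count. Whether any of these explains the gap reported above is not established in this thread.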