Restart reproducibility issue with NEPTUNE due to bug in radiation_aerosols.f · Issue #1140 · NCAR/ccpp-physics · GitHub

Restart reproducibility issue with NEPTUNE due to bug in radiation_aerosols.f #1140


Open
climbfuji opened this issue Jun 4, 2025 · 10 comments · May be fixed by ufs-community/ccpp-physics#290

@climbfuji
Collaborator
climbfuji commented Jun 4, 2025

In the NEPTUNE CCPP physics, we discovered and fixed a restart reproducibility issue caused by the following two lines in radiation_aerosols.f:


The solution was to replace i3=i1 with i3=1 and j3=j1 with j3=1. In a sense, this removes an optimization. The bug fix is several years old, so we no longer remember all the details of why this was a problem with NEPTUNE but apparently not with the UFS. Our recollection is that it has to do with the fact that the UFS/FV3 cubed-sphere grid uses halos, whereas NEPTUNE does not. This is the commit message from back then:

Fixed a bug in the RRTMG code that interpolates aerosol properties from
the 5-degree database to the model grid. In some circumstances the
interpolation would give a different answer for the same physical point
on different processes. This also impacts the reproducibility of restarts.

We recently tested removing the bug fix and we got the reproducibility issue again.
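
As an illustration only, the following Python sketch shows one way a search that starts from the previously found index can return different coarse-grid cells for the same physical point, depending on which points a process handled before it. The search logic, the boundary tests, `DLTG`, `IMAXG`, and all function names are assumptions made for this toy example. It is not the code from radiation_aerosols.f, and the thread does not confirm that this is the mechanism behind the NEPTUNE issue.

```python
# Toy model (hypothetical): locate the coarse-grid cell that contains a
# model longitude. Indices are 1-based to mirror the i3 = 1 wording above.

DLTG = 5.0   # coarse-grid spacing in degrees (the 5-degree database)
IMAXG = 72   # assumed number of coarse cells around a latitude circle


def find_cell(lon, start):
    """Walk east or west from `start` until a cell accepts `lon` (0 <= lon < 360)."""
    i3 = start
    while True:
        dtmp = lon - DLTG * (i3 - 1)
        if dtmp > DLTG:
            i3 += 1          # target lies further east
        elif dtmp >= 0.0:
            return i3        # accepted: 0 <= dtmp <= DLTG
        else:
            i3 -= 1          # target lies further west


def lookup_carry_over(lons):
    """Optimized variant: each search starts where the previous one ended."""
    cells, start = [], 1
    for lon in lons:
        start = find_cell(lon, start)
        cells.append(start)
    return cells


def lookup_fixed_start(lons):
    """Variant in the spirit of i3 = 1: every search starts at cell 1."""
    return [find_cell(lon, 1) for lon in lons]


# The point at 10.0 degrees sits exactly on the edge between cells 2 and 3.
# Two hypothetical processes reach it after different neighbouring points.
process_a = [2.0, 10.0]
process_b = [33.0, 10.0]

print(lookup_carry_over(process_a)[-1])   # 2 (approached from the west)
print(lookup_carry_over(process_b)[-1])   # 3 (approached from the east)
print(lookup_fixed_start(process_a)[-1])  # 2
print(lookup_fixed_start(process_b)[-1])  # 2
```

In this toy, a point on a cell edge satisfies the acceptance test for both neighbouring cells, so the answer depends on the direction of approach. Starting every search from the same index removes that dependence at the cost of a slightly longer walk per point.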

Since we cannot assume that every model that uses CCPP physics is lucky enough to work with the current code, would it be prudent to remove this optimization in the authoritative codebase? My guess is that the impact on the performance is negligible.

@climbfuji
Collaborator Author

@matusmartini FYI

@yangfanglin

If the results for the UFS RT are reproducible, I agree we can make this change. @AnningCheng-NOAA Please take a look as well

@climbfuji
Collaborator Author

> If the results for the UFS RT are reproducible, I agree we can make this change. @AnningCheng-NOAA Please take a look as well

Thanks for your quick response, Fanglin!

@AnningCheng-NOAA
Contributor
AnningCheng-NOAA commented Jun 4, 2025 via email

@matusmartini
Contributor

Thank you @climbfuji for summarizing this issue here. We have started collecting all the bug fixes that we have in NEPTUNE and will follow up with more of them. Hopefully this one is straightforward to test. @areinecke is the author of the fix.

@climbfuji
Collaborator Author

Let me create a PR for the ufs/dev branch in the ufs-community GitHub organization so that it can be tested with the UFS.

@climbfuji
Collaborator Author

ufs-community#290

@dustinswales
Member

@climbfuji I agree with @yangfanglin and @AnningCheng-NOAA.
No RT changes, no problem.

@climbfuji
Collaborator Author

I will try to run on Hera

@climbfuji
Collaborator Author

I ran the full regression tests on Hera and everything was bit-for-bit (b4b) identical to the previous baseline. Will update the PR description accordingly.
