-
Notifications
You must be signed in to change notification settings - Fork 174
Emicorr speedup #9077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emicorr speedup #9077
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #9077 +/- ##
==========================================
+ Coverage 78.14% 78.35% +0.20%
==========================================
Files 505 505
Lines 46308 46057 -251
==========================================
- Hits 36188 36087 -101
+ Misses 10120 9970 -150
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Regression tests were run here:
likely due to astropy relying on bottleneck and no longer emitting a warning for all nan input. |
Tested (after removal of the bottleneck code) with jw01283001001_03101_00001_mirimage_uncal.fits (full frame, 10 ints, 100 groups), jw01668002001_04101_00001_mirimage_uncal.fits (4qpm subarray, 6 ints, 1251 groups), and jw04496004001_03103_00001-seg001_mirimage_uncal.fits (slitless subarray, 288 ints, 10 groups). Runtime goes from 127 to 80 seconds, 127 to 42 seconds, and 12 to 9 seconds respectively. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks fine to me. I am afraid I don't understand the ins and outs of numpy arrays to understand why these changes improved performance/memory consumption.
@@ -282,30 +280,32 @@ def apply_emicorr(output_model, emicorr_model, | |||
|
|||
for ninti in range(nints): | |||
log.debug(' Working on integration: {}'.format(ninti+1)) | |||
|
|||
# Read in this integration | |||
data = output_model.data[ninti].copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this copy improves performance/memory consumption? It seems weird to me that you couldn't just be set as data = data[ninti]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kmacdonald-stsci there are three reasons for the improvement:
- The copy of data is done inside a loop, one integration at a time, rather than copying the entire array outside the loop. This array is being modified inside the loop and I have not checked in detail to see if it is safe not to copy; it was much easier to just copy one slice at a time rather than check to see if the copy is absolutely necessary.
np.where
is expensive! For example,np.where(x == 2)
first evaluatesx == 2
to create a boolean array, and thennumpy
constructs oneint64
array for each dimension ofx
. As another example,
badpix = np.where(~np.isfinite(x))
x[badpix] = 0
will likely be much slower than
x[~np.isfinite(x)] = 0
or, equivalently, if you want to be explicit,
badpix = ~np.isfinite(x)
x[badpix] = 0
The latter two options never createint64
arrays.- Finally, array slicing is much more efficient than advanced indexing, which requires a more general loop and array copy under the hood. Example:
np.mean(x[1::4])
is much faster than
np.mean(x[np.arange(1, len(x), 4)])
Point (1) here is why memory usage is less; points (2) and (3) are why it runs faster.
Running regression tests here: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regression tests are passing. Changes here (without bottleneck) look good.
Resolves JP-3819
Closes #
This PR significantly improves the speed and memory usage of emicorr with very small changes: moving an array copy inside a loop, using native array slicing rather than indexing with integer arrays, and (optionally) using bottleneck's implementations of nanmedian. The user is warned if bottleneck is not installed. For jw01668002001_04101_00001_mirimage on my Mac, the EMI step goes from 120s to 30s with bottleneck installed.
Tasks
Build 11.3
(use the latest build if not sure)no-changelog-entry-needed
)changes/
:echo "changed something" > changes/<PR#>.<changetype>.rst
(see below for change types)docs/
pageokify_regtests
to update the truth filesnews fragment change types...
changes/<PR#>.general.rst
: infrastructure or miscellaneous changechanges/<PR#>.docs.rst
changes/<PR#>.stpipe.rst
changes/<PR#>.datamodels.rst
changes/<PR#>.scripts.rst
changes/<PR#>.fits_generator.rst
changes/<PR#>.set_telescope_pointing.rst
changes/<PR#>.pipeline.rst
stage 1
changes/<PR#>.group_scale.rst
changes/<PR#>.dq_init.rst
changes/<PR#>.emicorr.rst
changes/<PR#>.saturation.rst
changes/<PR#>.ipc.rst
changes/<PR#>.firstframe.rst
changes/<PR#>.lastframe.rst
changes/<PR#>.reset.rst
changes/<PR#>.superbias.rst
changes/<PR#>.refpix.rst
changes/<PR#>.linearity.rst
changes/<PR#>.rscd.rst
changes/<PR#>.persistence.rst
changes/<PR#>.dark_current.rst
changes/<PR#>.charge_migration.rst
changes/<PR#>.jump.rst
changes/<PR#>.clean_flicker_noise.rst
changes/<PR#>.ramp_fitting.rst
changes/<PR#>.gain_scale.rst
stage 2
changes/<PR#>.assign_wcs.rst
changes/<PR#>.badpix_selfcal.rst
changes/<PR#>.msaflagopen.rst
changes/<PR#>.nsclean.rst
changes/<PR#>.imprint.rst
changes/<PR#>.background.rst
changes/<PR#>.extract_2d.rst
changes/<PR#>.master_background.rst
changes/<PR#>.wavecorr.rst
changes/<PR#>.srctype.rst
changes/<PR#>.straylight.rst
changes/<PR#>.wfss_contam.rst
changes/<PR#>.flatfield.rst
changes/<PR#>.fringe.rst
changes/<PR#>.pathloss.rst
changes/<PR#>.barshadow.rst
changes/<PR#>.photom.rst
changes/<PR#>.pixel_replace.rst
changes/<PR#>.resample_spec.rst
changes/<PR#>.residual_fringe.rst
changes/<PR#>.cube_build.rst
changes/<PR#>.extract_1d.rst
changes/<PR#>.resample.rst
stage 3
changes/<PR#>.assign_mtwcs.rst
changes/<PR#>.mrs_imatch.rst
changes/<PR#>.tweakreg.rst
changes/<PR#>.skymatch.rst
changes/<PR#>.exp_to_source.rst
changes/<PR#>.outlier_detection.rst
changes/<PR#>.tso_photometry.rst
changes/<PR#>.stack_refs.rst
changes/<PR#>.align_refs.rst
changes/<PR#>.klip.rst
changes/<PR#>.spectral_leak.rst
changes/<PR#>.source_catalog.rst
changes/<PR#>.combine_1d.rst
changes/<PR#>.ami.rst
other
changes/<PR#>.wfs_combine.rst
changes/<PR#>.white_light.rst
changes/<PR#>.cube_skymatch.rst
changes/<PR#>.engdb_tools.rst
changes/<PR#>.guider_cds.rst