Emicorr speedup #9077

t-brandt · 2025-01-15T03:01:32Z

Closes #

This PR significantly improves the speed and memory usage of emicorr with very small changes: moving an array copy inside a loop, using native array slicing rather than indexing with integer arrays, and (optionally) using bottleneck's implementations of nanmedian. The user is warned if bottleneck is not installed. For jw01668002001_04101_00001_mirimage on my Mac, the EMI step goes from 120s to 30s with bottleneck installed.
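The optional bottleneck usage described above can be sketched as a soft dependency. This is only an illustration of the pattern, not the code from this PR: the helper name `get_nanmedian` and the warning text are hypothetical, and the sketch assumes `bottleneck.nanmedian` is used as a drop-in for `numpy.nanmedian`.

```python
import warnings

import numpy as np


def get_nanmedian():
    """Return bottleneck's nanmedian if available, else numpy's (hypothetical helper)."""
    try:
        import bottleneck as bn  # optional dependency
    except ImportError:
        # Warn the user that the slower fallback is in use
        warnings.warn(
            "bottleneck is not installed; falling back to numpy.nanmedian",
            stacklevel=2,
        )
        return np.nanmedian
    return bn.nanmedian


nanmedian = get_nanmedian()

# Small array with NaNs; the median of each row ignores them
data = np.array([[1.0, np.nan, 3.0], [4.0, 5.0, np.nan]])
row_medians = nanmedian(data, axis=1)
```

Either implementation should give the same row medians here, so the calling code does not need to know which one was selected.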

Tasks

news fragment change types...

changes/<PR#>.general.rst: infrastructure or miscellaneous change
changes/<PR#>.docs.rst
changes/<PR#>.stpipe.rst
changes/<PR#>.datamodels.rst
changes/<PR#>.scripts.rst
changes/<PR#>.fits_generator.rst
changes/<PR#>.set_telescope_pointing.rst
changes/<PR#>.pipeline.rst

stage 1

changes/<PR#>.group_scale.rst
changes/<PR#>.dq_init.rst
changes/<PR#>.emicorr.rst
changes/<PR#>.saturation.rst
changes/<PR#>.ipc.rst
changes/<PR#>.firstframe.rst
changes/<PR#>.lastframe.rst
changes/<PR#>.reset.rst
changes/<PR#>.superbias.rst
changes/<PR#>.refpix.rst
changes/<PR#>.linearity.rst
changes/<PR#>.rscd.rst
changes/<PR#>.persistence.rst
changes/<PR#>.dark_current.rst
changes/<PR#>.charge_migration.rst
changes/<PR#>.jump.rst
changes/<PR#>.clean_flicker_noise.rst
changes/<PR#>.ramp_fitting.rst
changes/<PR#>.gain_scale.rst

stage 2

changes/<PR#>.assign_wcs.rst
changes/<PR#>.badpix_selfcal.rst
changes/<PR#>.msaflagopen.rst
changes/<PR#>.nsclean.rst
changes/<PR#>.imprint.rst
changes/<PR#>.background.rst
changes/<PR#>.extract_2d.rst
changes/<PR#>.master_background.rst
changes/<PR#>.wavecorr.rst
changes/<PR#>.srctype.rst
changes/<PR#>.straylight.rst
changes/<PR#>.wfss_contam.rst
changes/<PR#>.flatfield.rst
changes/<PR#>.fringe.rst
changes/<PR#>.pathloss.rst
changes/<PR#>.barshadow.rst
changes/<PR#>.photom.rst
changes/<PR#>.pixel_replace.rst
changes/<PR#>.resample_spec.rst
changes/<PR#>.residual_fringe.rst
changes/<PR#>.cube_build.rst
changes/<PR#>.extract_1d.rst
changes/<PR#>.resample.rst

stage 3

changes/<PR#>.assign_mtwcs.rst
changes/<PR#>.mrs_imatch.rst
changes/<PR#>.tweakreg.rst
changes/<PR#>.skymatch.rst
changes/<PR#>.exp_to_source.rst
changes/<PR#>.outlier_detection.rst
changes/<PR#>.tso_photometry.rst
changes/<PR#>.stack_refs.rst
changes/<PR#>.align_refs.rst
changes/<PR#>.klip.rst
changes/<PR#>.spectral_leak.rst
changes/<PR#>.source_catalog.rst
changes/<PR#>.combine_1d.rst
changes/<PR#>.ami.rst

other

changes/<PR#>.wfs_combine.rst
changes/<PR#>.white_light.rst
changes/<PR#>.cube_skymatch.rst
changes/<PR#>.engdb_tools.rst
changes/<PR#>.guider_cds.rst

codecov · 2025-01-15T03:39:15Z

Codecov Report

Attention: Patch coverage is 81.25000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 78.35%. Comparing base (d5d79e9) to head (92086e9).
Report is 936 commits behind head on main.

Files with missing lines	Patch %	Lines
jwst/emicorr/emicorr.py	81.25%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #9077      +/-   ##
==========================================
+ Coverage   78.14%   78.35%   +0.20%     
==========================================
  Files         505      505              
  Lines       46308    46057     -251     
==========================================
- Hits        36188    36087     -101     
+ Misses      10120     9970     -150

Flag	Coverage Δ		*Carryforward flag
nightly	`77.78% <ø> (+<0.01%)`	⬆️	Carriedforward from d5d79e9

*This pull request uses carry forward flags. Click here to find out more.

braingram · 2025-01-15T13:37:15Z

Regression tests were run here:
https://github.com/spacetelescope/RegressionTests/actions/runs/12788227847/job/35649174699
against a modified branch:
https://github.com/braingram/jwst/tree/require_bottleneck
that forced installation of bottleneck (to work around https://github.com/spacetelescope/RegressionTests/issues/190).
The run shows 1 failure in an unrelated test:

jwst/jwst/clean_flicker_noise/tests/test_clean_flicker_noise.py

Line 255 in 8d84c0c

with pytest.warns(RuntimeWarning):

likely due to astropy relying on bottleneck and no longer emitting a warning for all-NaN input.
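For context, a minimal sketch of the numpy behavior that a `pytest.warns(RuntimeWarning)` check like the one above would rely on. It only demonstrates that `numpy.nanmedian` warns on all-NaN input; whether the bottleneck-backed path stays silent is taken from the comment above and is not verified here. The helper name `collect_runtime_warnings` is made up for this example.

```python
import warnings

import numpy as np


def collect_runtime_warnings(func, values):
    """Call func(values) and return (result, list of RuntimeWarning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is suppressed
        result = func(values)
    messages = [
        str(w.message) for w in caught if issubclass(w.category, RuntimeWarning)
    ]
    return result, messages


# All-NaN input: numpy returns NaN and emits a RuntimeWarning
result, messages = collect_runtime_warnings(np.nanmedian, np.full(5, np.nan))
```

If the median implementation underneath is swapped for one that returns NaN without warning, a test asserting that the warning is raised will fail even though the numerical result is unchanged.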

drlaw1558 · 2025-01-23T22:06:57Z

Tested (after removal of the bottleneck code) with jw01283001001_03101_00001_mirimage_uncal.fits (full frame, 10 ints, 100 groups), jw01668002001_04101_00001_mirimage_uncal.fits (4qpm subarray, 6 ints, 1251 groups), and jw04496004001_03103_00001-seg001_mirimage_uncal.fits (slitless subarray, 288 ints, 10 groups).

Runtime goes from 127 to 80 seconds, 127 to 42 seconds, and 12 to 9 seconds respectively.
Resulting rateint files before/after this change are identical.

kmacdonald-stsci

It looks fine to me. I am afraid I don't understand the ins and outs of numpy arrays well enough to see why these changes improved performance/memory consumption.

kmacdonald-stsci · 2025-01-27T23:12:04Z

jwst/emicorr/emicorr.py

@@ -282,30 +280,32 @@ def apply_emicorr(output_model, emicorr_model,

        for ninti in range(nints):
            log.debug('  Working on integration: {}'.format(ninti+1))
-
+            # Read in this integration
+            data = output_model.data[ninti].copy()


Is there a reason this copy improves performance/memory consumption? It seems weird to me that it couldn't just be set as data = data[ninti].

@kmacdonald-stsci there are three reasons for the improvement:

1. The copy of data is done inside a loop, one integration at a time, rather than copying the entire array outside the loop. This array is being modified inside the loop and I have not checked in detail to see if it is safe not to copy; it was much easier to just copy one slice at a time rather than check to see if the copy is absolutely necessary.

2. np.where is expensive! For example, np.where(x == 2) first evaluates x == 2 to create a boolean array, and then numpy constructs one int64 array for each dimension of x. As another example,

       badpix = np.where(~np.isfinite(x))
       x[badpix] = 0

   will likely be much slower than

       x[~np.isfinite(x)] = 0

   or, equivalently, if you want to be explicit,

       badpix = ~np.isfinite(x)
       x[badpix] = 0

   The latter two options never create int64 arrays.

3. Finally, array slicing is much more efficient than advanced indexing, which requires a more general loop and array copy under the hood. Example:

       np.mean(x[1::4])

   is much faster than

       np.mean(x[np.arange(1, len(x), 4)])

Point (1) here is why memory usage is less; points (2) and (3) are why it runs faster.
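The three points can be checked on small stand-in arrays. This is a sketch under assumed shapes and made-up variable names (`cube`, `x`, `y`), not the emicorr code itself; it shows what each construct allocates rather than timing it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Point (1): copy one integration at a time instead of the whole cube.
# `cube` is a small stand-in for a (nints, ny, nx) data array.
cube = rng.normal(size=(4, 16, 16))
cube_before = cube.copy()  # kept only so the check below can compare
scratch_is_independent = True
for i in range(cube.shape[0]):
    data = cube[i].copy()  # temporary is one integration in size
    data -= data.mean()    # scratch work touches only the copy
    scratch_is_independent &= not np.shares_memory(data, cube)
cube_unchanged = np.array_equal(cube, cube_before)

# Point (2): np.where builds one integer index array per dimension,
# while a boolean mask is a single array of bools.
x = rng.normal(size=(32, 32))
x[::7, ::5] = np.inf
index_arrays = np.where(~np.isfinite(x))
mask = ~np.isfinite(x)

via_where = x.copy()
via_where[index_arrays] = 0
via_mask = x.copy()
via_mask[mask] = 0

# Point (3): a basic slice is a view; indexing with an integer array copies.
y = np.arange(1000.0)
sliced = y[1::4]
fancy = y[np.arange(1, len(y), 4)]
```

Both routes in points (2) and (3) give the same values; the difference is in the temporaries created along the way, which is where the time goes on large arrays.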

melanieclarke · 2025-01-29T22:20:36Z

Running regression tests here:
https://github.com/spacetelescope/RegressionTests/actions/runs/13041530658

melanieclarke

Regression tests are passing. Changes here (without bottleneck) look good.

Timothy Brandt and others added 9 commits December 13, 2024 13:20

Small changes, including bottleneck dependency, for speedup

ab7c76d

added bottleneck dependency

1c2ad46

Removed testing break

75bd6c0

Merge branch 'spacetelescope:main' into emicorr_speedup

7901020

Removed unnecessary array copy

7f7f676

Merge branch 'spacetelescope:main' into emicorr_speedup

465ea0f

Changed bottleneck to a soft dependency

103710a

Revised soft dependency

0497751

Revised soft dependency

8ca1a04

t-brandt requested a review from a team as a code owner January 15, 2025 03:01

stscijgbot-jp mentioned this pull request Jan 15, 2025

Speed up emicorr, consider use of bottleneck package #9008

Closed

Town crier for 9077 emicorr

bbc844f

t-brandt requested a review from a team as a code owner January 15, 2025 03:10

Fixed ruff style for unused error

1b8d758

Removed soft bottleneck dependency

aee27ab

kmacdonald-stsci approved these changes Jan 27, 2025

melanieclarke added this to the Build 11.3 milestone Jan 29, 2025

melanieclarke added 2 commits January 29, 2025 17:16

Merge branch 'main' into emicorr_speedup

aae2889

Fix code style

d5d79e9

Merge branch 'main' into emicorr_speedup

92086e9

melanieclarke approved these changes Jan 30, 2025

melanieclarke merged commit 17b3a69 into spacetelescope:main Jan 30, 2025
23 checks passed
