8000 JP-2664: Fix NRS-MOS source ids to be 5 digits and positive by tapastro · Pull Request #6915 · spacetelescope/jwst · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

JP-2664: Fix NRS-MOS source ids to be 5 digits and positive #6915

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 8, 2022

Conversation

tapastro
Copy link
Contributor
@tapastro tapastro commented Jul 7, 2022

Closes #
Resolves JP-2664

Description

This PR adds a check on the source_id created in calwebb_spec3, and modifies values that fall outside 0-99999. The wording in the JP ticket indicated this is a temporary fix, so it has been worded/coded as such; if I misinterpreted the level of 'hotfix' involved, I can edit the addition.

Checklist

  • Tests
  • Documentation
  • Change log
  • Milestone
  • Label(s)

@codecov
Copy link
codecov bot commented Jul 7, 2022

Codecov Report

Merging #6915 (a9c5cde) into master (f10bf3b) will decrease coverage by 0.13%.
The diff coverage is 54.83%.

@@            Coverage Diff             @@
##           master    #6915      +/-   ##
==========================================
- Coverage   79.48%   79.35%   -0.14%     
==========================================
  Files         418      418              
  Lines       37352    37420      +68     
==========================================
+ Hits        29691    29694       +3     
- Misses       7661     7726      +65     
Flag Coverage Δ *Carryforward flag
nightly 79.45% <100.00%> (-0.02%) ⬇️ Carriedforward from 3dec2d5
unit 53.25% <4.00%> (-0.04%) ⬇️

*This pull request uses carry forward flags. Click here to find out more.

Impacted Files Coverage Δ
jwst/pipeline/calwebb_spec3.py 56.49% <54.83%> (-26.08%) ⬇️
jwst/associations/lib/log_config.py 89.55% <0.00%> (-10.45%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f10bf3b...a9c5cde. Read the comment docs.

@tapastro tapastro changed the title JP-2664: Fix NRS-MOS source ids to be <=5 digits and positive JP-2664: Fix NRS-MOS source ids to be 5 digits and positive Jul 7, 2022
@tapastro
Copy link
Contributor Author
tapastro commented Jul 7, 2022

@hbushouse Addressed the source_id listed in the datamodels - should there be a log message mapping old to new source_ids?

@hbushouse
Copy link
Collaborator

@hbushouse Addressed the source_id listed in the datamodels - should there be a log message mapping old to new source_ids?

Yes, I think a log message would be nice to have.

@hbushouse
Copy link
Collaborator

Some test data files are in the directory listed in JP-2664. None of them have any negative source_ids, but almost all of them are > 99999. So that would make a good test to see the new "sxxxxx" fields in the output file names and SOURCEID values in the headers.

@tapastro
Copy link
Contributor Author
tapastro commented Jul 7, 2022

Some test data files are in the directory listed in JP-2664. None of them have any negative source_ids, but almost all of them are > 99999. So that would make a good test to see the new "sxxxxx" fields in the output file names and SOURCEID values in the headers.

I had data from the same PID with negative values (which spawned this ticket), and I find similar results as found for the >5 digit numbers - source_ids are created starting from 00001 and work their way up. Loaded datamodels show matching values. Results are in the ticket directory (for sources until I lost VPN connection and broke the processing run).

@hbushouse
Copy link
Collaborator

Some test data files are in the directory listed in JP-2664. None of them have any negative source_ids, but almost all of them are > 99999. So that would make a good test to see the new "sxxxxx" fields in the output file names and SOURCEID values in the headers.

I had data from the same PID with negative values (which spawned this ticket), and I find similar results as found for the >5 digit numbers - source_ids are created starting from 00001 and work their way up. Loaded datamodels show matching values. Results are in the ticket directory (for sources until I lost VPN connection and broke the processing run).

Yeah, the earlier version of the data that you had was created using a bad MSA meta file that had the source_id column of the shutters table defined as int16 instead of int32. So all of the source_ids bigger than the int16 limit rolled over to negative numbers. The ones in the data directory now were reprocessed using a corrected MSA metafile from Mike, which made all the negative values go away. But legit negative values can still occur for slits that don't have a real source from the catalog.

Copy link
Collaborator
@stscieisenhamer stscieisenhamer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Is the reason to limit to 5 digits only to prevent complications down the line, specifically archive? Just curious.

@hbushouse
Copy link
Collaborator

Is the reason to limit to 5 digits only to prevent complications down the line, specifically archive? Just curious.

Yes. I believe the archive ingest is setup to expect 5, and only 5, digits in the source id field of the file names and I don't think anyone has checked in detail yet to know whether Cal is OK with creating ones that have more than 5 digits. Haven't actually checked the part(s) of the code yet that fill in the {source_id} part of product names somewhere in level 3 processing. Do you happen to recall where that's done?

@hbushouse
Copy link
Collaborator
hbushouse commented Jul 8, 2022

Ah hah. Found it: https://github.com/spacetelescope/jwst/blob/master/jwst/associations/lib/rules_level3_base.py#L621 calwebb_spec3 calls format_product, which definitely has the source_id field in the product file names hardwired to 5 digits.

Ack! Never mind. Look what happens:

In [4]: output_file='jw01117-o007_{source_id}_nirspec_clear-prism'
In [5]: format_product(output_file,source_id=3)
Out[5]: 'jw01117-o007_s00003_nirspec_clear-prism'
In [8]: format_product(output_file,source_id=1234567)
Out[8]: 'jw01117-o007_s1234567_nirspec_clear-prism'

So it appears to be more than happy to take ids longer than 5 digits, despite the formatting rule.

Copy link
Collaborator
@hbushouse hbushouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good. Results look good (including the internal SOURCEID keyword values). I approve.

@hbushouse hbushouse merged commit 086ee33 into spacetelescope:master Jul 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0