-
Notifications
You must be signed in to change notification settings - Fork 120
Extending options for populating Taken Date from file-metadata #4457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
17 commits
Select commit
Hold shift + click to select a range
f17c81e
temp loader behaviour to analyse takenDate setting behaviour
AndyKilmory 7a7ca6c
temp loader behaviour to analyse takenDate setting behaviour
AndyKilmory 3e7646a
testing additional taken date mappings
AndyKilmory 6927dfb
removing unwanted blank line
AndyKilmory bb33562
Merge branch 'main' into t1722-taken-date
AndyKilmory fae8783
Merge branch 'main' into t1722-taken-date
AndyKilmory df31e31
temp code to review Taken Date setting process
AndyKilmory 6a1555a
adding Time Created to IptcAlt date created/date taken construct to g…
AndyKilmory 8beca7b
correct for UTC offset element in Created Time iptc element
AndyKilmory c08fcb9
correct syntax error in use of iptc Created Time
AndyKilmory b5cb654
using pre evaluated options to set date taken
AndyKilmory ce1ee98
Merge branch 'main' into t1722-taken-date
AndyKilmory 083e538
now working to account for feasible datetime values based on local im…
AndyKilmory 7aba2b8
Merge branch 'main' into t1722-taken-date
AndyKilmory 74505f9
updated modifications to setting of taken date - removed use of xmp:C…
AndyKilmory acf408a
Merge branch 'main' into t1722-taken-date
AndyKilmory 81c49d5
Merge branch 'main' into t1722-taken-date
AndyKilmory File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been kicking around the idea of setting the time portion to 0 to help communicate that while we're still willing to believe in the date the picture was taken, we're much less certain of the time... the logic becomes a bit more knotty, but how does this look to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Andrew, thanks for the feedback - I'm not against your suggested changes but I don't think they necessarily fix the issue - just mask it in a different way. The underlying problem relates to the time information we are being sent in images from agencies - in most cases it appears that the timestamp on the image relates to the time taken in the local timezone, unfortunately no information about the timezone is included in this information - so the code just assumes UTC as a 'best guess'. When this partial time information is coupled to the upload time of the image, and the lag between the image being taken and being sent across by the agency, we get the current peculiar behaviour of some images - notably those from Australia, China and the Far East - having the Taken Date left unset. Images from Eastern Europe and the Middle East sometimes end up without a taken date and sometimes have taken date set - depending on the lag between image taken and image upload. Whereas images form the US almost always get Taken Date set as they are behind UTC. However, the vast majority of images - whether we set Taken Date or not - are subject to the same problem - the time element is normally local time and we're interpreting as UTC.
My concerns with the approach of nulling time if it appears 'incorrect' is that will suggest that we're confident about the time taken on other Taken Date values - but they are quite probably just as 'wrong' as those that we are nulling - its just that the lag between time taken and upload time makes the time appear 'correct' or that the image was taken in a timezone behind UTC. So I'm not sure if we're really adding that value by adding an arbitrary rule in this way? Perhaps we need to add in some guidance into interpreting the time element of taken date? Users could user the location context of the picture to take a more nuanced view of the time taken information. We would also be losing information regarding sequencing within a particular group of pictures - which could be of value to users?
I have no strong feelings about the final solution we opt for - I would like to see us make some changes so that we're getting taken date set on more of the images - I'd be happy to go with your suggestion - it looks good from a technical perspective - but I'd just like some business input into the decision making process. I'll raise it at standup today and see what (if any) consensus there is in the BBC. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I agree entirely - my only point is that this is the only case where we know for sure that we don't know what the correct time would be. The rest of the time we've no way of checking whether the time is correct, so we have to trust the supplier has set the right date.
That said its a bit messy either way, so I'm very happy to merge as is while we think it through?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks both, very good points all around (eg. ordering!). I also don’t have strong feelings. But would just like to record some related thoughts. In general, I don’t think we should go too far in inventing data we don’t actually have…
January
seen next to01
indicating a potential partial date, there is absolutely no way of knowing we didn’t get the timezone+saving)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi both, I spoke with the team at standup about this issue. David's feeling was that it was somewhat arbitrary to exlcude the time from images coming from the far east due to the timezone issues meaning they appear to be taken in the future, but to accept times from the US because they happen to be running behind UTC - both values are equally 'incorrect'. His sense (and that of the team as a whole) was that we either allow feasible times to be included (we'll still not set Taken Date when agencies - PA - are misusing the field) or we exclude times from all images where we don't have proper timezone information in the field to allow us to interpret it correctly (this will mean nulling time for a lot of agency images and losing the internal sequence information for any group of images). I think in the end the preference was to include time when its feasible from a timezone perspective as originally suggested. Perhaps we could have a talk through these issues at the next grid hour? Without the agencies providing better data we're always going to be compromised - but that is a very long term aspiration!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Andrew for approval