Enable asynchronous image processing by stefpiatek · Pull Request #364 · SAFEHR-data/PIXL · GitHub

Enable asynchronous image processing #364


Merged

merged 117 commits into main from stef/async-consuming on Apr 12, 2024

Conversation

Contributor
@stefpiatek stefpiatek commented Apr 4, 2024

May not fully get there with documentation today, but I thought it was worth getting eyes on this. I've added a stripped-down log, in reverse order (most recent at the top), of trying to work out what configuration should be set.

TODO:

  • Configure Orthanc settings to be set using build args
  • Document configuration in READMEs
  • Fix imaging API tests
  • Fix system tests

Summary of learnings and changes

  • I think it was overloading Orthanc Raw with pending jobs that caused it to freeze, which then caused the imaging API to block and time out on the RabbitMQ heartbeat
  • We now check whether a job has failed in Orthanc Raw and return early, rather than waiting for a timeout
  • Use the async REST API so that we block less, and so we can resend images that already exist but haven't yet been exported (a rough sketch of this pattern follows the list)
  • Use loguru for logging, mostly because it gives nice defaults, is easy to configure, and we don't have to mess around with it as much; it also allows f-string-style formatting. We can roll it out to the other services gradually
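To make the REST change concrete, here's a minimal sketch of the async submit-then-poll pattern using aiohttp and loguru. It is illustrative only: the `PIXL` modality name, URL, and payload are assumptions for the example rather than the actual PIXL code, although `/modalities/{name}/store` with `"Asynchronous": true` and `/jobs/{id}` are standard Orthanc REST API endpoints.

```python
"""Illustrative sketch only: submit an Orthanc job asynchronously, then poll it
and return early on failure instead of letting the RabbitMQ heartbeat time out.
The modality name and URL are made up; the endpoints are standard Orthanc."""
import asyncio

import aiohttp
from loguru import logger


async def export_study(session: aiohttp.ClientSession, orthanc_url: str, study_id: str) -> None:
    # "Asynchronous": True makes Orthanc return a job ID immediately instead of
    # blocking the HTTP request until the transfer has finished.
    async with session.post(
        f"{orthanc_url}/modalities/PIXL/store",  # hypothetical destination modality
        json={"Resources": [study_id], "Asynchronous": True},
    ) as response:
        response.raise_for_status()
        job_id = (await response.json())["ID"]

    # Poll the job, returning early as soon as it fails rather than waiting for
    # a consumer timeout.
    while True:
        async with session.get(f"{orthanc_url}/jobs/{job_id}") as response:
            response.raise_for_status()
            state = (await response.json())["State"]
        if state == "Success":
            logger.info("Job {} for study {} succeeded", job_id, study_id)
            return
        if state == "Failure":
            logger.error("Job {} for study {} failed, returning early", job_id, study_id)
            raise RuntimeError(f"Orthanc job {job_id} failed")
        await asyncio.sleep(1)  # Pending/Running: yield control to other messages
```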

Configuration changes

Will update as I go along. When the rate drops, it's because the VNA isn't completing the transfer job within 2 minutes and we hit timeouts.
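For reference, here's my best guess at how the knobs below map onto configuration. The Orthanc key names are my reading of the Orthanc configuration reference and should be double-checked; how PIXL actually injects them is still covered by the build-args TODO above.

```python
"""Sketch of the Orthanc settings being tuned below; key names are assumptions
to be verified against the Orthanc configuration reference."""
import json

orthanc_overrides = {
    "JobsHistorySize": 100,   # "max job history": keep completed jobs queryable
    "ConcurrentJobs": 25,     # "concurrent jobs"
    "DicomThreadsCount": 5,   # "dicom threads"
    "HttpThreadsCount": 50,   # "http threads"
}

# "messages in flight" and "m/s from queue" are PIXL consumer settings
# (prefetch count and the token-bucket rate in messages/second), not Orthanc options.
print(json.dumps(orthanc_overrides, indent=2))
```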

  • after async rest
  • 100 max job history
  • 25 concurrent jobs
  • 5 dicom threads (default)
  • 50 http threads (default)
  • 50 messages in flight
  • 3 m/s from queue

Didn't really change much, except that it's VNA-load based; ~25 messages were being nacked and requeued, which makes sense at least.


Tried after 7pm and not much of a change



  • after async rest
  • 100 max job history
  • 50 concurrent jobs
  • 5 dicom threads (default)
  • 50 http threads (default)
  • 100 messages in flight
  • 3 m/s from queue

Doesn't seem to help, though it was requeueing 50 messages a second, so there's more capacity. Maybe try fewer concurrent jobs and messages in flight to see if that gets us back to ~2 messages per second being confirmed; otherwise it may just be the rest of the load on the VNA.

  • after async rest
  • 100 max job history
  • 50 concurrent jobs
  • 5 dicom threads (default)
  • 50 http threads (default)
  • 50 messages in flight
  • 3 m/s from queue

Didn't really give much of an increase; we rarely hit rate limiting by the token bucket, so maybe try increasing the max in flight?
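As an aside on the "3 m/s from queue" figure: the consumer throttles itself with a token bucket, roughly like the sketch below. This is a minimal illustration under my own assumptions (class and attribute names made up), not PIXL's actual implementation.

```python
"""Minimal token-bucket throttle for the queue consumer, assuming a fixed
refill rate (e.g. 3 messages/second). Illustrative only, not PIXL's code."""
import asyncio
import time


class TokenBucket:
    def __init__(self, rate_per_second: float, capacity: float) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = capacity
        self.last_refill = time.monotonic()

    async def acquire(self) -> None:
        # Refill based on elapsed time, then wait until a whole token is available.
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
            self.last_refill = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            await asyncio.sleep((1 - self.tokens) / self.rate)


# Usage: `await bucket.acquire()` before handling each message; with
# rate_per_second=3 the consumer never exceeds ~3 messages/second.
bucket = TokenBucket(rate_per_second=3, capacity=3)
```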

  • after async rest
  • 100 max job history
  • 30 concurrent jobs
  • 5 dicom threads (default)
  • 50 http threads (default)
  • 50 messages in flight
  • 3 m/s from queue

Increasing the number of concurrent jobs maybe helped? Will try 50 to check.

  • after async rest
  • 100 max job history
  • 10 concurrent jobs
  • 5 dicom threads (default)
  • 100 http threads
  • 100 messages in flight
  • 3 m/s from queue

The max job history increase stopped the errors about jobs not being found. Pending messages in Orthanc are slowing down processing, so increase the number of concurrent jobs to 30 next time.

  • after async rest
  • 10 max job history (default)
  • 10 concurrent jobs
  • 5 dicom threads (default)
  • 100 http threads
  • 100 messages in flight
  • 3 m/s from queue

Found errors where the job no longer existed in Orthanc, so increase the max job history so that a job still exists after it succeeds.
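To spell out the failure mode: once a finished job is evicted from Orthanc's job history (default size 10, as in the run above), GET /jobs/{id} returns 404 even though the job succeeded, which we were treating as a failure. A sketch of the status check (illustrative only, not the actual PIXL handler):

```python
"""Illustrative only: querying a job that has been evicted from Orthanc's job
history returns 404 even if it succeeded, so a bigger JobsHistorySize keeps
completed jobs visible long enough for us to confirm them."""
import aiohttp


async def job_state(session: aiohttp.ClientSession, orthanc_url: str, job_id: str) -> str:
    async with session.get(f"{orthanc_url}/jobs/{job_id}") as response:
        if response.status == 404:
            # Evicted from the (default 10-entry) history before we polled it.
            return "NotFound"
        response.raise_for_status()
        return (await response.json())["State"]
```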

  • after async rest
  • 10 max job history (default)
  • 50 concurrent jobs
  • 50 dicom threads
  • 200 http threads
  • 200 messages in flight
  • 3 m/s from queue

Not a massive increase; will drop the number of threads and concurrent jobs.

  • before async rest
  • 10 max job history (default)
  • 10 concurrent jobs
  • 10 dicom threads
  • 100 http threads
  • 5 messages in flight
  • rate 0.6 then 1 message/second

Initial rate of processing. Discovered that it was hanging a fair bit on REST calls, so made them async using aiohttp.


@stefpiatek stefpiatek requested review from milanmlft and a team April 12, 2024 08:36
@stefpiatek stefpiatek linked an issue Apr 12, 2024 that may be closed by this pull request
Member
@milanmlft milanmlft left a comment

🤘

@stefpiatek stefpiatek merged commit e22f855 into main Apr 12, 2024
@stefpiatek stefpiatek deleted the stef/async-consuming branch April 12, 2024 15:02
Development

Successfully merging this pull request may close these issues.

Allow async consumption without blocking main thread