8000 SEAB-7089: Fix slow-loading user profile "Activity" section by svonworl · Pull Request #6098 · dockstore/dockstore · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

SEAB-7089: Fix slow-loading user profile "Activity" section #6098

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 17, 2025

Conversation

svonworl
Copy link
Contributor
@svonworl svonworl commented Apr 14, 2025

Description
This PR changes the performance profile of the event retrieval endpoints, thus fixing the very slow load times of the "Activity" sections of certain user profile pages.

To retrieve events, the webservice synthesizes a CriteriaQuery, which, previously, results in a massive join:

select e1_0.id,e1_0.apptoolId,e1_0.collectionId,e1_0.dbCreateDate,e1_0.dbUpdateDate,e1_0.initiatorUserId,e1_0.notebookId,e1_0.organizationId,e1_0.serviceId,e1_0.toolId,e1_0.type,e1_0.userId,e1_0.versionId,v1_0.id,v1_0.clazz_,v1_0.commitID,v1_0.dbCreateDate,v1_0.dbUpdateDate,v1_0.dirtyBit,v1_0.frozen,v1_0.name,v1_0.parentid,v1_0.readMePath,v1_0.reference,v1_0.referenceType,v1_0.userFiles,v1_0.valid,v1_0.versioneditor_id,v1_0.automated,v1_0.cwlPath,v1_0.dockerfilePath,v1_0.imageId,v1_0.lastBuilt,v1_0.size,v1_0.wdlPath,v1_0.dagJson,v1_0.isLegacyVersion,v1_0.kernelImagePath,v1_0.lastModified,v1_0.synced,v1_0.toolTableJson,v1_0.workflowPath,e1_0.workflowId from event e1_0 left join Tool t1_0 on t1_0.id=e1_0.toolId left join workflow w1_0 on w1_0.id=e1_0.workflowId left join apptool a1_0 on a1_0.id=e1_0.apptoolId left join service s1_0 on s1_0.id=e1_0.serviceId left join notebook n1_0 on n1_0.id=e1_0.notebookId left join organization o1_0 on o1_0.id=e1_0.organizationId left join ( select id, commitID, dbCreateDate, dbUpdateDate, dirtyBit, frozen, name, readMePath, reference, referenceType, userFiles, valid, parentid, versioneditor_id, automated, cwlPath, dockerfilePath, imageId, lastBuilt, size, wdlPath, null::text as dagJson, null::boolean as isLegacyVersion, null::text as kernelImagePath, null::timestamp as lastModified, null::boolean as synced, null::text as toolTableJson, null::text as workflowPath, 1 as clazz_ from tag union all select id, commitID, dbCreateDate, dbUpdateDate, dirtyBit, frozen, name, readMePath, reference, referenceType, userFiles, valid, parentid, versioneditor_id, null::boolean as automated, null::text as cwlPath, null::text as dockerfilePath, null::text as imageId, null::timestamp as lastBuilt, null::bigint as size, null::text as wdlPath, dagJson, isLegacyVersion, kernelImagePath, lastModified, synced, toolTableJson, workflowPath, 2 as clazz_ from workflowversion ) v1_0 on v1_0.id=e1_0.versionId where e1_0.initiatorUserId in(?) and (e1_0.toolId is null and e1_0.workflowId is null and e1_0.apptoolId is null and e1_0.serviceId is null and e1_0.notebookId is null or t1_0.isPublished or w1_0.isPublished or a1_0.isPublished or s1_0.isPublished or n1_0.isPublished or e1_0.type in(?,?,?,?,?,?,?)) and (not(o1_0.categorizer) or e1_0.organizationId is null) order by e1_0.id desc offset ? rows fetch first ? rows only

The join "pulls in" lots of seemingly-unnecessary information, like versions, because, although it isn't being used to determine which Events to select, it's needed to construct fully-formed Events from the results.

At some point between 15/Oct/2024 and 2/Nov/2024, the database changed so that, for users with lots of events, this join performs poorly and induces the observed delay. That is, on the 15/Oct db, the query is fast, and on the 2/Nov db, it is very slow. I confirmed this behavior by loading prod db dumps onto my local 1.17 build.

I'm not sure of the root cause. Could be that the involved data now exceeds some sort of internal threshold/limit in the db server, causing the query planner to do a worse job, or inducing the server to use a different internal data structure that doesn't perform as well. Or maybe something else. IMHO, it's potentially not the best use of our time to investigate more deeply, could take a while and might not reveal anything actionable...

This PR changes the query to retrieve a list of Event IDs, simplifying the above join to:

select e1_0.id from event e1_0 left join Tool t1_0 on t1_0.id=e1_0.toolId left join workflow w1_0 on w1_0.id=e1_0.workflowId left join apptool a1_0 on a1_0.id=e1_0.apptoolId left join service s1_0 on s1_0.id=e1_0.serviceId left join notebook n1_0 on n1_0.id=e1_0.notebookId left join organization o1_0 on o1_0.id=e1_0.organizationId where e1_0.initiatorUserId in(?) and (e1_0.toolId is null and e1_0.workflowId is null and e1_0.apptoolId is null and e1_0.serviceId is null and e1_0.notebookId is null or t1_0.isPublished or w1_0.isPublished or a1_0.isPublished or s1_0.isPublished or n1_0.isPublished or e1_0.type in(?,?,?,?,?,?,?)) and (not(o1_0.categorizer) or e1_0.organizationId is null) order by 1 desc offset ? rows fetch first ? rows only

The resulting list of Event IDs is then converted to a list of Events in a subsequent query.

For users with many Events, the new scheme is much much faster. However, it's unchanged to negligibly slower for users that have few events. Wall clock time for http://localhost:4200/api/events/user-id?eventSearchType=PROFILE with this PR running locally, for different users:

denis-yuen (id 3): 0.45s
kathy-t (id 36641): 0.30s
bvizzier (id 56241): 0.21s
svonworl (id 62319): 0.30s

Same test, running on local develop build:

denis-yuen (id 3): 11.88s
kathy-t (id 36641): 9.25s
bvizzier (id 56241): 0.19s
svonworl (id 62319): 0.34s

Review Instructions
Confirm that the "Activity" section of the user profile pages is responding fast enough, and contains the correct information, sorted in the correct order. Make sure to check when you are both logged-in and logged-out. Suggested urls: https://qa.dockstore.org/users/denis-yuen etc

Issue
https://ucsc-cgl.atlassian.net/browse/SEAB-7089

Security and Privacy

If there are any concerns that require extra attention from the security team, highlight them here and check the box when complete.

  • Security and Privacy assessed

e.g. Does this change...

  • Any user data we collect, or data location?
  • Access control, authentication or authorization?
  • Encryption features?

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Check that you pass the basic style checks and unit tests by running mvn clean install
  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.
  • Follow the existing JPA patterns for queries, using named parameters, to avoid SQL injection
  • If you are changing dependencies, check the Snyk status check or the dashboard to ensure you are not introducing new high/critical vulnerabilities
  • Assume that inputs to the API can be malicious, and sanitize and/or check for Denial of Service type values, e.g., massive sizes
  • Do not serve user-uploaded binary images through the Dockstore API
  • Ensure that endpoints that only allow privileged access enforce that with the @RolesAllowed annotation
  • Do not create cookies, although this may change in the future
  • If this PR is for a user-facing feature, create and link a documentation ticket for this feature (usually in the same milestone as the linked issue). Style points if you create a documentation PR directly and link that instead.

@svonworl svonworl self-assigned this Apr 14, 2025
Copy link
codecov bot commented Apr 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.25%. Comparing base (3f6e5c7) to head (128d25d).

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #6098   +/-   ##
==========================================
  Coverage      74.24%   74.25%           
- Complexity      5663     5664    +1     
==========================================
  Files            389      389           
  Lines          20343    20346    +3     
  Branches        2103     2103           
==========================================
+ Hits           15104    15107    +3     
  Misses          4237     4237           
  Partials        1002     1002           
Flag Coverage Δ
bitbuckettests 26.02% <0.00%> (-0.01%) ⬇️
hoverflytests 27.65% <0.00%> (-0.01%) ⬇️
integrationtests 56.07% <100.00%> (+<0.01%) ⬆️
languageparsingtests 10.81% <0.00%> (-0.01%) ⬇️
localstacktests 21.32% <0.00%> (-0.01%) ⬇️
toolintegrationtests 29.89% <100.00%> (+0.01%) ⬆️
unit-tests_and_non-confidential-tests 26.27% <0.00%> (-0.01%) ⬇️
workflowintegrationtests 37.50% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member
@denis-yuen denis-yuen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!
Good find and puzzling out what happened. Seems like a reasonable explanation.

@denis-yuen
Copy link
Member

(retriggered build now that artifactory is back up)

@svonworl svonworl merged commit d2acb4e into develop Apr 17, 2025
24 checks passed
@svonworl svonworl deleted the feature/seab-7089/fix-slow-loading-user-profile branch April 17, 2025 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0