8000 Add endpoint to upload AI topic & increase max topic length by kathy-t · Pull Request #5847 · dockstore/dockstore · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add endpoint to upload AI topic & increase max topic length #5847

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 28, 2024

Conversation

kathy-t
Copy link
Contributor
@kathy-t kathy-t commented Mar 21, 2024

Description
Related PRs:

This PR adds an extended TRS endpoint that uploads an AI topic for the entry identified by the TRS ID. If the manual and automatic topics are empty, then the topic selection is automatically set to AI. This new endpoint requires an admin or curator token.

I also removed for end users to update the topicAI field using the update endpoint since AI topics are updated by the topic generator.

This PR also increases the maximum topic length from 150 to 250 characters because there were AI generated topics that exceeded 150 characters. I don't see any harm in doing this because the UI truncates the topics if they exceed 2 lines on the search page. I created a UI PR to update the maxLength allowed for the manual topic to 250 dockstore/dockstore-ui2#1950.

Review Instructions
See instructions in dockstore/dockstore-support#489

Issue
SEAB-6007

Security and Privacy

If there are any concerns that require extra attention from the security team, highlight them here and check the box when complete.

  • Security and Privacy assessed

e.g. Does this change...

  • Any user data we collect, or data location?
  • Access control, authentication or authorization?
  • Encryption features?

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Check that you pass the basic style checks and unit tests by running mvn clean install
  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.
  • Follow the existing JPA patterns for queries, using named parameters, to avoid SQL injection
  • If you are changing dependencies, check the Snyk status check or the dashboard to ensure you are not introducing new high/critical vulnerabilities
  • Assume that inputs to the API can be malicious, and sanitize and/or check for Denial of Service type values, e.g., massive sizes
  • Do not serve user-uploaded binary images through the Dockstore API
  • Ensure that endpoints that only allow privileged access enforce that with the @RolesAllowed annotation
  • Do not create cookies, although this may change in the future
  • If this PR is for a user-facing feature, create and link a documentation ticket for this feature (usually in the same milestone as the linked issue). Style points if you create a documentation PR directly and link that instead.
8000

@kathy-t kathy-t self-assigned this Mar 21, 2024
Copy link
codecov bot commented Mar 21, 2024

Codecov Report

Attention: Patch coverage is 89.28571% with 3 lines in your changes are missing coverage. Please review.

Project coverage is 73.33%. Comparing base (eed5840) to head (3c38c79).
Report is 1 commits behind head on develop.

Files Patch % Lines
...ces/proposedGA4GH/ToolsApiExtendedServiceImpl.java 82.35% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             develop    #5847      +/-   ##
=============================================
- Coverage      74.48%   73.33%   -1.15%     
+ Complexity      5247     5171      -76     
=============================================
  Files            366      367       +1     
  Lines          18974    18986      +12     
  Branches        2020     2016       -4     
=============================================
- Hits           14132    13924     -208     
- Misses          3886     4074     +188     
- Partials         956      988      +32     
Flag Coverage Δ
bitbuckettests 27.08% <10.71%> (-0.02%) ⬇️
integrationtests 58.44% <71.42%> (+0.03%) ⬆️
languageparsingtests 10.97% <7.14%> (-0.01%) ⬇️
localstacktests 21.57% <35.71%> (+<0.01%) ⬆️
toolintegrationtests 29.82% <10.71%> (-0.66%) ⬇️
unit-tests_and_non-confidential-tests 20.45% <0.00%> (-8.48%) ⬇️
workflowintegrationtests 38.70% <10.71%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kathy-t
Copy link
Contributor Author
kathy-t commented Mar 22, 2024

Belated thought: should the endpoint automatically set the TopicSelection as AI if none of the other topics are filled in?

@denis-yuen
Copy link
Member

Belated thought: should the endpoint automatically set the TopicSelection as AI if none of the other topics are filled in?

Makes sense, I guess they wouldn't be visible otherwise unless we did something tricky with elastic

Copy link
Member
@denis-yuen denis-yuen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On passing build

Copy link

@kathy-t
Copy link
Contributor Author
kathy-t commented Mar 25, 2024

Updated the PR so that if there is no manual or automatic topic, the topic selection is automatically set to AI.

Furthermore, I modified the updateWorkflow and updateContainer endpoint so that it does not update the topicAI field since AI topics are updated by the topic generator, not by end users.

@svonworl
Copy link
Contributor
svonworl commented Mar 26, 2024

Updated the PR so that if there is no manual or automatic topic, the topic selection is automatically set to AI.

Does this mean that in those cases, the AI topic automatically appears as the topic in search and/or on the entry page, without entry owner approval?

@kathy-t
Copy link
Contributor Author
kathy-t commented Mar 26, 2024

Updated the PR so that if there is no manual or automatic topic, the topic selection is automatically set to AI.

Does this mean that in those cases, the AI topic automatically appears in search and/or on the entry page, without entry owner approval?

Yeah, that's correct, but it should only be for published entries. Are you thinking it should be opt-in instead of opt-out?

@svonworl
Copy link
Contributor

Updated the PR so that if there is no manual or automatic topic, the topic selection is automatically set to AI.

Does this mean that in those cases, the AI topic automatically appears in search and/or on the entry page, without entry owner approval?

Yeah, that's correct, but it should only be for published entries. Are you thinking it should be opt-in instead of opt-out?

Mixed feelings, automatic inclusion is good for usability, but I feel like there's the possibility that owners might not always approve. Acceptable risk? Not sure.

@kathy-t kathy-t requested a review from denis-yuen March 26, 2024 19:49
@denis-yuen
Copy link
Member

Mixed feelings, automatic inclusion is good for usability, but I feel like there's the possibility that owners might not always approve. Acceptable risk? Not sure.

Well, this was the intent, to have the topic sentences show up as a last resort but be labelled as AI
dockstore/dockstore-ui2#1939

Many workflows should have a topic one way or another, harvested from the github repo, from the .dockstore.yml, or from the workflow descriptor itself

public class UpdateAITopicRequest {

@NotNull
@Schema(description = "The AI topic", requiredMode = RequiredMode.REQUIRED)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the maxLength be set, which would cause requests with longer than TOPIC_LENGTH to fail, or do we want it to get silently truncated, when you invoke abbreviateTopic in Entry.java?

Also, WorkflowResource.updateWorkflow(), which takes a workflow as a parameter, will I believe enforce a TOPIC_LIMIT by not deserializing a workflow with a greater than TOPIC_LENGTH ai topic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with the silent truncation to match the behaviour of topicAutomatic, which silently truncates topics greater than TOPIC_LENGTH. I'll add a description saying that topics > TOPIC_LENGTH are truncated.

Also, WorkflowResource.updateWorkflow(), which takes a workflow as a parameter, will I believe enforce a TOPIC_LIMIT by not deserializing a workflow with a greater than TOPIC_LENGTH ai topic.

That's true if maxLength is set, so perhaps it's another reason why we shouldn't set it since the AI topic can't be updated with WorkflowResource.updateWorkflow() anymore

@kathy-t kathy-t merged commit 944ddfe into develop Mar 28, 2024
@kathy-t kathy-t deleted the feature/seab-6007/upload-ai-topics branch March 28, 2024 12:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0