Shanbady/summary flashcard sync by shanbady · Pull Request #2339 · mitodl/mit-learn · GitHub

Shanbady/summary flashcard sync #2339


Open: wants to merge 6 commits into base: main


@shanbady (Contributor) commented Jul 3, 2025

What are the relevant tickets?

Closes https://github.com/mitodl/hq/issues/7714

Description (What does it do?)

This PR ensures that existing summaries and flashcards stay in sync when their content is updated.
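As a rough illustration of the intended behavior (not the code in this PR), the sketch below refreshes a summary only when the content changed and a summary already existed. `FileRecord`, `sync_summary`, and `fake_summarize` are hypothetical names made up for this example; the real logic lives in the ETL and plugin hooks and may differ.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FileRecord:
    """Hypothetical stand-in for a content file; not the real ContentFile model."""
    content: str
    summary: str = ""


def sync_summary(record: FileRecord, new_content: str,
                 summarize: Callable[[str], str]) -> bool:
    """Store new content and refresh an existing summary.

    Returns True only when the summary was regenerated.
    """
    changed = new_content != record.content
    record.content = new_content
    # Assumption: only records that already have a summary are refreshed.
    if changed and record.summary:
        record.summary = summarize(new_content)
        return True
    return False


def fake_summarize(text: str) -> str:
    # Placeholder for the real summarizer (an LLM call in practice).
    return f"summary of: {text[:20]}"
```

Passing the summarizer in as a callable keeps the sketch testable without an API key.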

How can this be tested?

  1. Check out this branch.
  2. Make sure QDRANT_ENABLE_INDEXING_PLUGIN_HOOKS is set to True in settings.
  3. Make sure you also have an OpenAI API key configured.
  4. Configure a summarizer for an ETL source locally (let's pick mitxonline). You can reference the summarizer currently configured on production here: https://api.learn.mit.edu/admin/learning_resources/contentsummarizerconfiguration/1/change/
  5. Pull down contentfiles: python manage.py backpopulate_mitxonline_files
  6. Note that we don't have summaries for any contentfiles yet:
from learning_resources.models import ContentFile
# Should return an empty QuerySet: no contentfile has a summary yet
ContentFile.objects.exclude(summary="")
  7. Manually set the "summary" field for a contentfile to "test". Also change the checksum for the associated run and contentfile so that the next import treats them as updated.
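from learning_resources.models import ContentFile

# Pick any .srt contentfile, plus its parent run and learning resource
cf = ContentFile.objects.filter(file_extension=".srt").first()
run = cf.run
lr = run.learning_resource

# Fake an existing summary and change both checksums to force an update
cf.summary = "test"
cf.checksum = "test"
run.checksum = "test"
cf.save()
run.save()
print(f"resource id - {lr.id}")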
  8. Pull down contentfiles for that particular learning resource again: python manage.py backpopulate_mitxonline_files --resource-ids {resourceid}
  9. Once complete, you should see a generated summary populated for just that one contentfile.
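Why overwriting the checksum with "test" forces a re-import can be sketched as follows. This is an assumption-laden illustration: `content_checksum` and `needs_update` are hypothetical helpers, and SHA-256 is chosen only for the example, since the hash actually used by the ETL is not stated here.

```python
import hashlib


def content_checksum(content: bytes) -> str:
    # Assumption: a hex digest of the raw bytes; the real ETL may hash differently.
    return hashlib.sha256(content).hexdigest()


def needs_update(stored_checksum: str, content: bytes) -> bool:
    """Return True when the stored checksum no longer matches the content."""
    return stored_checksum != content_checksum(content)
```

Because "test" can never equal a real digest, a record edited this way always looks stale on the next backpopulate run.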

Note: Don't forget to unset your OpenAI API key once you are done testing.

Additional Context

On live systems, the ETL plugin hooks call generate_embeddings with the overwrite flag, which is the code path being tested here.
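The effect of an overwrite flag can be pictured with the minimal sketch below. It does not reproduce the real generate_embeddings signature, which is not shown in this description; `select_ids_to_embed` and its parameters are hypothetical.

```python
from typing import Iterable, List, Set


def select_ids_to_embed(candidate_ids: Iterable[int],
                        already_embedded: Set[int],
                        overwrite: bool = False) -> List[int]:
    """Choose which items to (re)process.

    With overwrite=True every candidate is reprocessed; otherwise
    items that already have embeddings are skipped.
    """
    if overwrite:
        return list(candidate_ids)
    return [i for i in candidate_ids if i not in already_embedded]
```

Under this assumption, calling with overwrite enabled is what lets an item that already has derived data be regenerated after its content changes.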

@shanbady marked this pull request as ready for review July 3, 2025 14:57
@shanbady added the Needs Review and Work in Progress labels and removed the Needs Review label Jul 3, 2025
@shanbady marked this pull request as draft July 3, 2025 15:33
@shanbady added the Needs Review label and removed the Work in Progress label Jul 10, 2025
@shanbady marked this pull request as ready for review July 10, 2025 15:23
Labels
Needs Review An open Pull Request that is ready for review