Update Unstructured Provider, Fix Type Error in SDK #2103
❌ Changes requested. Reviewed everything up to dbc0495 in 2 minutes and 49 seconds
More details
- Looked at 122 lines of code in 7 files
- Skipped 0 files when reviewing.
- Skipped posting 15 drafted comments based on config settings.
1. docker/compose.full.swarm.yaml:388
- Draft comment: Image updated to 1.0.3. Verify that the new version is backwards compatible.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
2. docker/compose.full.yaml:196
- Draft comment: Image updated to 1.0.3; ensure environment consistency with other compose files.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
3. docker/compose.yaml:52
- Draft comment: r2r-dashboard image updated to 1.0.3. This should align with other configs.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
4. py/core/providers/ingestion/r2r/base.py:259
- Draft comment: Remove commented-out hardcoded parser override code if no longer needed.
- Reason this comment was not posted: Comment looked like it was already resolved.
5. py/pyproject.toml:7
- Draft comment: Version bumped from 3.5.7 to 3.5.8; ensure changelog is updated accordingly.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
6. py/sdk/sync_methods/documents.py:65
- Draft comment: Updated type hint for ingestion_mode to allow IngestionMode | str. Verify associated documentation matches this change.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment asks the PR author to verify that the documentation matches the code change. This falls under the rule of not asking the author to ensure or verify things. The comment does not provide a specific suggestion or point out a potential issue with the code itself.
7. docker/compose.full.swarm.yaml:388
- Draft comment: Updated r2r-dashboard image to version 1.0.3. Confirm that the new version is compatible with dependent services.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
8. docker/compose.full.yaml:193
- Draft comment: Ensure consistent update of r2r-dashboard image to version 1.0.3 across all compose configurations.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
9. docker/compose.yaml:50
- Draft comment: r2r-dashboard image updated to version 1.0.3 here as well. Verify consistency across environments.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
10. py/core/providers/ingestion/r2r/base.py:259
- Draft comment: Removed hardcoded parser override check. Ensure that fallback processing for 'zerox' PDF override is intentionally relaxed and consistent with other providers.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
11. py/core/providers/ingestion/unstructured/base.py:293
- Draft comment: Added explicit handling for parser_override values 'zerox' and 'ocr'. Consider adding an 'else' branch to log or handle unexpected override values.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
12. py/pyproject.toml:7
- Draft comment: Version bumped to 3.5.8. Ensure that the changelog and release notes are updated appropriately.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
13. py/sdk/sync_methods/documents.py:65
- Draft comment: Updated 'ingestion_mode' type annotation to accept both IngestionMode and str to fix type errors. Verify that downstream usage correctly handles both types.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
14. py/core/providers/ingestion/r2r/base.py:222
- Draft comment: Typographical error: Consider renaming 'text_spliiter' to 'text_splitter' for clarity and consistency.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
15. py/core/providers/ingestion/unstructured/base.py:416
- Draft comment: Typographical error: The word 'inadvertedly' in the TODO comment should be corrected to 'inadvertently'.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_ZsHvBUDRFqYo7csL
parser_name=f"zerox_{DocumentType.PDF.value}",
):
elements.append(element)
if parser_overrides[document.document_type.value] == "zerox":
Consider consolidating the duplicated logic in the added OCR override branch, and make sure the warning message is clear.
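One way to act on this suggestion, sketched under assumptions: map each override value to a parser name in a single lookup table, and log a warning for anything unrecognized instead of repeating an `if` branch per value. The names `OVERRIDE_PARSERS`, `resolve_parser_name`, and the `ocr_pdf` parser name are hypothetical and not taken from the PR; only the `zerox_` prefix appears in the diff.

```python
import logging
from typing import Mapping, Optional

logger = logging.getLogger(__name__)

# Hypothetical registry: override value -> parser name (illustrative only).
OVERRIDE_PARSERS = {
    "zerox": "zerox_pdf",
    "ocr": "ocr_pdf",
}


def resolve_parser_name(
    parser_overrides: Mapping[str, str], document_type: str
) -> Optional[str]:
    """Return the parser name for an override, or None to use the default parser."""
    override = parser_overrides.get(document_type)
    if override is None:
        # No override requested for this document type.
        return None
    parser_name = OVERRIDE_PARSERS.get(override)
    if parser_name is None:
        # Single place to report unexpected override values.
        logger.warning(
            "Unknown parser override %r for document type %r; "
            "falling back to the default parser.",
            override,
            document_type,
        )
    return parser_name
```

With this shape, supporting another override value would mean adding one table entry, and the warning text lives in one place.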
Important
Update r2r-dashboard image version, add OCR parser support, and fix type error in SDK.
- Update r2r-dashboard image version to 1.0.3 in compose.full.swarm.yaml, compose.full.yaml, and compose.yaml.
- Remove hardcoded parser override check from R2RIngestionProvider in r2r/base.py.
- Add ocr parser override in UnstructuredIngestionProvider in unstructured/base.py.
- Update type hint for ingestion_mode in create() method in documents.py to accept IngestionMode | str.
- Bump version to 3.5.8 in pyproject.toml.
This description was created by Ellipsis for dbc0495. It will automatically update as commits are pushed.
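To illustrate the `ingestion_mode` change, here is a minimal sketch of how a parameter typed `IngestionMode | str` might be normalized before use. The enum below is a stand-in with assumed members, and `normalize_ingestion_mode` is a hypothetical helper; neither is taken from the SDK source.

```python
from enum import Enum
from typing import Union


class IngestionMode(str, Enum):
    # Stand-in enum; member names and values are assumptions for illustration.
    hi_res = "hi-res"
    fast = "fast"
    custom = "custom"


def normalize_ingestion_mode(mode: Union[IngestionMode, str]) -> str:
    """Accept either an enum member or its string value and return the string."""
    if isinstance(mode, IngestionMode):
        return mode.value
    # Plain string: validate it against the enum; raises ValueError if unknown.
    return IngestionMode(mode).value
```

Checking for the enum first matters because `IngestionMode` subclasses `str`, so an `isinstance(mode, str)` test alone would match both cases.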