Add torrent upload feature #1046
Conversation
Walkthrough
The pull request adds the torrent upload feature.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Endpoint as upload_torrent_file
    participant Bencoder as bencodepy
    participant Processor as _process_infohash
    Client->>Endpoint: POST /scrape/upload_torrent with torrent file
    Endpoint->>Endpoint: Validate file extension (.torrent)
    Endpoint->>Bencoder: Decode torrent file contents
    Bencoder-->>Endpoint: Return extracted infohash
    Endpoint->>Processor: Process infohash with item ID & source description
    Processor-->>Endpoint: Return session response
    Endpoint-->>Client: 200 OK with session data
```
Actionable comments posted: 0
🔭 Outside diff range comments (1)
src/routers/secure/items.py (1)
611-627: ⚠️ Potential issue — The torrent upload endpoint is incomplete.
While the endpoint successfully extracts the infohash from the uploaded torrent file, it doesn't actually do anything with it. The commented code suggests an intention to process the infohash using an existing workflow, but this implementation is missing.
Additionally, there are two issues flagged by static analysis:
- Don't use `File(...)` in parameter defaults; instead, define it within the function body.
- Within the `except` clause, use `from e` to preserve the exception chain.
Apply this fix:
```diff
-async def upload_torrent(request: Request, file: UploadFile = File(...)) -> MessageResponse:
+async def upload_torrent(request: Request, file: UploadFile) -> MessageResponse:
     try:
         torrent_data = await file.read()
         infohash = parse_torrent_file(torrent_data)
-        # Use the existing workflow for processing the extracted infohash
-        # Assuming there is a function to handle the infohash
-        # process_infohash(infohash)
+        # TODO: Implement the actual processing of the infohash
+        # Consider using the same workflow as other endpoints that handle infohashes
         return {"message": f"Successfully uploaded and processed .torrent file with infohash {infohash}"}
     except Exception as e:
-        raise HTTPException(status_code=400, detail=f"Failed to process .torrent file: {str(e)}")
+        raise HTTPException(status_code=400, detail=f"Failed to process .torrent file: {str(e)}") from e
```
Also consider adding validation to ensure the uploaded file is actually a torrent file:
```python
if not file.filename.endswith('.torrent'):
    raise HTTPException(status_code=400, detail="Uploaded file must be a .torrent file")
```
🧰 Tools
🪛 Ruff (0.8.2)
617-617: Do not perform function call `File` in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
626-626: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🧹 Nitpick comments (3)
src/tests/test_torrent_upload.py (3)
1-1: Remove unused import
The `pytest` import is not being used in this file.
```diff
-import pytest
```
🧰 Tools
🪛 Ruff (0.8.2)
1-1: `pytest` imported but unused. Remove unused import: `pytest` (F401)
7-14: Test ensures valid torrent uploads work correctly
The test properly validates that the torrent upload endpoint accepts valid torrent files and returns a 200 status code with a success message.
Consider using path variables or fixtures for test file paths to make the tests more robust:
```diff
-    with open("tests/test.torrent", "rb") as torrent_file:
+    # Use pathlib for more reliable file paths
+    from pathlib import Path
+    test_file = Path(__file__).parent / "test.torrent"
+    with open(test_file, "rb") as torrent_file:
```
16-23: Test ensures invalid torrent uploads are rejected
This test correctly verifies that invalid torrent files are rejected with a 400 status code and an appropriate error message.
Apply the same path improvement as suggested for the valid torrent test:
```diff
-    with open("tests/invalid.torrent", "rb") as torrent_file:
+    # Use pathlib for more reliable file paths
+    from pathlib import Path
+    invalid_test_file = Path(__file__).parent / "invalid.torrent"
+    with open(invalid_test_file, "rb") as torrent_file:
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- src/program/services/downloaders/alldebrid.py (3 hunks)
- src/program/services/downloaders/models.py (1 hunks)
- src/program/services/downloaders/realdebrid.py (2 hunks)
- src/program/services/downloaders/shared.py (2 hunks)
- src/routers/secure/items.py (3 hunks)
- src/tests/test_torrent_upload.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
src/program/services/downloaders/realdebrid.py (2)
- src/program/services/downloaders/shared.py (3): DownloaderBase (17-93), parse_torrent_file (121-126), add_torrent (46-59)
- src/program/services/downloaders/alldebrid.py (2): add_torrent (166-199), execute (39-43)

src/program/services/downloaders/alldebrid.py (1)
- src/program/services/downloaders/shared.py (2): parse_torrent_file (121-126), add_torrent (46-59)
🪛 Ruff (0.8.2)
src/program/services/downloaders/shared.py
- 125-125: Undefined name `encode` (F821)
src/tests/test_torrent_upload.py
- 1-1: `pytest` imported but unused. Remove unused import: `pytest` (F401)
src/routers/secure/items.py
- 617-617: Do not perform function call `File` in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
- 626-626: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🔇 Additional comments (10)
src/program/services/downloaders/models.py (1)
164-169: LGTM - UploadedTorrent model looks well designed
The new `UploadedTorrent` class follows good design practices with appropriate attributes and types. The use of `Field(default_factory=datetime.utcnow)` for the `uploaded_at` field is correct, as it properly creates a new timestamp for each instance.
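For context, a minimal sketch of what a model like this could look like is below; apart from `uploaded_at` and the `default_factory` usage noted above, the field names are illustrative assumptions rather than the PR's actual definition.

```python
# Illustrative sketch of an UploadedTorrent-style Pydantic model.
# Only `uploaded_at` and its default_factory are taken from the review above;
# the other field names are assumptions.
from datetime import datetime

from pydantic import BaseModel, Field


class UploadedTorrent(BaseModel):
    infohash: str   # assumed: SHA-1 infohash extracted from the .torrent file
    filename: str   # assumed: original uploaded filename
    # default_factory gives each instance its own timestamp at creation time
    uploaded_at: datetime = Field(default_factory=datetime.utcnow)
```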
src/program/services/downloaders/shared.py (2)
4-5: LGTM - Appropriate imports for torrent parsing
The added imports support the new torrent parsing functionality.
121-126: ✅ Verification successful
Fix missing `encode` function
The `encode` function is used on line 125 but is not defined or imported. This will cause a runtime error.
```diff
-from bencodepy import decode
+from bencodepy import decode, encode

-    infohash = hashlib.sha1(encode(info)).hexdigest()
+    # Make sure to import encode from bencodepy or implement the encoding here
+    infohash = hashlib.sha1(encode(info)).hexdigest()
```
🏁 Script executed:
```bash
#!/bin/bash
# Check if bencodepy provides an encode function
pip show bencodepy
pip install bencodepy
python -c "from bencodepy import encode; print('encode function exists')" || echo "encode function not found"
```
Length of output: 986
Import `encode` from bencodepy to prevent a runtime error
The verification confirmed that the installed version of bencodepy (v0.9.5) exports an `encode` function. To avoid issues when calling `encode(info)` in this file, please update the import as follows:
- Change: `from bencodepy import decode`
- To: `from bencodepy import decode, encode`
Also, consider adding an inline comment in the function noting that `encode` is imported from bencodepy. This update will resolve the issue of the missing function.
🧰 Tools
🪛 Ruff (0.8.2)
125-125: Undefined name `encode` (F821)
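For reference, a minimal self-contained sketch of the infohash-extraction helper being discussed; it assumes bencodepy's `decode`/`encode` functions (confirmed above) and is illustrative rather than the PR's exact body.

```python
# Minimal sketch of a parse_torrent_file-style helper using bencodepy.
# Assumes bencodepy exposes decode/encode (verified in the review above).
import hashlib

from bencodepy import decode, encode


def parse_torrent_file(torrent_data: bytes) -> str:
    """Return the SHA-1 infohash of a .torrent file's bencoded info dictionary."""
    torrent_dict = decode(torrent_data)    # top-level bencoded dictionary
    info = torrent_dict[b"info"]           # the "info" dict is what defines the infohash
    return hashlib.sha1(encode(info)).hexdigest()  # hash of the re-bencoded info dict
```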
src/program/services/downloaders/alldebrid.py (3)
25-25: LGTM - Correctly imports the new parse_torrent_file function
The new import gives this class access to the torrent parsing functionality.
166-170: LGTM - Updated method signature allows torrent file uploads
The function signature has been updated to accept an optional torrent_data parameter, and the docstring clearly explains the new functionality.
175-187: LGTM - Implementation properly handles both infohash and torrent file uploads
The code now intelligently handles both upload methods (see the sketch below):
- If torrent_data is provided, it parses the torrent file and uses the POST endpoint
- Otherwise, it uses the existing GET endpoint with the infohash
This maintains backward compatibility while adding the new feature.
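A structural sketch of the dual-path logic described above follows; the endpoint names, `HttpMethod`, the `api` client object, and the response shape are assumptions for illustration, not the PR's verified AllDebrid code.

```python
# Structural sketch of the dual-path add_torrent logic described above.
# Endpoint names, HttpMethod, the api client, and the response shape are
# assumptions for illustration only; parse_torrent_file is the PR's shared helper.
from typing import Optional


def add_torrent(api, infohash: str, torrent_data: Optional[bytes] = None) -> str:
    """Add a torrent from raw .torrent bytes (POST) or from an infohash (GET)."""
    if torrent_data:
        # .torrent upload path: re-derive the infohash and POST the file itself
        infohash = parse_torrent_file(torrent_data)
        response = api.request_handler.execute(
            HttpMethod.POST,
            "magnet/upload/file",  # assumed endpoint name
            files={"files[]": ("upload.torrent", torrent_data, "application/x-bittorrent")},
        )
    else:
        # existing path: submit a magnet link built from the infohash
        response = api.request_handler.execute(
            HttpMethod.GET,
            "magnet/upload",  # assumed endpoint name
            params={"magnets[]": f"magnet:?xt=urn:btih:{infohash}"},
        )
    return response["id"]  # assumed response shape
```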
src/routers/secure/items.py (2)
7-7: Imports updated correctly for new functionality.
The necessary FastAPI imports for handling file uploads (`UploadFile` and `File`) have been added properly.
24-24: Import for torrent parsing function added correctly.
The `parse_torrent_file` function is properly imported from the shared module.
src/program/services/downloaders/realdebrid.py (2)
24-24: Import updated correctly.
The import statement now correctly includes `parse_torrent_file` from the shared module.
200-216: ✅ Verification successful
Implementation looks good but has a potential issue in error handling.
The method has been correctly updated to handle both infohash and torrent file uploads. However, there's a potential issue in the error handling code at line 230, where `magnet` might not be defined if `torrent_data` is provided.
Apply this fix to ensure the error is properly handled regardless of which path was taken:
```diff
 def add_torrent(self, infohash: str, torrent_data: Optional[bytes] = None) -> str:
     """Add a torrent by infohash or .torrent file"""
     try:
         if torrent_data:
             infohash = parse_torrent_file(torrent_data)
             response = self.api.request_handler.execute(
                 HttpMethod.POST,
                 "torrents/addTorrent",
                 files={"file": ("torrent", torrent_data, "application/x-bittorrent")}
             )
         else:
             magnet = f"magnet:?xt=urn:btih:{infohash}"
             response = self.api.request_handler.execute(
                 HttpMethod.POST,
                 "torrents/addMagnet",
                 data={"magnet": magnet.lower()}
             )
         return response["id"]
     except Exception as e:
         if len(e.args) > 0:
             if " 503 " in e.args[0]:
                 logger.debug(f"Failed to add torrent {infohash}: [503] Infringing Torrent or Service Unavailable")
                 raise RealDebridError("Infringing Torrent or Service Unavailable")
             elif " 429 " in e.args[0]:
                 logger.debug(f"Failed to add torrent {infohash}: [429] Rate Limit Exceeded")
                 raise RealDebridError("Rate Limit Exceeded")
             elif " 404 " in e.args[0]:
                 logger.debug(f"Failed to add torrent {infohash}: [404] Torrent Not Found or Service Unavailable")
                 raise RealDebridError("Torrent Not Found or Service Unavailable")
             elif " 400 " in e.args[0]:
-                logger.debug(f"Failed to add torrent {infohash}: [400] Torrent file is not valid. Magnet: {magnet}")
+                logger.debug(f"Failed to add torrent {infohash}: [400] Torrent file is not valid.")
                 raise RealDebridError("Torrent file is not valid")
```
Also, this method signature change could impact other parts of the codebase that call this method.
🏁 Script executed:
```bash
#!/bin/bash
# Find all instances of the add_torrent method being called to ensure they're compatible with the new signature
echo "Searching for calls to add_torrent method..."
rg "\.add_torrent\(" --type py | grep -v "def add_torrent"
```
Length of output: 645
Critical error handling issue resolved in `add_torrent`
The update correctly removes the undefined reference to `magnet` from the error log for the 400 error branch, ensuring that the exception message is built without relying on a variable that may not be defined when `torrent_data` is provided. Additionally, a search across the codebase confirms that all calls to `add_torrent` supply just the `infohash` (letting the optional `torrent_data` default to `None`), so no unintended breaking changes have been introduced.
- Affected file: src/program/services/downloaders/realdebrid.py (error handling update at ~line 230)
- Verified call sites:
  - src/routers/secure/scrape.py
  - src/program/services/downloaders/__init__.py
  - src/program/services/downloaders/alldebrid.py
89e9266 to 11379dd (Compare)
@Gaisberg I've changed the implementation to leverage existing logic. Couldn't test yet.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/routers/secure/scrape.py (1)
476-481: Move import statement to the top of the file.
The `from bencodepy import decode, encode` import should be moved to the top of the file with other imports. While placing it inside the try block does handle potential import errors, it's more conventional and cleaner to keep all imports together at the top.
```diff
 import hashlib
+from bencodepy import decode, encode
 from fastapi import APIRouter, BackgroundTasks, File, HTTPException, Request, UploadFile
 from loguru import logger
 ...
 # Inside the function
     try:
-        from bencodepy import decode, encode
         torrent_dict = decode(torrent_content)
         info = torrent_dict[b'info']
         info_hash = hashlib.sha1(encode(info)).hexdigest()
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
- pyproject.toml (2 hunks)
- src/routers/secure/scrape.py (4 hunks)
- src/tests/test_alldebrid_downloader.py (1 hunks)
- src/tests/test_scrape_endpoints.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
src/tests/test_alldebrid_downloader.py (1)
- src/program/services/downloaders/alldebrid.py (1): AllDebridDownloader (64-256)

src/routers/secure/scrape.py (1)
- src/program/services/indexers/trakt.py (2): TraktIndexer (14-122), run (55-89)
🪛 Ruff (0.8.2)
src/tests/test_scrape_endpoints.py
- 1-1: No such file or directory (os error 2) (E902)
🔇 Additional comments (13)
src/tests/test_alldebrid_downloader.py (2)
2-3: New imports for mock testing and file upload handling.
The added imports for `hashlib`, `MagicMock`, and `patch` will support hash generation and mocking in tests, which aligns with the torrent upload feature implementation.
6-7: FastAPI imports for file upload testing.
These imports are necessary for testing the new torrent upload functionality, allowing proper handling of test client operations and file upload testing.
pyproject.toml (2)
61-61: Added python-multipart to dev dependencies.
Including the same package in dev dependencies ensures it's available during development and testing. This is good practice for consistency between environments.
45-45: ✅ Verification successful
Added python-multipart dependency for file upload support.
The addition of `python-multipart` is necessary for handling uploaded torrent files in the new endpoint. This is a runtime dependency required by FastAPI for form data processing.
🌐 Web query:
Is version 0.0.20 of python-multipart the latest stable version?
💡 Result:
As of March 2025, version 0.0.20 of Python-Multipart is the latest stable release, with sources confirming its status across multiple platforms:
- PyPI listing (Dec 2024) explicitly shows 0.0.20 as the latest version[3][6]
- GitHub releases confirm version 0.0.20 (Dec 2024) as most recent, with active development continuing under Kludex's repository[6][9]
- Linux distributions package this version as current:
- Ubuntu's "Plucky Puffin" branch (active development) uses 0.0.20-1[1]
- Arch Linux lists 0.0.20-1 as latest in Extra repository[4]
- Debian's package tracker references 0.0.20-1 in main/sid[11]
Key Details
Release timeline from changelogs:
0.0.20 (2024-12-16) - Latest - Bugfix: Boundary handling improvements
0.0.19 (2024-11-30)
0.0.18 (2024-11-28)
...[9]
The package maintains 100% test coverage and follows semantic versioning with strict type checking since 0.0.12[9][10]. While another `multipart` package exists at version 1.2.1[5], this is a separate project with different maintainers.
Citations:
- 1: https://launchpad.net/ubuntu/+source/python-multipart
- 2: https://packages.guix.gnu.org/packages/python-multipart/0.0.20/
- 3: https://pypi.org/project/python-multipart/
- 4: https://archlinux.org/packages/extra/any/python-python-multipart/
- 5: https://pypi.org/project/multipart/
- 6: https://github.com/andrew-d/python-multipart/releases
- 7: https://multipart.readthedocs.io
- 8: https://devguide.python.org/versions/
- 9: https://github.com/Kludex/python-multipart/blob/master/CHANGELOG.md
- 10: https://github.com/Kludex/python-multipart
- 11: https://tracker.debian.org/python-multipart

Python-multipart dependency version verified
File: pyproject.toml, Line 45:
- The addition of `python-multipart = "^0.0.20"` is correct, as version 0.0.20 is confirmed to be the latest stable release.
- This dependency is essential for handling file uploads with FastAPI's form data processing.

src/tests/test_scrape_endpoints.py (5)
1-15: Good test setup with appropriate imports and client initialization.
The test file properly imports necessary modules and sets up the test client. This establishes the groundwork for testing the new torrent upload endpoint.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: No such file or directory (os error 2) (E902)
20-33: Well-structured fixture for mocking bencodepy and hash generation.
The `mock_bencodepy` fixture correctly mocks the bencodepy library functions and hashlib's SHA1 functionality, ensuring consistent test behavior with a predetermined hash value that matches existing test data (see the sketch after this section).
35-66: Comprehensive test service mocking.
The mock_services fixture properly sets up mocks for TraktIndexer, MediaItem, Downloader, and BackgroundTasks, along with resetting the session manager before each test. This ensures tests run in isolation and with predictable behavior.
67-102: Well-designed test for successful torrent file upload.
This test thoroughly validates the successful upload path, including:
- Creating a fake torrent file
- Mocking the necessary program services
- Verifying the response status, message, session ID, and torrent ID
- Confirming that _process_infohash was called with correct parameters
The test provides excellent coverage of the happy path flow.
103-129: Good test coverage for invalid file upload scenarios.
This test appropriately verifies that the endpoint properly rejects non-torrent files with a 400 status code and meaningful error message. The assertion that _process_infohash is not called ensures the validation is working as expected.

src/routers/secure/scrape.py (4)
6-6: Added hashlib import for torrent infohash calculation.
The hashlib library is required for computing SHA1 hashes from torrent file contents.
8-8: Updated imports to support file uploads.
Added File and UploadFile imports from FastAPI to handle uploaded torrent files.
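Relating to the `mock_bencodepy` fixture described above, a hedged sketch of how such a fixture might be written; the patch targets and the fixed hash value are assumptions, not the PR's actual test code.

```python
# Hedged sketch of a bencodepy/hashlib mocking fixture like the one described above.
# Patch targets and the fixed hash value are assumptions, not the PR's actual code.
from unittest.mock import MagicMock, patch

import pytest

FIXED_INFOHASH = "0" * 40  # assumed placeholder matching existing test data


@pytest.fixture
def mock_bencodepy():
    """Patch bencodepy decode/encode and hashlib.sha1 so tests see a stable infohash."""
    with patch("bencodepy.decode") as mock_decode, \
         patch("bencodepy.encode") as mock_encode, \
         patch("hashlib.sha1") as mock_sha1:
        mock_decode.return_value = {b"info": {b"name": b"example"}}
        mock_encode.return_value = b"bencoded-info"
        mock_sha1.return_value = MagicMock(hexdigest=MagicMock(return_value=FIXED_INFOHASH))
        yield mock_decode, mock_encode, mock_sha1
```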
189-245: Well-structured refactoring to extract common infohash processing logic.
The _process_infohash function effectively consolidates duplicated logic that was previously in start_manual_session, making the code more maintainable and easier to extend. The function handles:
- Item identification via IMDb or database ID
- Service initialization validation
- Torrent availability checking
- Session creation and management
- Error handling
This refactoring follows good software engineering practices by reducing code duplication.
315-315: Updated start_manual_session to use the refactored helper function.
This change simplifies the function by delegating the processing to the new _process_infohash helper function.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/routers/secure/scrape.py (1)
464-464: Minor static analysis warning about function call in default argument.
The static analyzer flagged using `File(...)` as a default argument. While this is a common pattern in FastAPI, it's technically executing a function at module load time rather than function call time.
```diff
-async def upload_torrent_file(
-    request: Request,
-    background_tasks: BackgroundTasks,
-    item_id: str,
-    torrent_file: UploadFile = File(...),
-) -> StartSessionResponse:
+async def upload_torrent_file(
+    request: Request,
+    background_tasks: BackgroundTasks,
+    item_id: str,
+    torrent_file: UploadFile,
+) -> StartSessionResponse:
```
And update the FastAPI route decorator to specify the parameter as a file using FastAPI's dependency injection system instead.
🧰 Tools
🪛 Ruff (0.8.2)
464-464: Do not perform function call `File` in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/routers/secure/scrape.py (4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/routers/secure/scrape.py (3)
- src/program/services/indexers/trakt.py (2): TraktIndexer (14-122), run (55-89)
- src/program/db/db_functions.py (1): get_item_by_id (23-44)
- src/program/services/downloaders/shared.py (3): get_instant_availability (30-41), add_torrent (44-57), get_torrent_info (71-81)
🪛 Ruff (0.8.2)
src/routers/secure/scrape.py
- 234-234: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
- 464-464: Do not perform function call `File` in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
- 496-496: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🔇 Additional comments (3)
src/routers/secure/scrape.py (3)
6-8: Appropriate import additions for the new torrent upload feature.
The addition of `hashlib` and the expansion of FastAPI imports to include `File` and `UploadFile` properly support the new torrent file upload functionality.
315-315: Good refactoring to use the centralized infohash processing.
The simplified implementation of `start_manual_session` properly delegates to the new helper function, reducing code duplication.
468-482: Good implementation of file size validation to prevent DoS attacks.
The chunk-based file reading approach efficiently handles large files by:
- Reading in manageable chunks (8192 bytes)
- Checking the accumulated size against the limit
- Rejecting files exceeding the 10MB limit
This is a security best practice for file upload endpoints.
```python
async def _process_infohash(
    request: Request,
    background_tasks: BackgroundTasks,
    item_id: str,
    info_hash: str,
    source_description: str = "manual"
) -> StartSessionResponse:
    """Common logic for processing an infohash from either a magnet link or a torrent file"""
    # Identify item based on IMDb or database ID
    if item_id.startswith("tt"):
        imdb_id = item_id
        item_id = None
    else:
        imdb_id = None
        item_id = item_id

    if services := request.app.program.services:
        indexer = services[TraktIndexer]
        downloader = services[Downloader]
    else:
        raise HTTPException(status_code=412, detail="Required services not initialized")

    initialize_downloader(downloader)

    if imdb_id:
        prepared_item = MediaItem({"imdb_id": imdb_id})
        item = next(indexer.run(prepared_item))
    else:
        item = db_functions.get_item_by_id(item_id)

    if not item:
        raise HTTPException(status_code=404, detail="Item not found")

    container = downloader.get_instant_availability(info_hash, item.type)
    if not container or not container.cached:
        raise HTTPException(status_code=400, detail=f"Torrent is not cached, please try another {source_description}")

    session = session_manager.create_session(item_id or imdb_id, info_hash)

    try:
        torrent_id: str = downloader.add_torrent(info_hash)
        torrent_info: TorrentInfo = downloader.get_torrent_info(torrent_id)
        session_manager.update_session(session.id, torrent_id=torrent_id, torrent_info=torrent_info, containers=container)
    except Exception as e:
        background_tasks.add_task(session_manager.abort_session, session.id)
        raise HTTPException(status_code=500, detail=str(e))

    data = {
        "message": f"Started manual scraping session from {source_description}",
        "session_id": session.id,
        "torrent_id": torrent_id,
        "torrent_info": torrent_info,
        "containers": container,
        "expires_at": session.expires_at.isoformat()
    }
    return StartSessionResponse(**data)
```
🛠️ Refactor suggestion
Well-designed extraction of common infohash processing logic.
Creating the `_process_infohash` helper function is an excellent refactoring that centralizes infohash handling logic for both magnet links and torrent file uploads. The code is well-structured with proper error handling, parameter typing, and a clear docstring.
However, there's an error handling improvement that should be made:
```diff
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e
```
This change preserves the original exception context, which is helpful for debugging and error tracing.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
async def _process_infohash(
    request: Request,
    background_tasks: BackgroundTasks,
    item_id: str,
    info_hash: str,
    source_description: str = "manual"
) -> StartSessionResponse:
    """Common logic for processing an infohash from either a magnet link or a torrent file"""
    # Identify item based on IMDb or database ID
    if item_id.startswith("tt"):
        imdb_id = item_id
        item_id = None
    else:
        imdb_id = None
        item_id = item_id

    if services := request.app.program.services:
        indexer = services[TraktIndexer]
        downloader = services[Downloader]
    else:
        raise HTTPException(status_code=412, detail="Required services not initialized")

    initialize_downloader(downloader)

    if imdb_id:
        prepared_item = MediaItem({"imdb_id": imdb_id})
        item = next(indexer.run(prepared_item))
    else:
        item = db_functions.get_item_by_id(item_id)

    if not item:
        raise HTTPException(status_code=404, detail="Item not found")

    container = downloader.get_instant_availability(info_hash, item.type)
    if not container or not container.cached:
        raise HTTPException(status_code=400, detail=f"Torrent is not cached, please try another {source_description}")

    session = session_manager.create_session(item_id or imdb_id, info_hash)

    try:
        torrent_id: str = downloader.add_torrent(info_hash)
        torrent_info: TorrentInfo = downloader.get_torrent_info(torrent_id)
        session_manager.update_session(session.id, torrent_id=torrent_id, torrent_info=torrent_info, containers=container)
    except Exception as e:
        background_tasks.add_task(session_manager.abort_session, session.id)
        raise HTTPException(status_code=500, detail=str(e)) from e

    data = {
        "message": f"Started manual scraping session from {source_description}",
        "session_id": session.id,
        "torrent_id": torrent_id,
        "torrent_info": torrent_info,
        "containers": container,
        "expires_at": session.expires_at.isoformat()
    }
    return StartSessionResponse(**data)
```
🧰 Tools
🪛 Ruff (0.8.2)
234-234: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
"/scrape/upload_torrent", | ||
summary="Upload a torrent file to start a manual scraping session", | ||
operation_id="upload_torrent_file" | ||
) | ||
async def upload_torrent_file( | ||
request: Request, | ||
background_tasks: BackgroundTasks, | ||
item_id: str, | ||
torrent_file: UploadFile = File(...), | ||
) -> StartSessionResponse: | ||
session_manager.cleanup_expired(background_tasks) | ||
|
||
# Check file size limit (10MB is typically more than enough for torrent files) | ||
MAX_FILE_SIZE = 10 * 1024 * 1024 # 10MB | ||
file_size = 0 | ||
content = bytearray() | ||
|
||
# Read the file in chunks to check size without loading all into memory | ||
chunk = await torrent_file.read(8192) | ||
while chunk: | ||
file_size += len(chunk) | ||
if file_size > MAX_FILE_SIZE: | ||
raise HTTPException(status_code=413, detail="File too large, maximum size is 10MB") | ||
content.extend(chunk) | ||
chunk = await torrent_file.read(8192) | ||
|
||
torrent_content = bytes(content) | ||
|
||
# Validate the file is a .torrent file | ||
if not torrent_file.filename.endswith('.torrent'): | ||
raise HTTPException(status_code=400, detail="File must be a .torrent file") | ||
|
||
# Extract infohash from torrent file | ||
try: | ||
from bencodepy import decode, encode | ||
torrent_dict = decode(torrent_content) | ||
info = torrent_dict[b'info'] | ||
info_hash = hashlib.sha1(encode(info)).hexdigest() | ||
except Exception as e: | ||
logger.error(f"Failed to extract infohash from torrent file: {e}") | ||
raise HTTPException(status_code=400, detail="Invalid torrent file format") | ||
|
||
return await _process_infohash(request, background_tasks, item_id, info_hash, "uploaded torrent file") |
🛠️ Refactor suggestion
Well-implemented torrent file upload endpoint with security considerations.
The `upload_torrent_file` function properly:
- Validates file extension
- Implements file size limits (10MB) to prevent DoS attacks
- Reads files in chunks to avoid memory issues
- Safely extracts the infohash using bencodepy
- Delegates to the common processing function
However, two issues need to be addressed:
```diff
-        raise HTTPException(status_code=400, detail="Invalid torrent file format")
+        raise HTTPException(status_code=400, detail="Invalid torrent file format") from e
```
Also, consider adding validation for the torrent content structure beyond just the file extension to ensure it's a valid torrent file before attempting to decode it.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
@router.post(
    "/scrape/upload_torrent",
    summary="Upload a torrent file to start a manual scraping session",
    operation_id="upload_torrent_file"
)
async def upload_torrent_file(
    request: Request,
    background_tasks: BackgroundTasks,
    item_id: str,
    torrent_file: UploadFile = File(...),
) -> StartSessionResponse:
    session_manager.cleanup_expired(background_tasks)

    # Check file size limit (10MB is typically more than enough for torrent files)
    MAX_FILE_SIZE = 10 * 1024 * 1024  # 10MB
    file_size = 0
    content = bytearray()

    # Read the file in chunks to check size without loading all into memory
    chunk = await torrent_file.read(8192)
    while chunk:
        file_size += len(chunk)
        if file_size > MAX_FILE_SIZE:
            raise HTTPException(status_code=413, detail="File too large, maximum size is 10MB")
        content.extend(chunk)
        chunk = await torrent_file.read(8192)

    torrent_content = bytes(content)

    # Validate the file is a .torrent file
    if not torrent_file.filename.endswith('.torrent'):
        raise HTTPException(status_code=400, detail="File must be a .torrent file")

    # Extract infohash from torrent file
    try:
        from bencodepy import decode, encode
        torrent_dict = decode(torrent_content)
        info = torrent_dict[b'info']
        info_hash = hashlib.sha1(encode(info)).hexdigest()
    except Exception as e:
        logger.error(f"Failed to extract infohash from torrent file: {e}")
        raise HTTPException(status_code=400, detail="Invalid torrent file format") from e

    return await _process_infohash(request, background_tasks, item_id, info_hash, "uploaded torrent file")
```
🧰 Tools
🪛 Ruff (0.8.2)
464-464: Do not perform function call `File` in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
496-496: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
@apollotsantos what's the status on this one?
I stopped work on this as I didn't have enough time to dedicate. Feel free to close it. Hopefully I can pick it up again soon.
Summary by CodeRabbit
New Features
Enhancements