-
Notifications
You must be signed in to change notification settings - Fork 4.4k
[Security Issue] Simplify Filename Sanitization in /api/download
Endpoint
#1503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: development
Are you sure you want to change the base?
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
for sep in (os.sep, os.path.altsep): | ||
if sep: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The os
module is used in secure_filename()
but not imported at the top of the file, which will cause a NameError when the function is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shyun020 can you add some unittests?
🧪 Unit Tests for Path Traversal Defense LogicHello, I have added unit tests to verify that the path traversal defense logic in the Testing the secure_filename() Function
Testing the /api/download API
Filtering Bypass Attempts
Directory Compression Download Test
✅ Test EnvironmentFor testing purposes, the following files/directories were created:
#!/usr/bin/env python3
import os
import unicodedata
import zipfile
from pathlib import Path
import tempfile
import typing as t
# Patch for Pydantic v2 compatibility: Allow BaseModel subscripting
from pydantic import BaseModel
BaseModel.__class_getitem__ = classmethod(lambda cls, item: cls)
from fastapi import FastAPI, HTTPException, Response
from fastapi.responses import FileResponse
from fastapi.testclient import TestClient
# -------------------------------
# 🔹 FastAPI app setup
app = FastAPI()
# -------------------------------
# 🔹 Unified API response model
class APIResponse(BaseModel):
data: t.Optional[t.Any] = None
error: t.Optional[str] = None
traceback: t.Optional[str] = None
def model_dump_json(self) -> str:
import json
return json.dumps(self.dict())
# -------------------------------
# 🔹 API endpoint: file or directory download
@app.get("/api/download")
def _download_file_or_dir(file: t.Optional[str] = None):
if not file:
raise HTTPException(
status_code=400, detail="File path is required as query parameter"
)
# Sanitize and normalize file input
safe_file = secure_filename(file)
if not safe_file:
raise HTTPException(status_code=400, detail="Invalid file path after sanitization")
path = Path(safe_file)
if not path.exists():
return Response(
content=APIResponse[None](
data=None,
error=f"{path} not found",
).model_dump_json(),
status_code=404,
)
# If it's a file, return as-is
if path.is_file():
return FileResponse(path=path)
# If it's a directory, create a zip and return
tempdir = tempfile.TemporaryDirectory()
zipfile_path = Path(tempdir.name, path.name + ".zip")
return FileResponse(path=_archive(directory=path, output=zipfile_path))
# -------------------------------
# 🔹 Directory archiving logic
def _archive(directory: Path, output: Path) -> Path:
with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as fp:
for root, _, files in os.walk(directory):
for file in files:
fp.write(
os.path.join(root, file),
os.path.relpath(
os.path.join(root, file),
os.path.join(directory, ".."),
),
)
return output
# -------------------------------
# 🔹 File name sanitization to prevent path traversal
def secure_filename(filename: str) -> str:
filename = unicodedata.normalize("NFKD", filename)
filename = filename.encode("ascii", "ignore").decode("ascii")
for sep in (os.sep, os.path.altsep):
if sep:
filename = filename.replace(sep, " ")
filename = "_".join(filename.split())
filename = filename.strip("._")
return filename
# -------------------------------
# 🔹 FastAPI test client for unit testing
client = TestClient(app)
# -------------------------------
# 🔹 Run path injection test cases
def run_tests():
print("📌 Test secure_filename:")
# Test secure_filename with normal and malicious inputs
print(" normal: ", secure_filename("normal_file.txt"))
print(" traversal: ", secure_filename("../secret.txt"))
print(" whitespace: ", secure_filename(" "))
print("\n📌 Test download API:")
# Test when no file parameter is given
print(" No file param:")
r = client.get("/api/download")
print(" ->", r.status_code, r.text)
# Test with invalid (only whitespace) filename
print("\n Invalid file path (just spaces):")
r = client.get("/api/download", params={"file": " "})
print(" ->", r.status_code, r.text)
# Test when file does not exist
print("\n File not found:")
cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmpdir:
os.chdir(tmpdir)
r = client.get("/api/download", params={"file": "nonexistent.txt"})
print(" ->", r.status_code, r.text)
os.chdir(cwd)
# Test with a valid existing file
print("\n📌 Valid file reading (normal_file.txt):")
cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmpdir:
os.chdir(tmpdir)
with open("normal_file.txt", "w") as f:
f.write("Test content for valid file.")
r = client.get("/api/download", params={"file": "normal_file.txt"})
if r.status_code == 200:
print(" ->", r.status_code, r.content.decode())
else:
print(" ->", r.status_code, r.text)
os.chdir(cwd)
# -------------------------------
# 🔹 Test bypass attempts with traversal-like paths
print("\n📌 Filtering bypass attempts:")
# Case 1: "secret/../normal_file.txt"
print(" Attempt with 'secret/../normal_file.txt':")
cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmpdir:
os.chdir(tmpdir)
with open("normal_file.txt", "w") as f:
f.write("Test content for valid file.")
r = client.get("/api/download", params={"file": "secret/../normal_file.txt"})
print(" ->", r.status_code, r.text)
os.chdir(cwd)
# Case 2: Encoded backslash "..%5cnormal_file.txt"
print(" Attempt with encoded backslash ('..%5cnormal_file.txt'):")
cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmpdir:
os.chdir(tmpdir)
with open("normal_file.txt", "w") as f:
f.write("Test content for valid file.")
r = client.get("/api/download", params={"file": "..\\normal_file.txt"})
print(" ->", r.status_code, r.text)
os.chdir(cwd)
# Case 3: Various encoded traversal strings
encoded_test_cases = [
("%2e%2e%2f", "Encoded '../' pattern: %2e%2e%2f"),
("%2e%2e/", "Encoded '../' pattern: %2e%2e/"),
("..%2f", "Encoded '../' pattern: ..%2f"),
("%2e%2e%5c", "Encoded '..\\' pattern: %2e%2e%5c"),
("%2e%2e\\", "Encoded '..\\' pattern: %2e%2e\\"),
("..%5c", "Encoded '..\\' pattern: ..%5c"),
("%252e%252e%255c", "Double-encoded '..\\' pattern: %252e%252e%255c"),
("..%255c", "Double-encoded '..\\' pattern: ..%255c"),
]
print(" Various URL encoded patterns:")
for case, description in encoded_test_cases:
with tempfile.TemporaryDirectory() as tmpdir:
os.chdir(tmpdir)
with open("normal_file.txt", "w") as f:
f.write("Test content for valid file.")
r = client.get("/api/download", params={"file": case})
print(f" {description} ->", r.status_code, r.text)
os.chdir(cwd)
# -------------------------------
# 🔹 Test zip archive creation
print("\n📌 Test archive creation:")
with tempfile.TemporaryDirectory() as tmpdir:
test_dir = Path(tmpdir) / "testdir"
test_dir.mkdir()
(test_dir / "a.txt").write_text("hello")
(test_dir / "b.txt").write_text("world")
zip_path = Path(tmpdir) / "zipped.zip"
_archive(test_dir, zip_path)
print(" Archive created:", zip_path.exists())
with zipfile.ZipFile(zip_path) as z:
print(" Zip contents:", z.namelist())
# -------------------------------
# 🔹 Run all test cases when executed directly
if __name__ == "__main__":
run_tests() |
@angrybayblade I'd appreciate it if you could review the unit tests I wrote 🙏 |
How long will it take for the feedback? |
@angrybayblade I'd appreciate it if you could review the unit tests I wrote 🙏 |
Thank you very much for taking the time to review my work. May I kindly ask if there are any errors in the code I submitted via the Pull Request? If so, could you please let me know what the specific issues are? |
I’m very eager to contribute to this vulnerability-related issue. May I kindly ask what steps I should take to help resolve it? |
When you have time, I would greatly appreciate it if you could kindly let me know which parts contain errors so that I can make the necessary corrections. Thank you very much for your time and consideration. |
knock knock |
2 similar comments
knock knock |
knock knock |
Overview
This pull request simplifies the
secure_filename
function used in the/api/download
endpoint.The goal is to prevent basic path injection attacks while reducing unnecessary logic and improving maintainability.
Changes
Updated
secure_filename
function/
,\
) with spaces, then converts spaces to underscores.Modified
/api/download
endpointsecure_filename
function to sanitize thefile
query parameter.Motivation
Notes
Thank you for reviewing!