-
Notifications
You must be signed in to change notification settings - Fork 4.6k
Feat/index length #9748
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/index length #9748
Conversation
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis update introduces new and refactored configuration files for database collection schemas, platform and project metadata, and the console project, enhancing structural clarity and maintainability. It also adds new environment variables, updates workflow automation, improves error handling, and expands runtime and OAuth provider support. Localization, documentation, Docker, and OpenAPI specs are updated for consistency and feature coverage. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI
participant Resources
participant Database
participant Publisher
participant Executor
participant Telemetry
User->>CLI: Start CLI with task
CLI->>Resources: Initialize resources (dbForPlatform, console, getLogsDB, etc.)
CLI->>Database: Setup project/logs DB (with shared tables logic)
CLI->>Publisher: Initialize queues (functions, deletes, stats, etc.)
CLI->>Executor: Instantiate executor with host resolver
CLI->>Telemetry: Setup telemetry (NoTelemetry)
CLI->>CLI: Run task in Swoole coroutine
CLI->>Resources: On error, call enhanced logError (with detailed info)
CLI->>CLI: Clear timers on shutdown
Possibly related PRs
Suggested reviewers
Poem
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Security Scan Results for PRDocker Image Scan Results
Source Code Scan Results🎉 No vulnerabilities found! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 13
🔭 Outside diff range comments (1)
app/config/specs/open-api3-latest-client.json (1)
1-9714
:⚠️ Potential issueMissing
lengths
parameter forcreateIndex
The PR objectives describe adding alengths
array parameter to thecreateIndex
route, but this spec file does not include that change. Please add the newlengths
parameter under thecreateIndex
path—both in the request schema and in the parameter list—so the client spec matches the implemented API.🧰 Tools
🪛 Gitleaks (8.21.2)
8878-8878: Uncovered a JSON Web Token, which may lead to unauthorized access to web applications and sensitive user data.
(jwt)
🪛 Checkov (3.2.334)
[HIGH] 1-9751: Ensure that the global security field has rules defined
(CKV_OPENAPI_4)
[MEDIUM] 291-296: Ensure that arrays have a maximum number of items
(CKV_OPENAPI_21)
🧹 Nitpick comments (22)
app/config/locale/translations/kn.json (1)
248-248
: Ensure trailing newline at EOF
Including a newline at end-of-file is best practice for POSIX compliance.app/config/locale/translations/vi.json (1)
248-248
: Add newline at end-of-file
Ensure the file ends with a newline for POSIX compatibility.app/config/locale/translations/km.json (1)
248-248
: Ensure newline at EOF
Add a trailing newline to comply with POSIX standards.app/config/locale/translations/ro.json (1)
248-248
: Ensure trailing newline at EOF
Adding a newline at end-of-file helps maintain POSIX tool compatibility.app/config/locale/translations/eo.json (1)
9-9
: Extra Period in Verification Sign-off
The updated “thanks” value is"Dankegon.,"
, which introduces both a period and a comma. Consider removing the extra period for natural Esperanto punctuation, e.g.,"Dankegon,"
.app/config/locale/translations/zh-cn.json (2)
242-242
: Unintended Newline in OTP Greeting
The OTP greeting includes a literal newline ("\n"
) between punctuation marks:"你好,\n、"
. This likely deviates from the intended format—consider removing the newline.
246-246
: Double Punctuation in OTP Sign-off
The OTP “thanks” string"谢谢,、"
contains both a comma and the Chinese list comma. Please verify if both are required or remove the extra comma for clarity.app/config/collections/logs.php (1)
74-74
: Consider documenting the expected format of the lengths arrayThe array is correctly implemented but empty. For maintainability, consider adding a comment indicating the expected format (e.g.,
// Expected format: [255, 255]
) to help future developers understand how to populate this array.app/config/console.php (1)
22-29
: Consider adding support for multiple environmentsThe platforms array only includes 'Localhost'. For a more robust configuration, consider supporting additional environments (development, staging, production) conditionally based on environment variables.
CONTRIBUTING.md (1)
616-626
: Fix grammar issues in the preview domains section.There are a few grammar issues in the new section:
-Appwrite Functions are automatically given a domain you can visit to execute the function. This domain has format `[SOMETHING].functions.localhost` unless you changed `_APP_DOMAIN_FUNCTIONS` environment variable. This default value works great when running Appwrite locally, but it can be impossible to use preview domains with Cloud woekspaces such as Gitpod or GitHub Codespaces. +Appwrite Functions are automatically given a domain you can visit to execute the function. This domain has the format `[SOMETHING].functions.localhost` unless you've changed the `_APP_DOMAIN_FUNCTIONS` environment variable. This default value works great when running Appwrite locally, but it can be impossible to use preview domains with Cloud workspaces such as Gitpod or GitHub Codespaces. -To use preview domains on Cloud workspaces, you can visit hostname provided by them, and supply function's preview domain as URL parameter: +To use preview domains on Cloud workspaces, you can visit the hostname provided by them, and supply the function's preview domain as a URL parameter: -The path was set to `/ping` intentionally. Visiting `/` for preview domains might trigger Console background worker, and trigger redirect to Console without our preview URL param. Visiting different path ensures this doesnt happen. +The path was set to `/ping` intentionally. Visiting `/` for preview domains might trigger the Console background worker, and redirect to Console without our preview URL param. Visiting a different path ensures this doesn't happen.🧰 Tools
🪛 LanguageTool
[grammar] ~618-~618: It appears that the past participle should be used here.
Context: ...o execute the function. This domain has format[SOMETHING].functions.localhost
unles...(HAVE_PART_AGREEMENT)
[uncategorized] ~620-~620: Possible missing article found.
Context: ...ains on Cloud workspaces, you can visit hostname provided by them, and supply function's...(AI_HYDRA_LEO_MISSING_THE)
[uncategorized] ~620-~620: Possible missing article found.
Context: ...t hostname provided by them, and supply function's preview domain as URL parameter: ```...(AI_HYDRA_LEO_MISSING_THE)
🪛 markdownlint-cli2 (0.17.2)
622-622: Fenced code blocks should have a language specified
null(MD040, fenced-code-language)
app/config/collections/common.php (2)
1666-1670
: Class-name casing inconsistency (DATABASE
vsDatabase
)PHP class names are case-insensitive, so this won’t crash, but it does violate the project’s own style shown elsewhere in the file (
Database::METADATA
).
Keeping a single casing avoids confusion and helps static analysis tools.- '$collection' => ID::custom(DATABASE::METADATA), + '$collection' => ID::custom(Database::METADATA),(The same pattern repeats in
messages
,topics
,subscribers
, andtargets
blocks.)
Consider running a quick search-replace to stay consistent.
118-118
: Minor typo in comment (digitts
)- 16, // leading '+' and 15 digitts maximum by E.164 format + 16, // leading '+' and 15 digits maximum by E.164 formatapp/config/function-templates.php (1)
235-235
: Typo in the tagline.There's a typo in MongoDB Atlas's tagline: "geospecial" should be "geospatial".
- 'Realtime NoSQL document database with geospecial, graph, search, and vector suport.', + 'Realtime NoSQL document database with geospatial, graph, search, and vector support.',.github/workflows/tests.yml (1)
243-243
: Remove trailing whitespace.There's a trailing whitespace at the end of line 243.
- +🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 243-243: trailing spaces
(trailing-spaces)
app/config/collections/projects.php (1)
550-558
: Small style nit – prefer the constant over magic number
'lengths' => [64]
hard-codes the index prefix size, whereas elsewhere you rely onDatabase::LENGTH_KEY
or attribute-size values. Using the constant improves discoverability and prevents divergence if the canonical length ever changes.- 'lengths' => [64], + 'lengths' => [Database::LENGTH_KEY],app/cli.php (2)
127-139
:_APP_DATABASE_SHARED_TABLES
parsing duplicates & ignores whitespaceThe same explode/
in_array
logic is repeated twice and does not trim blanks, so
" hostA ,hostB"
won’t match. Centralise and normalise to avoid subtle bugs:-$sharedTables = \explode(',', System::getEnv('_APP_DATABASE_SHARED_TABLES', '')); +// Normalise once +$sharedTables = \array_filter( + \array_map('trim', \explode(',', System::getEnv('_APP_DATABASE_SHARED_TABLES', ''))) +);Consider extracting this into a small helper to DRY the two blocks.
Also applies to: 151-159
173-177
:$database
is not captured by reference – cache may reset each callInside
getLogsDB
the closure captures$database
by value. Although the
variable persists within the closure instance, re-assigning it does not
update the outer scope when the closure is rebuilt (e.g. during DI reloads).-return function (?Document $project = null) use ($pools, $cache, $database) { +return function (?Document $project = null) use ($pools, $cache, &$database) {Using
&$database
guarantees the single shared connection is reused as
intended.app/config/collections/platform.php (3)
224-234
: Clarify and cap thetemplates
column size before shipping
The hard-coded size1000000
is flagged with a TODO. Storing 1 MB in a singleVARCHAR
/TEXT
column can hurt index performance and may hit engine-specific limits (e.g. MySQLVARCHAR
max 65 535 in UTF-8). Please verify the actual upper-bound required for template payloads and:
- Use a dedicated constant instead of a magic number.
- Consider switching to
MEDIUMTEXT
/LONGTEXT
if multi-KB bodies are truly needed; otherwise tighten the length to the realistic maximum.
1645-1651
: EmptyvcsCommentLocks
schema – intentional?
The collection is declared with no attributes or indexes. If this is a placeholder, add a comment explaining its future purpose; otherwise remove to avoid deploying dead tables.
1-1652
: Consider splitting gigantic schema into domain-specific files
At 1 600+ lines this single config file is hard to navigate, review, and merge-conflict-prone. A common practice is one file per collection (or per bounded context) and an aggregator thatarray_merge
s them.This has no runtime cost but greatly improves maintainability.
app/config/specs/open-api3-1.6.x-client.json (2)
21-24
: Document the<REGION>
placeholder inservers
Addinghttps://<REGION>.cloud.appwrite.io/v1
is great. Consider expanding the description to explain valid<REGION>
values, fallback behavior, and how clients should substitute it.
3285-3287
: Avoid usingnull
forgroup
Thegroup
property is defined as a string in your metadata schema. Setting it tonull
may violate schema validation. If an endpoint doesn’t belong to a group, omit thegroup
field altogether.Also applies to: 3412-3413, 3543-3545, 3601-3603, 4089-4091, 4172-4173, 4263-4265, 5261-5262, 5313-5314, 5365-5366, 5417-5418, 5469-5470, 5519-5520, 5573-5574, 5625-5626
🛑 Comments failed to post (13)
app/config/locale/translations/sk.json (1)
10-10:
⚠️ Potential issueFix misplaced period before comma
The verification thanks line is"Ďakujeme.,"
, which combines a period and comma. It should be"Ďakujeme,"
without the period.Apply this diff:
- "emails.verification.thanks": "Ďakujeme.,", + "emails.verification.thanks": "Ďakujeme,",📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements."emails.verification.thanks": "Ďakujeme,",
app/config/locale/translations/tl.json (1)
13-13:
⚠️ Potential issueRemove extraneous space before comma
The greeting string"Kamusta ,"
contains an extra space before the comma, which will render as"Kamusta ,"
instead of"Kamusta,"
.Apply this diff to correct it:
-"emails.magicSession.hello": "Kamusta ,", +"emails.magicSession.hello": "Kamusta,",📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements."emails.magicSession.hello": "Kamusta,",
app/config/locale/countries.php (1)
119-119:
⚠️ Potential issueFix typo in Latvia's name
There appears to be a search/replace error in Latvia's name. "latitudevia" should be "Latvia".
- 'LV' => ['name' => 'latitudevia', 'latitude' => 57.0, 'longitude' => 25.0], + 'LV' => ['name' => 'Latvia', 'latitude' => 57.0, 'longitude' => 25.0],app/config/locale/translations/eo.json (1)
23-24:
⚠️ Potential issueIncorrect Invitation Greeting Translation
The invitation greeting is set to"Dankegon,"
instead of a greeting like “Saluton,”. This appears to be a copy/paste error—please correct the invitation hello line to an appropriate greeting.app/config/console.php (1)
44-45: 💡 Verification agent
🧩 Analysis chain
Verify whitelisting logic for security
The email and IP whitelisting implementation correctly parses comma-separated values, but consider adding input validation to ensure the values are properly formatted to prevent security issues.
🏁 Script executed:
#!/bin/bash # Check if there's any pattern validation for the whitelist values rg -A 3 -B 3 "WHITELIST_EMAILS|WHITELIST_IPS" --include="*.php"Length of output: 507
🏁 Script executed:
#!/bin/bash # Search for usage of whitelist config keys rg -n "authWhitelist" --glob '*.php' # Check for any email/IP validation using filter_var rg -n "filter_var" --glob '*.php' | grep -E "FILTER_VALIDATE_EMAIL|FILTER_VALIDATE_IP"Length of output: 2158
Validate whitelist entries before use
The console whitelist is currently parsed with
explode()
only, without trimming or validating each entry. To prevent misconfigurations or invalid values slipping through, please add proper validation:• In app/config/console.php, wrap the exploded values in
array_map('trim', …)
and then filter them viafilter_var()
usingFILTER_VALIDATE_EMAIL
andFILTER_VALIDATE_IP
.
• Example diff:- 'authWhitelistEmails' => (!empty(System::getEnv('_APP_CONSOLE_WHITELIST_EMAILS', null))) ? \explode(',', System::getEnv('_APP_CONSOLE_WHITELIST_EMAILS', null)) : [], - 'authWhitelistIPs' => (!empty(System::getEnv('_APP_CONSOLE_WHITELIST_IPS', null))) ? \explode(',', System::getEnv('_APP_CONSOLE_WHITELIST_IPS', null)) : [], + 'authWhitelistEmails' => array_filter( + array_map('trim', explode(',', System::getEnv('_APP_CONSOLE_WHITELIST_EMAILS', ''))), + fn(string $email) => filter_var($email, FILTER_VALIDATE_EMAIL) !== false + ), + 'authWhitelistIPs' => array_filter( + array_map('trim', explode(',', System::getEnv('_APP_CONSOLE_WHITELIST_IPS', ''))), + fn(string $ip) => filter_var($ip, FILTER_VALIDATE_IP) !== false + ),This ensures only well-formed emails and IPs are allowed.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.'authWhitelistEmails' => array_filter( array_map('trim', explode(',', System::getEnv('_APP_CONSOLE_WHITELIST_EMAILS', ''))), fn(string $email) => filter_var($email, FILTER_VALIDATE_EMAIL) !== false ), 'authWhitelistIPs' => array_filter( array_map('trim', explode(',', System::getEnv('_APP_CONSOLE_WHITELIST_IPS', ''))), fn(string $ip) => filter_var($ip, FILTER_VALIDATE_IP) !== false ),
app/config/collections/common.php (2)
1040-1043: 🛠️ Refactor suggestion
Index prefix length is far smaller than column size – please re-verify
providerUid
is declared withsize => 2048
, yet the unique index caps it to128
bytes (lengths => [ …, 128 ]
).
That is usually done to stay within MySQL’s 3072-byte limit, but it also means two different provider UIDs that differ only after the 128-byte prefix will collide and throw a “duplicate key” error.Double-check that 128 is sufficient for every provider UID you expect to store.
If the full length is truly required, alternative strategies are:
- Use a hash column (e.g. SHA-256) and index that.
- Switch the table/column charset to
ascii
/latin1
to reduce byte width.
2374-2377:
⚠️ Potential issue
$name
key will be silently ignored – should bename
All other collections use the plain
name
key, e.g.'name' => 'Cache'
.
Here the dollar-prefixed variant ('$name' => 'Files'
) breaks that convention and the runtime that consumes this config will most likely look forname
, resulting in the collection label being lost.- '$id' => ID::custom('files'), - '$name' => 'Files', + '$id' => ID::custom('files'), + 'name' => 'Files',app/config/collections/projects.php (2)
1630-1638: 🛠️ Refactor suggestion
Inconsistent collection/meta-ID conventions – pick one style and stick to it
Every other collection in this file wraps both$collection
and$id
withID::custom()
, e.g.'$collection' => ID::custom(Database::METADATA), '$id' => ID::custom('databases'),but the
variables
collection uses the raw constant / string:'$collection' => Database::METADATA, '$id' => 'variables',Besides hurting readability, this breaks the implicit guarantee that every identifier is namespaced through
ID::custom()
, and may cause collisions if another config file happens to reuse the same bare string.- '$collection' => Database::METADATA, - '$id' => 'variables', + '$collection' => ID::custom(Database::METADATA), + '$id' => ID::custom('variables'),
1144-1156:
⚠️ Potential issue
lengths
missing for large VARCHAR keys – could break on MySQL 8 with utf8mb4
With the newlengths
parameter now exposed by the API, omitting it for KEY indexes on wide‐string columns may push the underlying index past MySQL’s 3-byte / 767-byte limit (e.g. 255 × 3 B = 765 for utf8, but 4 × 255 = 1020 for utf8mb4).The following KEY indexes point to attributes sized
Database::LENGTH_KEY
(≈255) but have an emptylengths
array:
_key_resource
(resourceId
)_key_resource_type
(resourceType
)_key_resource_internal_id
(resourceInternalId
)_key_deployment_internal_id
(deploymentInternalId
)_key_function_internal_id
(functionInternalId
)_key_resource_internal_id_resource_type
(multi-col)Proposed fix (sample for one index; repeat for the others):
- 'lengths' => [], + 'lengths' => [Database::LENGTH_KEY],For the composite two-column index:
- 'lengths' => [], + 'lengths' => [Database::LENGTH_KEY, 100],Also applies to: 1186-1191, 1342-1347, 1620-1625, 1745-1750
app/cli.php (1)
301-301:
⚠️ Potential issueIncorrect use of
Swoole\Coroutine\run()
– will throw at runtime
run()
expects a callable, but\$cli->run(...)
returns the exit code (anint
).
Passing a non-callable value will raiseTypeError: Symfony\…\run() expects parameter 1 to be callable
.-run($cli->run(...)); +run(static function () use ($cli): void { + $cli->run(...); +});This wraps the CLI execution inside the coroutine as intended.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.run(static function () use ($cli): void { $cli->run(...); });
app/config/collections/platform.php (2)
983-992: 🛠️ Refactor suggestion
Add the missing
json
filter torealtime.value
The TODO indicates the column stores structured JSON, yet nojson
filter is applied. Without it, the SDK will not automatically encode/decode and may leak invalid data.- 'filters' => [], //TODO: use json filter + 'filters' => ['json'],📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.'$id' => ID::custom('value'), 'type' => Database::VAR_STRING, 'format' => '', 'size' => 16384, 'signed' => true, 'required' => true, 'default' => null, 'array' => false, - 'filters' => [], //TODO: use json filter + 'filters' => ['json'], ]
1255-1291:
⚠️ Potential issueToken fields likely too small after encryption
personalAccessToken
andpersonalRefreshToken
are limited to 256 chars but are encrypted + base64-encoded (encrypt
filter). Real-world GitHub/GitLab PATs are already ~40–120 chars; encrypted form frequently exceeds 300 chars.- 'size' => 256, + 'size' => 512,Adjust both attributes (lines 1260-1268 & 1282-1291) to prevent truncation.
app/config/specs/open-api3-1.6.x-client.json (1)
7620-7644:
⚠️ Potential issueRemove duplicate tag definitions
Thetags
array contains both"projects"
and"project"
with identical descriptions. This redundancy can lead to confusion and incorrect client code generation. Consolidate these into a single"projects"
tag.
Add a new lengths parameter to the createIndex route, and pass it through to the index that gets created by the Databases worker. It should allow an array of integers, and to be index matched against the attributes. E.g.
attributes: [title, name, releasedAt]
lengths: [255, 255, 255]
Test Plan
Added test testGetIndexByKeyWithLengths for creating index with lengths and fetching them and comparing them
Related PRs and Issues
Checklist
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Chores