perf: "random" naming to improve concurrency and locality (backport #30053) by mergify[bot] · Pull Request #31569 · frappe/frappe · GitHub

perf: "random" naming to improve concurrency and locality (backport #30053) #31569


Merged
merged 2 commits into from
Mar 7, 2025

mergify bot (Contributor) commented Mar 7, 2025

This feels over-engineered and it kinda is, but other efforts to
introduce sequential naming/UUID naming haven't been that fruitful
either.

10 character random "hash" is now changed to:

  1. first character - last character in UUID4 ID of request/job
  2. three characters - derived from current timestamp.
  3. 6 characters - random data.
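A minimal sketch of how such a 10-character name could be assembled. The base32 alphabet, the one-second time bucket, and the wrap-around are assumptions made for illustration; `encode_base32` and `make_name` are hypothetical helpers, not the functions in `frappe/model/naming.py`.

```python
import secrets
import time
import uuid
from typing import Optional

# Assumed alphabet: 32 symbols whose string sort order matches numeric order.
ALPHABET = "0123456789abcdefghijklmnopqrstuv"


def encode_base32(value: int, width: int) -> str:
    """Encode a non-negative int as exactly `width` base32 characters."""
    chars = []
    for _ in range(width):
        value, rem = divmod(value, 32)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def make_name(request_id: str, now: Optional[float] = None) -> str:
    """Build prefix (1 char) + time bucket (3 chars) + random tail (6 chars)."""
    now = time.time() if now is None else now
    # 1 character: last character of the request/job UUID4 (a hex digit).
    prefix = request_id[-1]
    # 3 characters: current time in seconds, wrapped to fit 32**3 buckets (assumed granularity).
    time_part = encode_base32(int(now) % 32**3, 3)
    # 6 characters: 30 random bits, since 32**6 == 2**30.
    random_part = encode_base32(secrets.randbits(30), 6)
    return prefix + time_part + random_part


if __name__ == "__main__":
    request_id = str(uuid.uuid4())  # stands in for the ID of the current request/job
    print(make_name(request_id))
```

Names produced within one request share the first character, and names produced in the same time bucket of that request also share the next three, which is what lets them sort next to each other.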

This satisfies all three requirements:

  1. Readers - temporal locality should result in spatial locality on disk. (fewer pages accessed)
  2. Single writer - temporal locality should result in spatial locality. (fewer dirty pages)
  3. Multiple writers + updater - temporal locality should NOT result in spatial locality @ supremum (less lock contention)
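The effect of the three requirements can be illustrated by sorting a handful of names the way an ordered index would. The names below are made up for the example (two hypothetical workers with prefixes `a` and `7`), and the assumption is that rows are stored in key order, as in a clustered primary key.

```python
from itertools import groupby

# Hypothetical name parts: (worker prefix, 3-char time bucket, 6-char random tail).
parts = [
    ("a", "00f", "k3v9q1"),
    ("7", "00f", "p0d2m8"),
    ("a", "00g", "z1x4c7"),
    ("7", "00g", "b6n5h2"),
    ("a", "00f", "r8t0w3"),
]

# Sorting the full names mimics the order in which an ordered index keeps them.
names = sorted(prefix + bucket + tail for prefix, bucket, tail in parts)

# Group neighbouring names by their first character (the per-worker prefix).
by_worker = {key: list(group) for key, group in groupby(names, key=lambda n: n[0])}

for worker, worker_names in by_worker.items():
    print(worker, worker_names)
```

Within one worker the names stay ordered by time bucket, so rows written close together in time land close together. Names from the two workers never interleave, so concurrent writers insert into different regions of the index.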

Mostly concludes #25309 and #28349

Rough probability numbers

Assumptions:

  • Unique per worker prefix - 16 (uuid's base16 version)
  • Rough time spent generating names - 10% of request (very very conservative estimate)

Probability(collision) = P(at least one prefix collision) * P(time collision)
Probability(collision) = (1 - P(all prefixes different)) * 10%
Probability(collision) = (1 - (16! / (16 - N)!) / 16^N) * 10%

| N (concurrency) | Probability(collision) |
| --- | --- |
| 1 | 0.0% |
| 2 | 0.6% |
| 3 | 1.8% |
| 4 | 3.3% |
| 5 | 5.0% |
| 6 | 6.6% |
| 7 | 7.9% |
| 8 | 8.8% |
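The table can be reproduced with a few lines of Python. This is only a sketch of the arithmetic under the two stated assumptions (16 possible prefixes, 10% time overlap); `collision_probability` is a name chosen for this example.

```python
from math import perm

PREFIXES = 16        # possible values of the per-worker prefix (one hex digit)
TIME_OVERLAP = 0.10  # assumed share of a request spent generating names


def collision_probability(n: int) -> float:
    """Probability that at least two of n workers share a prefix, scaled by time overlap."""
    # perm(16, n) == 16! / (16 - n)!: ways to give n workers distinct prefixes.
    all_different = perm(PREFIXES, n) / PREFIXES**n
    return (1 - all_different) * TIME_OVERLAP


if __name__ == "__main__":
    for n in range(1, 9):
        print(f"{n}  {collision_probability(n) * 100:.2f}%")
```

The first factor is the birthday-problem probability over 16 prefixes, which is why the result can never exceed the 10% time-overlap assumption.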

This is an automatic backport of pull request #30053 done by [Mergify](https://mergify.com).

This feels overengineered and it kinda is, but other efforts to
introduce sequential naming/UUID naming haven't been that fruitful
either.

10 character random "hash" is now changed to:

1. first character - last character in UUID4 ID of request/job
2. three characters - derived from current timestamp.
3. 6 characters - random data.

This satisfies all three requirements:

1. Readers - temporal locality should result in spatial locality on disk. (fewer pages accessed)
2. Single writer - temporal locality should result in spatial locality. (fewer dirty pages)
3. Multiple writers - temporal locality should NOT result in spatial locality. (less lock contention)

Mostly concludes #25309 and #28349

Rough probability numbers

Assumptions:
- Unique per worker prefix - 16 (uuid's base16 version)
- Rough time spent generating names - 10% of request (very very conservative estimate)

Probability(collision) = P(at least one prefix collision) * P(time collision)
Probability(collision) = (1 - P(all prefixes different)) * 10%
Probability(collision) = (1 - (16! / (16 - N)!) / 16^N) * 10%

| N (concurrency) | Probability(collision) |
| --- | --- |
| 1  |    0.0% |
| 2  |    0.6% |
| 3  |    1.8% |
| 4  |    3.3% |
| 5  |    5.0% |
| 6  |    6.6% |
| 7  |    7.9% |
| 8  |    8.8% |

(cherry picked from commit 9b79dfe)

# Conflicts:
#	frappe/monitor.py
#	frappe/tests/test_naming.py
@mergify mergify bot added the conflicts label Mar 7, 2025
@mergify mergify bot assigned ankush Mar 7, 2025
mergify bot (Contributor, Author) commented Mar 7, 2025

Cherry-pick of 9b79dfe has failed:

On branch mergify/bp/version-15-hotfix/pr-30053
Your branch is up to date with 'origin/version-15-hotfix'.

You are currently cherry-picking commit 9b79dfeb7b.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   frappe/model/naming.py

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   frappe/monitor.py
	both modified:   frappe/tests/test_naming.py

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@ankush ankush closed this Mar 7, 2025
@ankush ankush reopened this Mar 7, 2025
@ankush ankush enabled auto-merge (squash) March 7, 2025 09:38
@ankush ankush merged commit e6b7198 into version-15-hotfix Mar 7, 2025
16 checks passed
@ankush ankush deleted the mergify/bp/version-15-hotfix/pr-30053 branch March 7, 2025 09:52
frappe-pr-bot pushed a commit that referenced this pull request Mar 11, 2025
# [15.58.0](v15.57.2...v15.58.0) (2025-03-11)

### Bug Fixes

* add all formating fields inside one function ([5203f5d](5203f5d))
* add check to see all variants of english ([4700500](4700500))
* add currency precision formatting while exporting report ([adc5817](adc5817))
* Arabic translations ([8d10f69](8d10f69))
* Bosnian translations ([4b3ced1](4b3ced1))
* change condition ([2391415](2391415))
* change if to elif ([6dc467a](6dc467a))
* check permission on new doc ([#31626](#31626)) ([#31628](#31628)) ([3d00fde](3d00fde))
* check properly for blacklisted function usage ([a5e6152](a5e6152))
* Chinese Simplified translations ([b112e13](b112e13))
* correct data while exporting with translate values ([03435dd](03435dd))
* correct permission query condition in Dashboard ([37639ba](37639ba))
* Croatian translations ([93f2a0d](93f2a0d))
* Croatian translations ([422a720](422a720))
* Croatian translations ([cb93873](cb93873))
* Croatian translations ([f81e37c](f81e37c))
* custom column export issue in report ([31bba3d](31bba3d))
* **db_query:** improve regex ([e073f0e](e073f0e))
* do not allow renaming if autoname & title_field is same ([5c636f4](5c636f4))
* **DX:** Limit cprofiler output to 200 lines ([#31538](#31538)) ([#31545](#31545)) ([4e468ad](4e468ad))
* **email_account:** make attachments public by default ([0b445e7](0b445e7))
* Esperanto translations ([58f0291](58f0291))
* French translations ([77ebeae](77ebeae))
* German translations ([7dbd3ab](7dbd3ab))
* **get_url:** allow disabling host header override (backport [#31522](#31522)) ([#31574](#31574)) ([52e3133](52e3133))
* Hungarian translations ([3b534e3](3b534e3))
* if autoname is same as title then show only one field in rename modal ([5e71966](5e71966))
* **list_view:** use more filter type values to set value on new entry ([6f7cdf1](6f7cdf1))
* Merge conflicts + faulty html tag check condition ([8e62205](8e62205))
* Move on_session_creation hook after session is created ([995961f](995961f))
* override sanitized column name (backport [#31576](#31576)) ([#31580](#31580)) ([996c852](996c852))
* Persian translations ([0964fb5](0964fb5))
* Persian translations ([94fc926](94fc926))
* Polish translations ([1576db5](1576db5))
* Polish translations ([4582e9f](4582e9f))
* Portuguese, Brazilian translations ([e8d8836](e8d8836))
* Portuguese, Brazilian translations ([a17bffb](a17bffb))
* Run on_session_creation on OAuth logins ([92ec249](92ec249))
* Russian translations ([33bc296](33bc296))
* search instead of match ([#31557](#31557)) ([#31571](#31571)) ([753a9f4](753a9f4))
* set report name as file name ([7648b9b](7648b9b))
* small change ([24470c0](24470c0))
* Spanish translations ([74b0aa1](74b0aa1))
* Swedish translations ([0671b51](0671b51))
* sync translations from crowdin ([#31125](#31125)) ([f0ff0ea](f0ff0ea))
* Thai translations ([9aecfc1](9aecfc1))
* Turkish translations ([2d3c66d](2d3c66d))
* update sequence ([2677f74](2677f74))
* Use `is_html` instead of regex ([799a1bb](799a1bb))
* **UX:** Show reason for read only form in headline (backport [#31511](#31511)) ([#31515](#31515)) ([35b2333](35b2333))
* validate data.filter is empty or not ([5e05e57](5e05e57))

### Features

* Add Grid Page Length field for child tables in DocType ([2aeed92](2aeed92))
* add login via fc button in login page ([#31541](#31541)) ([#31546](#31546)) ([cb699fe](cb699fe))
* analytics on Prepared Reports ([53ad1fc](53ad1fc))
* make translate report data configurable ([db47272](db47272))

### Performance Improvements

* "random" naming to improve concurrency and locality (backport [#30053](#30053)) ([#31569](#31569)) ([e6b7198](e6b7198))
* Avoid parsing same field repeatedly (backport [#29030](#29030)) ([#31565](#31565)) ([6310b07](6310b07))
* Don't query redirects on existing session (backport [#28981](#28981)) ([#31567](#31567)) ([2dd31eb](2dd31eb))
* faster add_to_date ([#28843](#28843)) ([#31562](#31562)) ([c211ac8](c211ac8))
* fetch existing fields beforehand ([6d6674e](6d6674e))
* No need to set expiry for rate limiter key everytime ([#28956](#28956)) ([#31572](#31572)) ([df65922](df65922))
* queue auto email report separately ([642d196](642d196))
* restrict doctypes to update when creating custom fields ([7929a04](7929a04))
* Skip updating defaults when nothing has changed ([#29036](#29036)) ([#31568](#31568)) ([e203d1e](e203d1e))
* speed up `flt` by 1.06x and `get_system_settings` by 1.32x (backport [#28841](#28841)) ([#31560](#31560)) ([ae56612](ae56612))
* speed up recurring redis cache accesses ([#28805](#28805)) ([#31558](#31558)) ([402e484](402e484))
* speedup `frappe.call` by ~8x ([#28866](#28866)) ([#31563](#31563)) ([8a07e06](8a07e06))
* speedup `get_datetime` parsing by ~9.5x (backport [#28840](#28840)) ([#31561](#31561)) ([3c3cbd6](3c3cbd6))
* Speedup `get_doc` by another ~1.5x ([#28807](#28807)) ([#31559](#31559)) ([d75a1e0](d75a1e0))
* speedup QB field sanitization ([#28818](#28818)) ([#31529](#31529)) ([6e69c13](6e69c13))
* speedup rate limiter by ~1.2x ([#28920](#28920)) ([#31564](#31564)) ([cf26e44](cf26e44))
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 22, 2025