8000 Fix ULID bugs when generating randomness for multiple requests coming in at the exact same timestamp by ByronBecker · Pull Request #2 · aviate-labs/ulid.mo · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Fix ULID bugs when generating randomness for multiple requests coming in at the exact same timestamp #2

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 5, 2022

Conversation

ByronBecker
Copy link
Contributor
@ByronBecker ByronBecker commented May 3, 2022

Tagging the maintainer @di-wu and ICDevs @skilesare for potential impacted library dependencies

Initial Issue

Making at least 2-3 requests that call the Source new() method at roughly the same time hit an overflow error. This was due to a bug in the following conditional statement.

public func add(n80 : Nat80, n : Nat64) : Result.Result<Nat80, Text> {
  var l = n80.low +% n;
  var h = n80.high;
  // l is always greater than n80.low unless the random number being added causes  overflow
  // so we are always incrementing h
  if (n80.low < l) h +%= 1;  
  // here h will pretty much always be greater than n80.high after a single execution, so an "overflow error" will be thrown
  if (n80.high < h) return #err("overflow");
  #ok({ low = l; high = h; });
};

Source of overflow error

Subsequent Issue

Upon fixing this bug, I encountered an uncaught index out of bounds error Where index out of bounds error was throwing

This was due to the fact that the Array.append() method was removed in a subsequent commit and swapped for tabulate, and the size of the array (6) was less than the 10 elements necessary for the random portion of the ulid generation. Source of index out of bounds error.

Fixes Made

  • Correct conditional in Nat80.add() to only give an overflow error when incrementing the h value by 1 causes it to overflow
  • Correct Nat80.toArray() to return the full concatenated array from lower and upper Nat80 thresholds, as Source.new() expects a full 16 elements. Note that this is triggered by #ok(Nat80.toArray(e)) in the monotomicRead() method

…e same timestamp

* Correct conditional in Nat80.add() to only give an overflow error when incrementing the h value by 1 causes it to overflow
* Correct Nat80.toArray() to return the full concatenated array from lower and upper Nat80 thresholds, as Source.new() expects a full 16 elements. Note that this is triggered by #ok(Nat80.toArray(e)) in the monotomicRead() method
@skilesare
Copy link

Great find! @di-wu Can you validate ane merge?

@q-uint q-uint self-requested a review May 3, 2022 15:08
Copy link
Contributor
@q-uint q-uint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. 6 vs 10
    Not sure how this got passed the tests I did back in the day.
    80 = 10 * 8 != 6 * 8...

  2. I'll go over what should happen here...

Since the lower part number is the same bitlength as the given number to add. We know that if we add them together we get either, a number bigger than before, or smaller.
e.g. (nat8, max: 255)

var n = 124;
n +%= 121; // OK
// n == 255
n +%= 1; // Overflow
// n == 0

Which means if the new lower part is smaller than the original, we need to add 1 to the higher number.

Then in turn we can apply the same logic to the higher part of the uint80 number.

I used to following tests to check the PR:

$ $(vessel bin)/moc $(vessel sources) -r test/Nat80.test.mo

import Nat80 "../src/Nat80";

let MAX_NAT16 : Nat16 = 65_535;
let MAX_NAT64 : Nat64 = 18_446_744_073_709_551_615;

var n = Nat80.ZERO;

func add(x : Nat64) = switch (Nat80.add(n, x)) {
    case (#ok(m)) n := m;
    case (#err(_)) assert(false);
};

assert(Nat80.toArray(n) == [
    0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00,
]);
assert(Nat80.isZero(n));

add(MAX_NAT64);
assert(Nat80.toArray(n) == [
    0x00, 0x00, 0xFF, 0xFF, 0xFF,
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
]);

add(1);
assert(Nat80.toArray(n) == [
    0x00, 0x01, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00,
]);

add(MAX_NAT64);
assert(Nat80.toArray(n) == [
    0x00, 0x01, 0xFF, 0xFF, 0xFF,
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
]);

n := { low = MAX_NAT64; high = MAX_NAT16 };
assert(Nat80.toArray(n) == [
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
]);
assert(not Nat80.isZero(n));

n := Nat80.addWrap(n, 1);
assert(Nat80.isZero(n));

n := { low = MAX_NAT64; high = MAX_NAT16 };
assert(Nat80.add(n, 1) == #err("overflow"));
public func add(n80 : Nat80, n : Nat64) : Result.Result<Nat80, Text> {
	let m = addWrap(n80, n);
	if (m.high < n80.high) return #err("overflow");
	#ok(m);
};

public func addWrap(n80 : Nat80, n : Nat64) : Nat80 {
	var l = n80.low +% n;
	var h = n80.high;
	if (l < n80.low) h +%= 1;
	{ low = l; high = h; };
};

src/Nat80.mo Outdated
Comment on lines 36 to 37
if (n80.low < l) h +%= 1;
if (n80.high < h) return #err("overflow");
if (n80.high > h) return #err("overflow");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like both brackets need to be inverted.

if (l < n80.low) h +%= 1;
if (h < n80.high) return #err("overflow");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point, I will test that out just to make sure and then push up a fix

@ByronBecker
Copy link
Contributor Author
ByronBecker commented May 4, 2022

@di-wu
So before fixing the lower part overflow, I was getting ULIDs that were much "tighter" together (for the same timestamp)

I've grouped them together by those with the exact same mtimestamp to show you what I mean

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DPAHSSNQQ43Z73SA1FH71F
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DPAHSSNQQM3Z73SA1FH71G
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DPAHSSNQR43Z73SA1FH71H

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DPYYFEYMWGK98BWV62JEDJ
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DPYYFEYMX0K98BWV62JEDK
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DPYYFEYMXGK98BWV62JEDM
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DPYYFEYMY0K98BWV62JEDN

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DQKDJ1YD2E3C9JTM98SSSA
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DQKDJ1YD2Y3C9JTM98SSSB
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DQKDJ1YD3E3C9JTM98SSSC
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DQKDJ1YD3Y3C9JTM98SSSD

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DZ6RDGDE6KB2XB1RZJ1P8J
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DZ6RDGDE73B2XB1RZJ1P8K
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DZ6RDGDE7KB2XB1RZJ1P8M
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DZ6RDGDE83B2XB1RZJ1P8N
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] createUserDeposit generated ULID: 6Y53DZ6RDGDE8KB2XB1RZJ1P8P

Now that the lower part overflow is fixed, I'm seeing much more randomness in the random portion of the ULID

Grouping by the timestamp once more

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y5907VF7W34TXG4NSZ0S7EDCK
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y5907VF7W34TXG4NSZ182DNG7

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y5908FZ5Z6K996154J4ACEGFV
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y5908FZ5Z6K996154J7QC47E7
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y5908FZ5Z6K996154J8AFDTDB

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596DJDC6K3NGSJ17411FQ6B3
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596DJDC6K3NGSJ1744HT878N

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596E6XKXF5HSPSWE7VPGZEMM
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596E6XKXF5HSPSWE7XTYJ5Q3
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596E6XKXF5HSPSWE814407SX
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596E6XKXF5HSPSWE821GEY9F

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596EVAX978AEWV0JSN220YYG
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596EVAX978AEWV0JSRY6TD32
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596EVAX978AEWV0JSWCTRFE6
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596EVAX978AEWV0JSZV89397
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596EVAX978AEWV0JT39B3SG2

[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596FFMHXBF0AVRDR5TT6T3AG
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596FFMHXBF0AVRDR5W1MWY4M
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596FFMHXBF0AVRDR5YW1MFQ7
[Canister qjdve-lqaaa-aaaaa-aaaeq-cai] ULID transactionId = 6Y596FFMHXBF0AVRDR60TQ7EFZ

I like the monotomic randomness introduced, but I'm wondering if this would lead to reaching the 2^80 conflict overflow sooner?

Edit:

Ah, I think I understand what is happening now. The random() function being called in Source.mo is now actually having an effect, and so the random portion of the ULID is being incremented at a much quicker rate than just by +1, meaning we're probably looking at something like 2^(70-something) potential collisions before an overflow.

I'm fine with this, but do we need the randomness on conflicts? What are some of the benefits/drawbacks? If you're set on keeping the random() for incrementing I won't push back.

@q-uint
Copy link
Contributor
q-uint commented May 4, 2022

The reason I used random is so the next ULID can not be guessed by an end user.
This is not that important in databases tho...

@ByronBecker
Copy link
Contributor Author
ByronBecker commented May 4, 2022

Ok, pushed up the fix for when the lower part of the ulid overflows. Ready for squash merge

@di-wu

@q-uint q-uint merged commit 4e1c910 into aviate-labs:main May 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0