8000 metamorphic: remote storage fixes for crossversion by RaduBerinde · Pull Request #4760 · cockroachdb/pebble · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

metamorphic: remote storage fixes for crossversion #4760

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

RaduBerinde
Copy link
Member

These commits fix various aspects around external and shared storage in the context of the crossversion test, where we start with an initial state.

Informs #4732

metamorphic: use FS-based remote storage

The metamorphic test uses in-mem remote storage, which doesn't work
when starting with an initial state (as in the crossversion tests).

Note that there is code to save the remote storage contents to disk
when the store is in-memory, but they are not read back when used as
initial state.

This commit changes to using FS-based remote storage in shared
and external subdirs inside the data dir. Note that the store itself
can still be in-memory; the data gets saved automatically with the
store.

We improve the FS-based Storage implementation to sync data and list
objects. We can now allow simulating crashes in the metamorphic test
when shared storage is enabled.

metamorphic: support CreateOnShared on existing store

If the metamorphic test has CreateOnShared set to something other
than "none" and we start with an initial store which did not have
remote storage configured, there can be background errors right after
opening the store, before we get a chance to call SetCreatorID().
The metamorphic test fails on these background errors.

To fix this, we always open with CreateOnSharedNone first, and if
necessary reopen after setting the creator ID.

metamorphic: add random number to external object names

External object names can collide with existing objects when starting
with an initial state. This change adds a random unique number to the
filenames.

The metamorphic test uses in-mem remote storage, which doesn't work
when starting with an initial state (as in the crossversion tests).

Note that there is code to save the remote storage contents to disk
when the store is in-memory, but they are not read back when used as
initial state.

This commit changes to using FS-based remote storage in `shared`
and `external` subdirs inside the data dir. Note that the store itself
can still be in-memory; the data gets saved automatically with the
store.

We improve the FS-based Storage implementation to sync data and list
objects. We can now allow simulating crashes in the metamorphic test
when shared storage is enabled.
@RaduBerinde RaduBerinde requested a review from a team as a code owner May 22, 2025 18:35
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator
@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r1, 1 of 1 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @sumeerbhola)


-- commits line 26 at r2:
I didn't follow this. Is this referring to background errors like from compactions which are attempting to write to shared storage before the creator ID has been set? Would it be simpler if we have CreateOnShared != none to first open just the object provider and assign a creator ID to the object provider? Bc in the context of the metamorphic test we don't have the bootstrapping problem to allocate a creator ID


metamorphic/ops.go line 1887 at r3 (raw file):

	//
	// Note that the number is not based on the seed, in case we run using the
	// same seed that was used in a previous run with the same store.

if we made the name based on the generation time seed, wouldn't the current state of the prng not be the same because the previous run's different operations? Or is this referring to previous runs of the same ops? In which case can't we reset the test state to isolate identical re-runs?


metamorphic/test.go line 198 at r2 (raw file):

		})
		if err != nil {
			fmt.Printf("Open(): %v\n", err)

nit: debug detritus?


metamorphic/test.go line 207 at r2 (raw file):

			})
			if err != nil {
				fmt.Printf("SetCreatorID(): %v\n", err)

nit: debug detritus?

Copy link
Member Author
@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @jbowens and @sumeerbhola)


-- commits line 26 at r2:

Previously, jbowens (Jackson Owens) wrote…

I didn't follow this. Is this referring to background errors like from compactions which are attempting to write to shared storage before the creator ID has been set? Would it be simpler if we have CreateOnShared != none to first open just the object provider and assign a creator ID to the object provider? Bc in the context of the metamorphic test we don't have the bootstrapping problem to allocate a creator ID

Right, flushes or compactions fail. That would work.. but I figure better to test the external facing APIs. I'll try and see how it looks.


metamorphic/ops.go line 1887 at r3 (raw file):

Previously, jbowens (Jackson Owens) wrote…

if we made the name based on the generation time seed, wouldn't the current state of the prng not be the same because the previous run's different operations? Or is this referring to previous runs of the same ops? In which case can't we reset the test state to isolate identical re-runs?

I was thinking of the case where we manually set the seed and first run on e.g. 25.2 and then run again on master using initial-state.

Perhaps a cleaner alternative is to check the list of existing files.


metamorphic/test.go line 198 at r2 (raw file):

Previously, jbowens (Jackson Owens) wrote…

nit: debug detritus?

I think it's helpful to keep

@RaduBerinde RaduBerinde force-pushed the metamorphic-crossversion-remote-storage branch 2 times, most recently from 08e312d to c817ef8 Compare May 24, 2025 02:18
If the metamorphic test has `CreateOnShared` set to something other
than "none" and we start with an initial store which did not have
remote storage configured, there can be background errors right after
opening the store, before we get a chance to call `SetCreatorID()`.
The metamorphic test fails on these background errors.

To fix this, we always open with `CreateOnSharedNone` first, and if
necessary reopen after setting the creator ID.
External object names can collide with existing objects when starting
with an initial state. This change adds a random unique number to the
filenames.
@RaduBerinde RaduBerinde force-pushed the metamorphic-crossversion-remote-storage branch from c817ef8 to 5ca45a8 Compare May 24, 2025 02:20
@RaduBerinde RaduBerinde merged commit bd64861 into cockroachdb:master May 27, 2025
6 checks passed
@RaduBerinde RaduBerinde deleted the metamorphic-crossversion-remote-storage branch May 27, 2025 14:34
@RaduBerinde
Copy link
Member Author

Upon backporting, it looks that the first commit breaks the TestMetaTwoinstance variant, which makes sense because the "shared" storage is no longer shared. BUT the fact that this test doesn't fail on master means something is broken and it's not actually testing much :/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0