-
Notifications
You must be signed in to change notification settings - Fork 178
Add BAAS Admin API command to start/revert PBS->FLX server migration #6366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cb85651
to
cafa320
Compare
cafa320
to
f041764
Compare
The failure on Jenkins is due to receiving an empty changeset for the bootstrap data after the migration to FLX has occurred. I posted a note about this on the PBS FLX migration channel. |
CHANGELOG.md
Outdated
@@ -63,6 +63,7 @@ | |||
* Add CAPI test for Binding Callback Thread Observer. ([PR #6156](https://github.com/realm/realm-core/pull/6156)) | |||
* Implement MigrationStore to support migration from PBS to FLX ([PR #6324](https://github.com/realm/realm-core/pull/6324)) | |||
* Removed overloads of `Session::bind()` that allow binding a sync session to server other than the one configured in `Session::Config` ([PR #6358](https://github.com/realm/realm-core/pull/6358)). | |||
* Add admin api and test for performing the PBS->FLX migration and roll back on the server. (PR [#6366](https://github.com/realm/realm-core/pull/6366)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rebase this because master was released
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - I did a rebase and it still put it in the wrong place - fixed
bool complete = false; | ||
}; | ||
|
||
MigrationStatus get_migration_status(const std::string& app_id) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if this is called without a migration in place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a migration is not currently active, it will return a MigrationStatus
with an empty status message and the isMigrated
value will reflect the current migration state.
@@ -119,6 +120,17 @@ class AdminAPISession { | |||
bool is_sync_terminated(const std::string& app_id) const; | |||
bool is_initial_sync_complete(const std::string& app_id) const; | |||
|
|||
struct MigrationStatus { | |||
std::string statusMessage; | |||
std::string errorMessage; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should this be optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - made it optional
@@ -0,0 +1,200 @@ | |||
#include <catch2/catch_all.hpp> | |||
#include <chrono> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add this after the realm includes
#include <catch2/catch_all.hpp> | ||
#include <chrono> | ||
|
||
#include "flx_sync_harness.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would use full paths and angled brackets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the includes for the files in the src/ directory to use angled brackets. For the include files in the test/ directory, I updated them to use the appropriate path for the include file (similar to the other files in test/object-store/)
status = app_session.admin_api.get_migration_status(app_session.server_app_id); | ||
if (logger && last_status != status.statusMessage) { | ||
last_status = status.statusMessage; | ||
logger->debug("PBS->FLX Migration status: %1", last_status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could also be waiting for the status of a rollback ("FLX->PBS....")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the PBS->FLX
string to be dynamic based on the operation being performed (i.e. PBS->FLX
if migrate_to_flx
is true, otherwise FLX->PBS
)
REQUIRE(switch_to_flx == status.isMigrated); | ||
if (logger) { | ||
if (!status.errorMessage.empty()) | ||
logger->debug("PBS->FLX Migration returned error: %1", status.errorMessage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
REQUIRE(status.errorMessage.empty()); | ||
} | ||
|
||
// Populates a FLXSyncTestHarness with the g_large_array_schema with objects that are large enough that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is not accurate
trigger_server_migration(session.app_session(), false, logger_ptr); | ||
|
||
{ | ||
SyncTestFile pbs_config(session.app(), partition1, server_app_config.schema); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should check partition2
too.
trigger_server_migration(session.app_session(), true, logger_ptr); | ||
|
||
{ | ||
SyncTestFile flx_config(session.app()->current_user(), server_app_config.schema, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand this test. Shouldn't the user connect as PBS to trigger the migration on the client?
As it stands, it's a native flx user which connects and gets the pbs data (still valuable). In this case, it should create "normal" subscriptions then (a subscription on Dog class without the partition string).
Edit: I think I finally understand the test: it's how the client reconnects after the migration is triggered on the client (as flx and creates subscriptions using the partition key(s)). I still think we should have a test with a native flx user which didn't go through the migration (my second sentence above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is less about using the client migration feature and more about validating being able to trigger a migration on the server and connecting to it as PBS (in not migrated state) or FLX (in migrated state). The FLX subscriptions I created are representative of the subscriptions that will be automatically generated.
We will need to add a separate test to verify the client migration, once it is complete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it
There is another failure, a build warning: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Kiro has a BAAS update https://github.com/10gen/baas/pull/8980 that will address the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some nits/questions.
|
||
using namespace realm; | ||
|
||
static void trigger_server_migration(const AppSession& app_session, bool switch_to_flx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we never call this with a non-null logger parameter, let's just remove the default nullptr value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My goal was to make it optional in case future tests didn't use a logger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We def shouldn't do this in this PR, but I kinda think we should make it so there's always a logger available in tests. Like when you create the TestAppSession
or something you get a logger that's properly configured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, similar to the logger in TestContext for the core-and sync-tests. I'll create a task and do that separately.
} | ||
|
||
// Add a set of count number of Dog objects to the realm | ||
static std::vector<ObjectId> fill_test_data(SyncTestFile& config, std::string partition, int start = 1, int count = 5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we never call these without specifying the values for start/count, let's remove their default values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
// Add some objects with the provided partition value | ||
for (int i = 0; i < count; i++, ++start) { | ||
auto id = ObjectId::gen(); | ||
auto obj = Object::create(c, realm, "Dog", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nit. Feel free to ignore me here 😄, but i kinda don't like re-using the "Dog" classes from the app integration tests. Can we name this a non-silly name like Object and have non-fake properties like "breed"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I was using the default schema that was included in default_app_config()
try { | ||
return progress_endpoint.get_json(); | ||
} | ||
catch (const std::exception& e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nit, but shouldn't we just let this exception get thrown so we get a stacktrace in the test logs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to catch the case where the http status was not between 200-299, but it is actually a terminate, which can't be caught, so I'll put it back.
}(); | ||
auto errorMessage = progress_result["errorMessage"]; | ||
if (errorMessage.is_string() && !errorMessage.get<std::string>().empty()) { | ||
status.errorMessage.emplace(errorMessage.get<std::string>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we not just throw here and on 735?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can change these to throw
|
||
using namespace realm; | ||
|
||
static void trigger_server_migration(const AppSession& app_session, bool switch_to_flx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We def shouldn't do this in this PR, but I kinda think we should make it so there's always a logger available in tests. Like when you create the TestAppSession
or something you get a logger that's properly configured.
What, How & Why?
Add a functions to the admin api test interface for:
Add a test to verify the operation and demonstrate/test the operation of the new admin api functions.
Fixes #6343
☑️ ToDos
[ ] C-API, if public C++ API changed.