Discussion: What should be the correct behavior on hibernating browser tabs? · Issue #6810 · pubkey/rxdb · GitHub

Discussion: What should be the correct behavior on hibernating browser tabs? #6810

pubkey opened this issue Feb 1, 2025 · 23 comments
Open

Comments

@pubkey
Owner
pubkey commented Feb 1, 2025

Context

Browsers can "hibernate" tabs when they are not active. This ensures that these tabs do not use any memory or CPU; the JavaScript process of a hibernated tab is stopped.

RxDB uses leader election for replication so that when many tabs are open, only one tab runs the replication. This saves a lot of CPU power and bandwidth.

People often reported problems when the elected leader tab was hibernated by the browser. This seemingly stopped the replication.

To prevent this, RxDB added a hack that prevents tab hibernation if at least one replication is running: c2c7ea4
But this hack is not good, because it stops the device from saving power when the tab is not in use.

Goal

I will remove this hack in the next release. From my testing, the default behavior of RxDB with hibernating tabs is correct: the hibernated tab dies and a new leader is elected. You can test this behavior with the broadcast channel test page, where the leading tab has a crown icon in the title. You can manually send tabs to hibernation in Chrome at chrome://discards/.
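
The failover described here (a silent leader is replaced by a live tab) can be modeled in a few lines. This is only a minimal sketch under the assumption of a heartbeat-based election, which is not necessarily how RxDB or the broadcast-channel library implement it. `Tab`, `electLeader` and `HEARTBEAT_TIMEOUT_MS` are hypothetical names chosen for illustration.

```typescript
// Hypothetical model: each tab records when it last proved it was alive.
interface Tab {
    id: string;
    lastHeartbeat: number; // ms timestamp of the last sign of life
}

// Assumed timeout after which a silent tab is treated as hibernated or dead.
const HEARTBEAT_TIMEOUT_MS = 5000;

// Keeps the current leader while it is alive; otherwise picks the live tab
// with the smallest id so that every tab reaches the same decision.
function electLeader(tabs: Tab[], currentLeaderId: string | null, now: number): string | null {
    const alive = tabs.filter(t => now - t.lastHeartbeat <= HEARTBEAT_TIMEOUT_MS);
    if (currentLeaderId !== null && alive.some(t => t.id === currentLeaderId)) {
        return currentLeaderId;
    }
    let next: string | null = null;
    for (const t of alive) {
        if (next === null || t.id < next) {
            next = t.id;
        }
    }
    return next;
}

// At t=1000 both tabs are alive and "a" becomes leader.
const atStart: Tab[] = [
    { id: 'a', lastHeartbeat: 1000 },
    { id: 'b', lastHeartbeat: 1000 }
];
const leaderAtStart = electLeader(atStart, null, 1000);

// At t=10000 tab "a" has been hibernated (no heartbeat since t=1000).
const later: Tab[] = [
    { id: 'a', lastHeartbeat: 1000 },
    { id: 'b', lastHeartbeat: 9000 }
];
const leaderLater = electLeader(later, leaderAtStart, 10000);

console.log(leaderAtStart, leaderLater);
```

In this model the replication would simply follow whichever tab `electLeader` returns, which matches the behavior observed on the test page.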

Question

Do you have any way to reproduce a case where the default behavior causes problems?
What do you think the default behavior of RxDB should be?

Related discussions:

@andreuka
andreuka commented Feb 1, 2025

This issue especially affects mobile devices.

I think we need a mechanism that switches the leader to the most recently active tab, since that tab will have the longest lifecycle before it is suspended. When the user gets back to the website, the newly opened tab becomes leader again. This would make the switch unnoticeable for the end user and fix the issue.

@pubkey
Owner Author
pubkey commented Feb 1, 2025

@andreuka But on mobile, if a tab is hibernated, doesn't it also elect a new tab as leader? Because in my testing it does.

@andreuka
andreuka commented Feb 1, 2025

@andreuka But on mobile, if a tab is hibernated, doesn't it also elect a new tab as leader? Because in my testing it does.

I mostly test on iOS, and from my experience it does not hibernate the tab immediately; it first puts it into a kind of slow mode. I had a lot of complaints from users that the application does not work well on mobile devices.

I do not use replication, but I noticed this with my custom WebSocket connection, which is opened in both tabs and pushes the same events to all tabs. Only the leader tab wrote data to the database, and when the tabs were switched, the leader did not change. So I made a hack: on mobile, every tab with an active WebSocket connection writes the data to the database. That works well enough, since only one tab will run fast.

This is my function to determine whether a tab should write data to the database:

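// createLeaderElection comes from the broadcast-channel package.
// `channel` (a BroadcastChannel) and isIOS() are defined elsewhere in the app.
import { createLeaderElection } from 'broadcast-channel';

const elector = createLeaderElection(channel);

// Decides whether this tab should write incoming data to the database.
export async function isLeader() {
    // Hack: on iOS every tab writes, because the leader does not switch reliably.
    if (isIOS()) {
        return true;
    }

    // No leader elected yet: let this tab write.
    if (!(await elector.hasLeader())) return true;

    // Otherwise only the elected leader writes.
    return elector.isLeader;
}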

But that's not a perfect solution, as you can see, and I hope it can be done another way.

@paul-geisler

The "hack" is already there. I can think of two options.

  1. Keep the hack, make it optional as another plugin, and maintain it:
import { RxDBLeaderKeepAlivePlugin } from 'rxdb/plugins/leader-election';
addRxPlugin(RxDBLeaderKeepAlivePlugin);

...or RxDBProtectTheLeaderPlugin

  2. Remove the hack and add information about the browser behavior and potential risks to the guide/manual, with instructions on how to implement the hack

@pubkey
Owner Author
pubkey commented Feb 4, 2025

Hello everyone. I just released version 16.5.0. This version contains the toggleOnDocumentVisible flag on the replication states. When toggleOnDocumentVisible is set, the replication will always run in the leader AND in the currently visible tab.

I also added some tests to ensure that the replication works without problems when running from two tabs at the same time. Please test this. When it works for everyone, I will remove the previous click-event hack and also make toggleOnDocumentVisible=true the default.
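
The rule described in this comment can be written as a small predicate. This is a sketch for illustration only: `TabState` and `shouldRunReplication` are hypothetical names and not part of the RxDB API; only the flag name toggleOnDocumentVisible comes from the comment above.

```typescript
// Hypothetical description of one browser tab.
interface TabState {
    isLeader: boolean;
    isVisible: boolean; // e.g. document.visibilityState === 'visible'
}

// The leader always replicates. With the flag set, the visible tab
// replicates as well, even if it is not the leader.
function shouldRunReplication(tab: TabState, toggleOnDocumentVisible: boolean): boolean {
    if (tab.isLeader) {
        return true;
    }
    return toggleOnDocumentVisible && tab.isVisible;
}

const hiddenLeader: TabState = { isLeader: true, isVisible: false };
const visibleFollower: TabState = { isLeader: false, isVisible: true };
const hiddenFollower: TabState = { isLeader: false, isVisible: false };
const allTabs = [hiddenLeader, visibleFollower, hiddenFollower];

const withFlag = allTabs.map(t => shouldRunReplication(t, true));
const withoutFlag = allTabs.map(t => shouldRunReplication(t, false));

console.log(withFlag);    // [true, true, false]
console.log(withoutFlag); // [true, false, false]
```

With the flag set, a hidden leader and a visible follower replicate at the same time, which is the two-tab situation the added tests are meant to cover.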

pubkey added a commit that referenced this issue Apr 1, 2025
@pubkey
Owner Author
pubkey commented Apr 1, 2025

Closing this. On the next major version, toggleOnDocumentVisible will be true by default. Discussion still welcomed if you have any ideas on how to improve.

@pubkey pubkey closed this as completed Apr 1, 2025
@space7panda

Closing this. On the next major version, toggleOnDocumentVisible will be true by default. Discussion still welcomed if you have any ideas on how to improve.

@pubkey Hello, thank you for adding toggleOnDocumentVisible. It fixed the "Socket closed" error for us, but unfortunately we ran into some new issues.

Here is a list of devices we tested:

| Device                   | RAM Type | Storage Type |
|--------------------------|----------|--------------|
| Xiaomi 14                | LPDDR5X  | UFS 4.0      |
| OnePlus 8                | LPDDR4X  | UFS 3.0      |
| Xiaomi Redmi Note 10 Pro | LPDDR4X  | UFS 2.2      |
| Google Pixel 4a          | LPDDR4X  | UFS 2.1      |
| Xiaomi A1 (Mi A1)        | LPDDR3   | eMMC 5.1     |
| Tecno KI5k SPARK 10C     | LPDDR4X  | eMMC 5.1     |
| Redmi 13 (Redmi 14C)     | LPDDR4X  | eMMC 5.1     |
| Tecno Spark 10           | LPDDR4X  | eMMC 5.1     |
| Samsung Galaxy A11       | LPDDR3   | eMMC 5.1     |

For the last few weeks, we’ve had a hard time with our bulkInsert and incrementalModify operations. The delay for these operations has skyrocketed from 200ms to 6000ms.

The first thing we discovered is that all devices using eMMC storage types are facing this issue. eMMC is known for having low random write speeds, which are crucial for SQLite. All phones that used UFS were not affected. After a long debugging session, we figured out that toggleOnDocumentVisible has been causing all of this. We haven’t checked the source code of RxDB yet to understand how it can affect performance so badly, but we’re certain that it is responsible for the additional load on devices, which is very noticeable on low-budget phones.

@pubkey
Owner Author
pubkey commented Apr 7, 2025

@space7panda I think it is very unlikely that this comes from toggleOnDocumentVisible, and based on the given information I do not think I can reproduce the problem.

Maybe you have a custom conflict handler which causes infinite write loops?

@space7panda
space7panda commented Apr 7, 2025

@space7panda I think it is very unlikely that this comes from toggleOnDocumentVisible, and based on the given information I do not think I can reproduce the problem.

Maybe you have a custom conflict handler which causes infinite write loops?

Thanks, we can double-check that, but I'm not sure how the conflict handler is related to our testing, where we had two separate APK files with:
toggleOnDocumentVisible: false
toggleOnDocumentVisible: true

As additional testing, we also downgraded RxDB to 15.39.0, and everything worked fine there because toggleOnDocumentVisible had not been implemented yet.

@pubkey
Owner Author
pubkey commented Apr 7, 2025

You could use the logger plugin to check what is going on in your storage. Likely some other writes block your bulk inserts, so they are slow because they wait to open a transaction.

@space7panda

You could use the logger plugin to check what is going on in your storage. Likely some other writes block your bulk inserts, so they are slow because they wait to open a transaction.

Sure, we will check that and report back with the results.

@KingSora
Contributor
KingSora commented Apr 18, 2025

@pubkey I can confirm that toggleOnDocumentVisible causes problems with bulk writes. I recently discovered this because RxDB did not resolve the db.addCollections promise at all on iOS Safari. It hangs precisely here.

At first I thought this issue was related to something else, but after a very long debugging session I concluded it must be the toggleOnDocumentVisible option: I rolled back to v15 (bug absent), went back to 16.0.0 (bug absent), and went all the way to 16.11.0 without toggleOnDocumentVisible (bug absent). After enabling toggleOnDocumentVisible, the issue happens consistently on BrowserStack Safari iOS.

In our case the app is stuck in a never-ending loading cycle and stays stuck until Safari is closed (or all tabs of the app are closed).

My suspicion is that this is related to the replicationState and that start and pause are called and not awaited, so the execution order is:

// important is that none of the functions are actually awaited even though they are async
replicationState.start();
replicationState.pause();
replicationState.start();
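
To make the suspected ordering problem concrete, here is a minimal sketch that uses a fake replication state instead of RxDB. `FakeReplication` and `SerialQueue` are hypothetical helpers written for this illustration; the delays are arbitrary. It shows that fire-and-forget calls overlap, while a simple promise chain forces each call to finish before the next one begins.

```typescript
// Hypothetical stand-in for a replication state with async start() and pause().
class FakeReplication {
    log: string[] = [];

    async start(): Promise<void> {
        this.log.push('start:begin');
        await new Promise<void>(resolve => setTimeout(resolve, 20));
        this.log.push('start:end');
    }

    async pause(): Promise<void> {
        this.log.push('pause:begin');
        await new Promise<void>(resolve => setTimeout(resolve, 5));
        this.log.push('pause:end');
    }
}

// Runs tasks strictly one after another, even if callers never await.
class SerialQueue {
    private tail: Promise<void> = Promise.resolve();

    run(task: () => Promise<void>): Promise<void> {
        const next = this.tail.then(task);
        // Keep the chain usable even if one task rejects.
        this.tail = next.catch(() => {});
        return next;
    }
}

async function demo(): Promise<{ unordered: string[]; ordered: string[] }> {
    // Not awaited individually: the three operations overlap.
    const a = new FakeReplication();
    await Promise.all([a.start(), a.pause(), a.start()]);

    // Queued: each operation completes before the next one starts.
    const b = new FakeReplication();
    const queue = new SerialQueue();
    await Promise.all([
        queue.run(() => b.start()),
        queue.run(() => b.pause()),
        queue.run(() => b.start())
    ]);

    return { unordered: a.log, ordered: b.log };
}

demo().then(result => console.log(result));
```

In the unordered log, pause begins before the first start has ended, which is the situation suspected above. Whether this overlap is the actual cause of the hang is not confirmed by this sketch.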

@KingSora
Contributor
KingSora commented Apr 19, 2025

@pubkey as a followup I've debugged the whole thing with the logger plugin and those are the operations which aren't finishing:

bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(1) instance:yneqilqpvj_opId:nwevzxjq,bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(1) instance:yneqilqpvj_opId:nwevzxjq

bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(6) instance:yneqilqpvj_opId:ntnxzpsd,bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(6) instance:yneqilqpvj_opId:ntnxzpsd

My bug is also NOT happening with the localstorage, dexie and memory storage. - We are using the indexeddb storage atm.

The dexie storage gives some errors which the indexeddb doesn't give, might be related:

RxDB.dexie.dbname.rx-replication-meta-0598bd57c35076cd146223db42020d0e71cedf64117e5ca054c60674c7af053d.findDocumentsById(1) instance:yyvdvdohtj_opId:yzmsworl: ERROR: DatabaseClosedError

RxDB.dexie.dbname.rx-replication-meta-e547ce34efcc417fc7e2b633d5b2c6432a24131a21395b0c961e73b3833688ac.findDocumentsById(1) instance:yyvdvdohtj_opId:ejxlvkom: ERROR: DatabaseClosedError

RxDB.dexie.dbname.rx-replication-meta-49a45dd831c7fdd1c2960ba31ce678837216f5da7d7be055a2efb24840097009.findDocumentsById(1) instance:yyvdvdohtj_opId:txrjajph: ERROR: DatabaseClosedError

I'm happy to give further data / assist you in fixing this issue.

@pubkey
Owner Author
pubkey commented Apr 23, 2025

@pubkey I can confirm that toggleOnDocumentVisible causes problems with bulk writes. I recently discovered this because RxDB did not resolve the db.addCollections promise at all on iOS Safari. It hangs precisely here.

I do not see any reason why a call to storageInstance.bulkWrite() can hang up, it should either resolve at some point or throw an error.

Since this is reproducible with the dexie storage, it should be possible to make a PR with a test case?

@KingSora
Contributor
KingSora commented Apr 23, 2025

Since this is reproducible with the dexie storage, it should be possible to make a PR with a test case?

I couldn't reproduce the hanging behavior in Safari with the localstorage, dexie and memory storages. The error I posted above was from the dexie storage, but the hanging didn't occur there. The indexeddb storage didn't give any errors, but it hung the browser.

The PR #7095 fixed the hanging issue for us, but we still think that if one calls replicationState.start() at an inconvenient point in time, the issue would return.

@KingSora
Contributor
KingSora commented Apr 23, 2025

@pubkey We implemented and deployed the fix from PR #7095, and we are now getting Sentry error reports from the replicationState.pause function:

ensureNotFalsy() is falsy:

So before releasing a new version of rxdb it might be beneficial to wrap the whole thing in a try catch block?

@pubkey
Owner Author
pubkey commented Apr 23, 2025

No, a try-catch is not the correct solution. We should find out where this comes from. ensureNotFalsy should never throw at runtime; it is only used to satisfy TypeScript.

@KingSora
Contributor
KingSora commented Apr 23, 2025

@pubkey it's thrown here:

ensureNotFalsy(this.internalReplicationState).events.paused.next(true);
(this.internalReplicationState is undefined)

So maybe there should be additional checks before replicationState.pause is called, or the pause function should simply do nothing in that case.

@pubkey
Owner Author
pubkey commented Apr 23, 2025

If .pause is called while .start is still starting up, the .pause call should await the startup-procedure. This likely would also fix your DatabaseClosedError from before.
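
The idea that pause should wait for a startup that is still in flight can be sketched as follows. This is not the RxDB implementation: `SketchReplicationState`, `internalState`, `boot` and `isPaused` are hypothetical names, and the startup work is simulated with a timer.

```typescript
// Hypothetical replication state whose internal state only exists after startup.
class SketchReplicationState {
    private internalState: { paused: boolean } | undefined;
    private startPromise: Promise<void> | undefined;

    start(): Promise<void> {
        if (this.internalState) {
            // Already started: just resume.
            this.internalState.paused = false;
            return Promise.resolve();
        }
        if (!this.startPromise) {
            this.startPromise = this.boot();
        }
        return this.startPromise;
    }

    // Simulated asynchronous startup procedure.
    private async boot(): Promise<void> {
        await new Promise<void>(resolve => setTimeout(resolve, 10));
        this.internalState = { paused: false };
    }

    async pause(): Promise<void> {
        // Wait for a pending startup instead of touching undefined state.
        if (this.startPromise) {
            await this.startPromise;
        }
        if (!this.internalState) {
            return; // never started: nothing to pause
        }
        this.internalState.paused = true;
    }

    isPaused(): boolean | undefined {
        return this.internalState?.paused;
    }
}

async function demoPause(): Promise<boolean | undefined> {
    const state = new SketchReplicationState();
    const starting = state.start(); // deliberately not awaited
    const pausing = state.pause();  // called while startup is still running
    await Promise.all([starting, pausing]);
    return state.isPaused();
}

demoPause().then(paused => console.log('paused:', paused));
```

Because pause waits for the startup promise, it never reads the internal state before it exists, so the undefined access reported above would not occur in this model.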

@KingSora
Contributor
KingSora commented Apr 23, 2025

If .pause is called while .start is still starting up, the .pause call should await the startup-procedure. This likely would also fix your DatabaseClosedError from before.

@pubkey Well, before the fix in PR #7095, replicationState.pause() was never called in the visibilitychange event, so the DatabaseClosedError (at least at that time) did not come from replicationState.pause() in that event.

I guess it's a good idea to implement that, although I don't have the capacity to make a PR for now.

@space7panda
space7panda commented Apr 23, 2025

@KingSora @pubkey Our issue is also related to start() and pause(), but in a slightly different way.

We are using Capacitor with RxDB, and start() and pause() are triggered for us in two scenarios:

  • when we launch the camera
  • when the app gets minimized and maximized

Basically, users can hide and open the app 10 times in a row, which causes a 10x combo of start() and pause() calls. That is why we had such a big lag on budget phones.

Then we tried to simulate that with the following pseudocode:

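import { App } from '@capacitor/app';

// Pseudocode: start() and pause() are called back to back, neither is awaited.
App.addListener('appStateChange', ({ isActive }) => {
    // ...
    replicationState.start();
    // ...
    replicationState.pause();
    // ...
});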

And we got the same results, with a bit less lag.
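
One conceivable way to avoid a start() or pause() call for every quick hide/show cycle is to coalesce the state changes and apply only the last one. The following is a sketch under that assumption; `VisibilityCoalescer` is a hypothetical helper, not part of RxDB or Capacitor, and it has not been tested against the lag reported here.

```typescript
type Desired = 'running' | 'paused';

// Remembers only the latest requested state and applies it on flush().
class VisibilityCoalescer {
    private applied: Desired;
    private pending: Desired | null = null;
    // Records what would have been applied; a real app would call
    // start() or pause() on its replication state at this point.
    calls: Desired[] = [];

    constructor(initial: Desired) {
        this.applied = initial;
    }

    // Cheap: can be called on every app state change.
    request(isActive: boolean): void {
        this.pending = isActive ? 'running' : 'paused';
    }

    // Applies the latest request once, and only if it changes anything.
    flush(): void {
        if (this.pending !== null && this.pending !== this.applied) {
            this.applied = this.pending;
            this.calls.push(this.applied);
        }
        this.pending = null;
    }
}

const coalescer = new VisibilityCoalescer('running');

// The user hides and shows the app 10 times in a row.
for (let i = 0; i < 10; i++) {
    coalescer.request(false);
    coalescer.request(true);
}
coalescer.flush();
const callsAfterFlapping = coalescer.calls.length; // nothing to do: state ends where it began

// The user finally leaves the app.
coalescer.request(false);
coalescer.flush();

console.log(callsAfterFlapping, coalescer.calls);
```

In an app, flush() could be triggered from a short timer after the last appStateChange event, so twenty rapid events would result in at most one start() or pause() call.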

@pubkey
Owner Author
pubkey commented May 10, 2025

@space7panda I could not reproduce your problem. Can you make a PR with a test case that simulates this behavior and shows that it causes errors?

5 participants
@andreuka @pubkey @KingSora @space7panda @paul-geisler