-
Notifications
You must be signed in to change notification settings - Fork 932
Add CLUSTER FLUSHSLOT command #1384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #1384 +/- ##
============================================
- Coverage 71.21% 71.11% -0.11%
============================================
Files 122 122
Lines 66049 66067 +18
============================================
- Hits 47038 46984 -54
- Misses 19011 19083 +72
🚀 New features to boost your workflow:
|
Please help review this PR. Thank you. @hwware @madolson @PingXie @zuiderkwast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall. Thanks @wuranxx!
c66cb6a
to
ab74efb
Compare
Before I rebased to the latest unstable (old commit is 32f7541) , the Tcl tests and manual command tests ran fine locally. After I rebased to the latest unstable 6df376d, the command failed when executed locally. It is possible that changes in the code during this period caused the inconsistent behavior. Until I confirm the cause, I have marked this PR as WIP (Work In Progress). |
ab74efb
to
4db74af
Compare
Fixed. For details, see #1399. |
@enjoy-binbin @murphyjacob4 Do you want this command for atomic slot migration, for example when a slot migration is cancelled, the importing node can send CLUSTER FLUSHSLOT to its replicas to clean up the slot? |
I think we will have a protocol for cancelling the slot migration, and this will trigger the drop for all slot ranges being migrated locally. But we should be able to use the async |
@murphyjacob4 The replicas are not involved in the slot migration protocol IIUC. So when a primary cleans up a cancelled slot import, do you just replicate a lot of DEL commands to the replicas? Replicating a single FLUSHSLOT may save some work and bandwidth... |
Yeah that's a good idea. Currently, this is what |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @wuranxx and sorry for the delay.
It looks strait-forward, but there are some details we should think about a bit more. I'm not actually sure what's best to do in all cases, but see my comments below.
src/cluster_legacy.c
Outdated
if (lazy) { | ||
dbAsyncDelete(&server.db[0], key); | ||
} else { | ||
dbSyncDelete(&server.db[0], key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, the deleted keys are replicated just by multiple DEL commands. For comparison, FLUSHDB is replicated as a single FLUSHDB command, which is much less traffic than many DEL commands. If we want to replicate FLUSHSLOT as a single FLUSHSLOT command, then we should probably check that the replicas supports this though. We already have the replica's version stored in c->repl_data->replica_version
(REPLCONF VERSION is sent by the replica since 8.0) so we could check that it's >= 8.1 for all connected replicas. It's also fine to skip this. We can do it later as a separate optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think we don't have to worry about old replicas. This is a new feature and users should use new features only after all nodes have been upgraded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For atomic slot migration - we will need send some FLUSHSLOT commands implicitly:
- If an import doesn't succeed (is failed or cancelled) we need to flush the contents of the staged slots to replicas
- If there is an import to a target primary and FLUSHDB is run on that target, the target will need to break this out into FLUSHSLOT commands to avoid the importing slots being flushed
- If there is an export happening from some source primary and FLUSHDB is run, we need to break this out into FLUSHSLOT commands (so that the target only flushes the impacted importing slots)
Plus - in the case that a dirty slot is found - we currently delete it by sending individual key deletions. It would make sense to create the FLUSHSLOT command implicitly to improve replication traffic.
Due to the cases of implicit FLUSHSLOT propagation - I would suggest that the "delKeysInSlot" helper method include logic for deciding about if we should propagate as individual deletions or as FLUSHSLOT based on the capabilities of the replicas. I'm sure for many customers, they may be running in a cross-version deployment for some amount of time while they do rolling upgrades
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we will need send some FLUSHSLOT commands implicitly
Yes, so do you need any changes here? You can call this function and then you explicitly replicate a FLUSHSLOT command. Or do you want this function to do it?
I think you may want to skip keyspace events and skip replication using DEL commands. With the is_cmd
argument you can't disable both. See other comment.
include logic for deciding about if we should propagate as individual deletions or as FLUSHSLOT based on the capabilities of the replicas. I'm sure for many customers, they may be running in a cross-version deployment for some amount of time while they do rolling upgrades
I agree that we should support cross-version deployments, but in this case I'm not worried and I don't think we need to decide based on replica capabilities.
We require that all nodes are upgraded before you start using new features. For example, an old replica wouldn't understand a new datatype, or information like hash field expiration, etc. The FLUSHSLOT command and the atomic slot migration are both new features. If a node has ASM, it also has FLUSHSLOT. If you have an 8.1 replica in your cluster, you shouldn't start using these features just yet. Makes sense?
4db74af
to
0d751a2
Compare
I add command scene to delKeysInSlot. In command sence, delKeysInSlot will not propagate del command, but will notify del event. flushslot now just propagate please help review this PR, thank you! @zuiderkwast @PingXie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM. We should also add official documentation for this new command.
R 0 CLUSTER FLUSHSLOT $key_slot ASYNC | ||
assert_equal [R 0 CLUSTER COUNTKEYSINSLOT $key_slot] 0 | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test case to validate the command is indeed replicated as-is, as opposed to the key deletions getting replicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course. Added.
c76db56
to
39d045b
Compare
Weekly meeting notes. We will keep the command as is, but we will document this as primarily an internal and debugging command. |
I think we discussed two decisions
based on the use case described in the issue, this is clearly not a "data access" command meant for the application developers, hence the guidance of keeping it under
I don't remember if we reached a conclusion? |
Thanks Ping for summarizing the discussion. We did reach the conclusion to stick to Neither of these commands are frequently used so the similarity between the two commands ( |
I forgot if you drop off meeting earlier, we have one conclusion that the new command "cluster flushslot" as an internal command. |
Signed-off-by: wuranxx <wuranxx@foxmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@murphyjacob4 Flush one slot at a time, is that enough for cancelled atomic slot migration? If you have 1000 ongoing slot migrations that are cancelled, will you need to send 1000 flushslot commands to the replicas at the same time?
I think it is fine. ASM actually has a few scenarios for FLUSHSLOT:
In the worst case, it would be something like ~700KiB of replication data to send 16384 FLUSHSLOT commands (each would be <= 43 bytes, |
@murphyjacob4 Now that we have multi-db in cluster mode, FLUSHSLOT will affects all databases. FLUSHDB should only flush the currently SELECT'ed database. |
Signed-off-by: wuranxx <wuranxx@foxmail.com>
We have consensus that flushslot is an internal command only, I think case 2 and 3 will not happen. How do you think? |
Agree, but I'm not sure about case 4 and 5. Maybe we need |
Signed-off-by: wuranxx <wuranxx@foxmail.com>
I think case 4 is a valid scenario. However, for case 5, it depends on how the operator does. One worst case is causing data inconsistent. (source primary is exporting data, and FLUSHDB command runs target primary, then some data will lost) |
For the rollback of atomic slot migration, using |
I think the proper command format as below: CLUSTER FLUSHSLOT [[DB id] | all] [ASYNC|SYNC] if there is no any argument after , this command will only apply current db, it is aligned with FLUSHDB I add this PR in our next Monday meeting agenda. |
@valkey-io/core-team Shouldn't this have had a major decision tag? I remember we decided this was an internal command, but it doesn't look like it was really restricted to just be internal. |
The issue has the major decision tag. Seemed enough...(?)
If we don't add a markdown file for it in the docs, it will not show up in the docs, but maybe it will show up somewhere else? How can we make it completely internal? |
Add cluster flushslot command. Closes valkey-io#1133. Syntax: CLUSTER FLUSHSLOT <slot> [ASYNC|SYNC] ## Background We want to implement the ability to SYNC or ASYNC free all data from a single slot. This is useful in two cases: 1. When a primary loses ownership of a slot, it will temporarily freeze while it synchronously frees the slot. 2. When doing resharding, you may prefer to forcibly delete all the data from a slot and assign it to another node. ## Internal changes Modify the function signature of `delKeysInSlot`, adding parameters `lazy`, `propagate_del` and `send_del_event` to control the behavior. --------- Signed-off-by: wuranxx <wuranxx@foxmail.com> Signed-off-by: hwware <wen.hui.ware@gmail.com> Signed-off-by: wuranxx <155042264+wuranxx@users.noreply.github.com> Co-authored-by: hwware <wen.hui.ware@gmail.com> Co-authored-by: Ping Xie <pingxie@outlook.com> Signed-off-by: chzhoo <czawyx@163.com>
Add cluster flushslot command. Closes valkey-io#1133. Syntax: CLUSTER FLUSHSLOT <slot> [ASYNC|SYNC] ## Background We want to implement the ability to SYNC or ASYNC free all data from a single slot. This is useful in two cases: 1. When a primary loses ownership of a slot, it will temporarily freeze while it synchronously frees the slot. 2. When doing resharding, you may prefer to forcibly delete all the data from a slot and assign it to another node. ## Internal changes Modify the function signature of `delKeysInSlot`, adding parameters `lazy`, `propagate_del` and `send_del_event` to control the behavior. --------- Signed-off-by: wuranxx <wuranxx@foxmail.com> Signed-off-by: hwware <wen.hui.ware@gmail.com> Signed-off-by: wuranxx <155042264+wuranxx@users.noreply.github.com> Co-authored-by: hwware <wen.hui.ware@gmail.com> Co-authored-by: Ping Xie <pingxie@outlook.com> Signed-off-by: shanwan1 <shanwan1@intel.com>
Add cluster flushslot command. Closes #1133.
Syntax:
Background
We want to implement the ability to SYNC or ASYNC free all data from a single slot. This is useful in two cases:
Internal changes
Modify the function signature of
delKeysInSlot
, adding parameterslazy
,propagate_del
andsend_del_event
to control the behavior.