Reparenting is the process of changing a shard's master tablet from one host to another or changing a slave tablet to have a different master. Reparenting can be initiated manually or it can occur automatically in response to particular database conditions. For example, you might reparent a shard or tablet during a maintenance exercise, or automatically trigger reparenting when a master tablet dies.
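This document explains the two types of reparenting that Vitess supports: active reparenting, in which Vitess commands perform the reparent, and external reparenting, in which another tool changes the master and Vitess then updates its own records.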
Note: The InitShardMaster command defines the initial
parenting relationships within a shard. That command makes the specified
tablet the master and makes the other tablets in the shard slaves that
replicate from that master.
Vitess supports MySQL 5.6, MySQL 5.7 and MariaDB 10.0 implementations.
Vitess requires the use of global transaction identifiers (GTIDs) for its operations.
Vitess does not depend on semisynchronous replication but does work if it is implemented. Larger Vitess deployments typically do implement semisynchronous replication.
You can use the following vtctl
commands to perform reparenting operations: PlannedReparentShard and
EmergencyReparentShard.
Both commands lock the shard for write operations. The two commands
cannot run in parallel, nor can either command run in parallel with the
InitShardMaster
command.
The two commands are both dependent on the global topology server being
available, and they both insert rows in the topology server's
_vt.reparent_journal table. As such, you can review
your database's reparenting history by inspecting that table.
The PlannedReparentShard command reparents a healthy master
tablet to a new master. The current and new master must both be up and
running.
Among the actions that this command performs is an update to the
Shard object's MasterAlias record.
In this scenario, the old master's tablet type transitions to
spare. If health checking is enabled on the old master,
it will likely rejoin the cluster as a replica on the next health
check. To enable health checking, set the
target_tablet_type parameter when starting a tablet.
That parameter indicates what type of tablet that tablet tries to be
when healthy. When it is not healthy, the tablet type changes to
spare.
The EmergencyReparentShard command is used to force
a reparent to a new master when the current master is unavailable.
The command assumes that data cannot be retrieved from the current
master because it is dead or not working properly.
As such, this command does not rely on the current master at all to replicate data to the new master. Instead, it makes sure that the master-elect is the most advanced in replication among all of the available slaves.
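The choice between the two commands can be summarized as a small decision rule. The following is only an illustrative sketch: the function name and its boolean inputs are hypothetical and are not part of Vitess, and it assumes that the new master must be reachable in both cases.

```python
def choose_reparent_command(current_master_up, new_master_up):
    """Return the name of the vtctl command that fits the situation.

    Hypothetical helper for illustration only; it is not a Vitess API.
    """
    if not new_master_up:
        # Assumption: neither command can promote an unavailable tablet.
        return None
    if current_master_up:
        # Both the current and the new master are up and running.
        return "PlannedReparentShard"
    # The current master is dead or not working properly.
    return "EmergencyReparentShard"
```

For instance, `choose_reparent_command(False, True)` selects EmergencyReparentShard, because the current master cannot be relied on to replicate data to the new master.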
Important: Before calling this command, you must first identify
the slave with the most advanced replication position as that slave
must be designated as the new master. You can use the
vtctl ShardReplicationPositions
command to determine the current replication positions of a shard's slaves.
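As a rough illustration of what "most advanced replication position" means, the sketch below compares MySQL-style GTID sets and picks the slave whose set contains every other slave's set. It assumes positions are available as strings in the `uuid:first-last` form. The function names, the tablet aliases, and the input mapping are hypothetical, and the sketch does not parse the actual output of ShardReplicationPositions.

```python
def parse_gtid_set(text):
    """Parse a MySQL-style GTID set such as 'uuid:1-5:7,uuid2:1-3'.

    Returns {uuid: set of transaction numbers}. Expanding intervals into
    sets keeps the sketch short but is not efficient for large ranges.
    """
    result = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        seen = result.setdefault(uuid, set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            seen.update(range(int(start), int(end or start) + 1))
    return result


def contains(a, b):
    """True if GTID set a includes every transaction in GTID set b."""
    return all(txns <= a.get(uuid, set()) for uuid, txns in b.items())


def most_advanced(positions):
    """Return the alias whose position contains all others, or None.

    positions maps a (hypothetical) tablet alias to its GTID set string.
    None means the positions have diverged and need manual inspection.
    """
    parsed = {alias: parse_gtid_set(p) for alias, p in positions.items()}
    for alias, gtids in parsed.items():
        if all(contains(gtids, other) for other in parsed.values()):
            return alias
    return None
```

Returning `None` on divergence is a deliberate choice in this sketch: if no single slave has every transaction seen by the others, picking one automatically would silently discard data.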
Among the actions that this command performs, the master-elect is promoted to
master and performs any other changes that might be required for its new state,
and the MasterAlias record of the global Shard object is updated.
External reparenting occurs when another tool handles the process
of changing a shard's master tablet. After that occurs, the tool
needs to call the
vtctl TabletExternallyReparented
command to ensure that the topology server, replication graph, and serving
graph are updated accordingly.
The operations that command performs include the following. It reads the
Shard object from the global topology server. It runs the MySQL
show slave status command, ultimately aiming to confirm
that the MySQL reset slave command already executed on
the tablet. It changes a tablet's type to spare to ensure that the tablet
does not interfere with ongoing operations. It updates the
Shard object to specify the new master.
The TabletExternallyReparented command fails in certain cases.
Active reparenting might be a dangerous practice in any system
that depends on external reparents. You can disable active reparents
by starting vtctld with the
--disable_active_reparents flag set to true.
(You cannot set the flag after vtctld is started.)
A tablet can be orphaned after a reparenting if it is unavailable
when the reparent operation is running but then recovers later on.
In that case, you can manually reset the tablet's master to the
current shard master using the
vtctl ReparentTablet
command. You can then restart replication on the tablet if it was stopped
by calling the vtctl StartSlave
command.
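The recovery sequence for an orphaned tablet can be sketched as a helper that builds the two command lines in order. This is a minimal sketch under stated assumptions: the helper name and the example alias are hypothetical, each command is assumed to take the tablet alias as its only argument, and any flags needed to reach the topology server are omitted.

```python
def orphan_recovery_commands(tablet_alias, replication_stopped):
    """Build vtctl argument lists for recovering an orphaned tablet.

    Hypothetical helper; it only constructs the command lines and does
    not run them. Connection and topology flags are left out.
    """
    # Step 1: point the tablet at the current shard master.
    commands = [["vtctl", "ReparentTablet", tablet_alias]]
    if replication_stopped:
        # Step 2: restart replication only if it was stopped.
        commands.append(["vtctl", "StartSlave", tablet_alias])
    return commands
```

Each returned list could be passed to `subprocess.run` once the flags for your deployment are added. Keeping StartSlave conditional mirrors the text above, which calls for it only when replication was stopped.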