8000 Added support to run etcd client calls only when in multi-node case by aaronfern · Pull Request #504 · gardener/etcd-backup-restore · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Added support to run etcd client calls only when in multi-node case #504

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 19, 2022

Conversation

aaronfern
Copy link
Contributor
@aaronfern aaronfern commented Jul 13, 2022

What this PR does / why we need it:
This PR adds support to allow etcd calls only if backup-restore determines that it is in a multi-node setup.
This removes dependencies in a single node case of requiring network policies for etcd calls to succeed

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Release note:

Fixed a bug where etcd calls related to multi node operation were used in single node operation

@aaronfern aaronfern requested a review from a team as a code owner July 13, 2022 11:19
@gardener-robot gardener-robot added needs/review Needs review size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Jul 13, 2022
@aaronfern aaronfern added reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies and removed needs/review Needs review size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Jul 13, 2022
@gardener-robot gardener-robot added the needs/review Needs review label Jul 13, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 13, 2022
@aaronfern
Copy link
Contributor Author

/work-in-progress

@gardener-robot
Copy link

@aaronfern Command /work-in-progress is not known.

@gardener-robot-ci-2 gardener-robot-ci-2 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 13, 2022
@gardener-robot gardener-robot added the size/s Size of pull request is small (see gardener-robot robot/bots/size.py) label Jul 14, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 14, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 14, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 14, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 14, 2022
Copy link
Member
@timuthy timuthy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me in general. I added some comments to improve DRY. Please add a proper release note as well.

Comment on lines 345 to 349
etcdConfigForTest := os.Getenv("ETCD_CONF")
if etcdConfigForTest != "" {
return false
}
inputFileName = "/var/etcd/config/etcd.conf.yaml"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be transferred into a utility function. I also see this code being duplicated multiple times in the code base.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please also make other areas in the code use this utility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@gardener-robot gardener-robot added the needs/changes Needs (more) changes label Jul 14, 2022
@@ -338,6 +338,36 @@ func ProbeEtcd(ctx context.Context, clientFactory etcdClient.Factory, logger *lo
return nil
}

// IsMultiNode determines whether a pod is part of a multi node setup or not
// This is determined by checking the `initial-cluster` of the etcd configmap to check the number of members expected.
func IsMultiNode() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can align this more with that is already used here

OriginalClusterSize: len(initialClusterMap),
to not have logic duplication all over the place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@gardener-robot gardener-robot added size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) and removed size/s 8000 Size of pull request is small (see gardener-robot robot/bots/size.py) labels Jul 14, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 14, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 18, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 18, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 18, 2022
Copy link
Member
@timuthy timuthy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for the improvements 👍

if etcdConfigForTest != "" {

inputFileName = GetConfigFilePath()
if inputFileName != EtcdConfigFilePath {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why this check is required or was not there before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So ideally if we are in test-mode (ETCD_CONF is set) then we return an empty service endpoint so that the default one is used (http://127.0.0.1:2379)
We send a proper service endpoint only when running in a cluster

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I can't really judge this case but it seems very implicit and also confusing (because why should we then supprt setting a config via ETCD_CONF ?!). At a minimum a comment would be nice why such a check is performed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand, and we will be picking this up and finding a better way to implement this while refactoring the code base
For the time being I have added a comment here

err := m.PromoteMember(ctx)
if err == nil {
break
if miscellaneous.IsMultiNode(b.logger) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rather use the restoreOpts.OriginalClusterSize here, so that we don't have to decode the configYML again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, you're right, that makes sense
Will make that change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 18, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 18, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 18, 2022
Copy link
Member
@timuthy timuthy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/changes Needs (more) changes needs/review Needs review labels Jul 18, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 18, 2022
@timuthy timuthy merged commit 40e9368 into gardener:master Jul 19, 2022
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Jul 19, 2022
aaronfern added a commit to aaronfern/etcd-backup-restore that referenced this pull request Jul 21, 2022
…ardener#504)

* Added support to run etcd client calls only when in multi-node case

* Addressed review comments

* Used values from restoreOpts rather than another func call
aaronfern added a commit that referenced this pull request Jul 21, 2022
* Added support to run etcd client calls only when in multi-node case (#504)

* Added support to run etcd client calls only when in multi-node case

* Addressed review comments

* Used values from restoreOpts rather than another func call

* Temp fix: skip single member restoration in case of data-dir invalid. (#501)

* Temp fix: skip single member restoration in case of data-dir invalid.

* Small fixes.

* Improved error log.

* Fixed a bug in scaleup feature which present in func: IsMemberInCluster.

* Fix the unit tests.

* Address the review comments.

* Handle the case of when storageProvider is not configured.

* Address the review comments.

* Address the review comments.

* Assigned the correct Peer Address to Etcd after it restores from the backup-bucket. (#505)

* Assigned the correct Peer address to the Etcd after restoration.

* Fix the unit tests and some refactoring.

* Improve some logs.

* Address the review comments.

* To update the peer address of existing cluster.

* Don't update peer URL when promoting member (#506)

* Don't update peer URL when promoting member

* Update interface name in member package

Co-authored-by: Ishan Tyagi <42602577+ishan16696@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/lgtm Has approval for merging size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) status/closed Issue is closed (either delivered or triaged)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0