Issue with anti affinity rules #1036

Open

vigodeltoro opened this issue Oct 31, 2022 · 7 comments

@vigodeltoro

Hi there,

I have a problem with the anti-affinity rules; maybe there is somebody out there who can help me out.
I have a three-node Kubernetes setup with a 3-shard cluster with one replica each.

So there are 6 pods in the cluster. I'm trying to use anti-affinity rules to distribute the pods across the 3 nodes. My goal is to have 2 pods per node, but not the same shard or the same replica on one node. Something like the example below:

chi-protobuf-example-dev-0-0-0 node1
chi-protobuf-example-dev-0-1-0 node2
chi-protobuf-example-dev-1-0-0 node3
chi-protobuf-example-dev-1-1-0 node1
chi-protobuf-example-dev-2-0-0 node2
chi-protobuf-example-dev-2-1-0 node3

The anti-affinity rules I'm using are like the example in the docs/chi-examples dir (https://github.com/Altinity/clickhouse-operator/blob/master/docs/chi-examples/99-clickhouseinstallation-max.yaml):

podTemplates:
  - name: pod-template-with-init-container
    podDistribution:
      - type: ShardAntiAffinity
      - type: MaxNumberPerNode
        number: 2
        topologyKey: "kubernetes.io/hostname"
      - type: ReplicaAntiAffinity
      - type: MaxNumberPerNode
        number: 2
        topologyKey: "kubernetes.io/hostname"

But what's happening every time I deploy is:

chi-protobuf-example-dev-0-0-0 node1
chi-protobuf-example-dev-0-1-0 node2
chi-protobuf-example-dev-1-0-0 node1
chi-protobuf-example-dev-1-1-0 node2
chi-protobuf-example-dev-2-0-0 node3
chi-protobuf-example-dev-2-1-0 Pending because no free node is available

That's really problematic because I can't use my resources properly.

Does anybody have an idea?

Thanks a lot and best regards

@alex-zaitsev
Member

You only need ReplicaAntiAffinity and that's it.

    - name: pod-template-with-init-container
      podDistribution:
      - scope: ClickHouseInstallation
        type: ReplicaAntiAffinity
        topologyKey: "kubernetes.io/hostname"
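
For completeness, here is a minimal sketch of where such a pod template plugs into a CHI manifest. The installation name protobuf-example and cluster name dev are guesses inferred from the pod names above, and the template is referenced via spec.defaults.templates.podTemplate as in the operator's chi-examples; treat it as an illustration, not a verified manifest.

    apiVersion: "clickhouse.altinity.com/v1"
    kind: "ClickHouseInstallation"
    metadata:
      name: protobuf-example          # assumed, inferred from pod names chi-protobuf-example-dev-*
    spec:
      defaults:
        templates:
          podTemplate: pod-template-with-init-container   # apply the template to all hosts
      configuration:
        clusters:
          - name: dev
            layout:
              shardsCount: 3
              replicasCount: 2
      templates:
        podTemplates:
          - name: pod-template-with-init-container
            podDistribution:
              - scope: ClickHouseInstallation
                type: ReplicaAntiAffinity
                topologyKey: "kubernetes.io/hostname"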

@vigodeltoro
Author

Hi Alex,

Okay, thanks a lot for that hint. I tried it out and got the following distribution:

chi-protobuf-example-dev-0-0-0 node1
chi-protobuf-example-dev-0-1-0 node1
chi-protobuf-example-dev-1-0-0 node2
chi-protobuf-example-dev-1-1-0 node2
chi-protobuf-example-dev-2-0-0 node3
chi-protobuf-example-dev-2-1-0 node3

With that I got a distribution over all three nodes, but each shard and its replica land on the same node, which means that if I lose one node I lose both copies of one shard (about a third of my database), so redundancy is gone.

If I try it with:

  - name: pod-template-with-init-container
    podDistribution:
      - scope: ClickHouseInstallation
        type: ShardAntiAffinity
        topologyKey: "kubernetes.io/hostname"

I got only a 2-node distribution:

chi-protobuf-example-dev-0-0-0 node1
chi-protobuf-example-dev-0-1-0 node2
chi-protobuf-example-dev-1-0-0 node1
chi-protobuf-example-dev-1-1-0 node2
chi-protobuf-example-dev-2-0-0 node1
chi-protobuf-example-dev-2-1-0 node2

Do you have any other suggestions?

Thanks a lot
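
For reference, the original goal (at most two pods per node, with replicas of the same shard kept apart) corresponds to combining two of the distribution types already used in the first post. This is an untested sketch of that combination, not a configuration verified on the cluster discussed here:

    podTemplates:
      - name: pod-template-with-init-container
        podDistribution:
          # keep both replicas of a shard on different nodes
          - scope: ClickHouseInstallation
            type: ShardAntiAffinity
            topologyKey: "kubernetes.io/hostname"
          # cap the number of ClickHouse pods per node at 2
          - type: MaxNumberPerNode
            number: 2
            topologyKey: "kubernetes.io/hostname"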

@prashant-shahi

@alex-zaitsev There seems to be a lack of proper docs on podDistribution, with a list of all possible values for each key and their significance.

@vigodeltoro
Author

@prashant-shahi
Indeed, and in my eyes there is a bug in circular replication.

I was able to work around that problem with "hardcoded" pod templates:


podTemplates:
  - name: sh0-rep0-template
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                      - "zone-1"

  - name: sh0-rep1-template
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                      - "zone-2"

  - name: sh1-rep0-template
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                      - "zone-2"

  - name: sh1-rep1-template
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                      - "zone-3"

  - name: sh2-rep0-template
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                      - "zone-3"

  - name: sh2-rep1-template
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                      - "zone-1"

But with that I'm facing issues with the podDisruptionBudgets (#1081).

So a fix would be really helpful.

@karthik-thiyagarajan

In a setup of 6 shards and 3 replicas with two different clusters (cluster-01 and cluster-02), I tried MaxNumberPerNode set to 2 and used ReplicaAntiAffinity, but I still see 6 pods getting scheduled (3 from cluster-01 and 3 from cluster-02), which is unexpected. I used "kubernetes.io/hostname" as the topology key. Can someone help?

@karthik-thiyagarajan

@alex-zaitsev What do you mean by the statement "You only need ReplicaAntiAffinity"? I thought we would only need ShardAntiAffinity, which means replicas of the same shard repel each other. If we use ReplicaAntiAffinity, there is still a risk that different replicas of the same shard sit on the same node. Is that not true?

@aep
aep commented May 21, 2025

According to https://github.com/Altinity/clickhouse-operator/blob/master/docs/chi-examples/99-clickhouseinstallation-max.yaml#L506, you are correct: it should be ShardAntiAffinity, not ReplicaAntiAffinity.

I tested ShardAntiAffinity and it appears to do the correct thing.
The distribution still ends up being terrible with multiple shards AND replicas, so doing it by hand seems like the way to go.
