Cannot create new cluster with 25.3.x.x #1707
Comments
Could you provide the full error message from the clickhouse-server container, including the stack trace?
the logs:
could you share your
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: __REMOVED__
  creationTimestamp: "2025-05-14T11:02:10Z"
  finalizers:
  - finalizer.clickhouseinstallation.altinity.com
  generation: 1
  labels:
    name: clickhouse
  name: clickhouse
  namespace: clickhouse
  resourceVersion: "68277"
  uid: 069026e5-2727-4222-8817-2ca83e474974
spec:
  configuration:
    clusters:
    - layout:
        replicasCount: 3
        shardsCount: 1
      name: clickhouse
    profiles:
      clickhouse_operator/http_connection_timeout: 10
      clickhouse_operator/log_queries: 0
      clickhouse_operator/max_concurrent_queries_for_all_users: 0
      clickhouse_operator/os_thread_priority: 0
      clickhouse_operator/skip_unavailable_shards: 1
      default/allow_experimental_analyzer: 1
      default/allow_experimental_bigint_types: 1
      default/allow_experimental_database_replicated: 1
      default/allow_experimental_projection_optimization: 1
      default/compile_aggregate_expressions: 1
      default/connect_timeout_with_failover_ms: 2000
      default/distributed_aggregation_memory_efficient: 1
      default/insert_quorum: 2
      default/join_algorithm: parallel_hash
      default/join_use_nulls: 1
      default/log_queries: 1
      default/log_query_threads: 0
      default/optimize_arithmetic_operations_in_aggregate_functions: 1
      default/parallel_view_processing: 1
      default/short_circuit_function_evaluation: force_enable
      readonly/allow_experimental_analyzer: 1
      readonly/allow_experimental_bigint_types: 1
      readonly/allow_experimental_database_replicated: 1
      readonly/allow_experimental_projection_optimization: 1
      readonly/compile_aggregate_expressions: 1
      readonly/connect_timeout_with_failover_ms: 2000
      readonly/distributed_aggregation_memory_efficient: 1
      readonly/insert_quorum: 2
      readonly/join_algorithm: parallel_hash
      readonly/join_use_nulls: 1
      readonly/log_queries: 1
      readonly/log_query_threads: 0
      readonly/optimize_arithmetic_operations_in_aggregate_functions: 1
      readonly/parallel_view_processing: 1
      readonly/readonly: 2
      readonly/short_circuit_function_evaluation: force_enable
    settings:
      default_session_timeout: 1
      remote_servers/clickhouse/secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      shutdown_wait_unfinished_queries: 1
    users:
      default/password_sha256_hex: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      readonly/networks/ip:
      - ::/0
      - 0.0.0.0/0
      readonly/password_sha256_hex: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      readonly/profile: readonly
      readonly/quota: default
      root/access_management: 1
      root/networks/ip:
      - ::/0
      - 0.0.0.0/0
      root/password_sha256_hex: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      root/profile: default
      root/quota: default
    zookeeper:
      nodes:
      - host: zk-0.zk.clickhouse.svc.cluster.local
        port: 2181
      - host: zk-1.zk.clickhouse.svc.cluster.local
        port: 2181
      - host: zk-2.zk.clickhouse.svc.cluster.local
        port: 2181
  defaults:
    storageManagement:
      provisioner: Operator
    templates:
      dataVolumeClaimTemplate: clickhouse-data-pvc
      podTemplate: clickhouse-pod
      serviceTemplate: clickhouse-svc
  reconciling:
    policy: wait
  templates:
    podTemplates:
    - name: clickhouse-pod
      podDistribution:
      - topologyKey: kubernetes.io/hostname
        type: ClickHouseAntiAffinity
      - topologyKey: failure-domain.beta.kubernetes.io/zone
        type: ReplicaAntiAffinity
      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: optimized-for-clickhouse
                  operator: In
                  values:
                  - "true"
                topologyKey: kubernetes.io/hostname
        containers:
        - image: clickhouse/clickhouse-server:25.3.3.42
          name: clickhouse-server
          volumeMounts:
          - mountPath: /var/lib/clickhouse
            name: clickhouse-data-pvc
          - mountPath: /var/lib/clickhouse-cold
            name: clickhouse-cold-data-pvc
        - command:
          - /bin/clickhouse-backup
          - server
          env:
          - name: LOG_LEVEL
            valueFrom:
              configMapKeyRef:
                key: log_level
                name: clickhousebackup-config
          - name: CLICKHOUSE_HOST
            valueFrom:
              configMapKeyRef:
                key: clickhouse_host
                name: clickhousebackup-config
          - name: CLICKHOUSE_PORT
            valueFrom:
              configMapKeyRef:
                key: clickhouse_port
                name: clickhousebackup-config
          - name: CLICKHOUSE_USERNAME
            valueFrom:
              configMapKeyRef:
                key: clickhouse_username
                name: clickhousebackup-config
          - name: CLICKHOUSE_PASSWORD
            valueFrom:
              secretKeyRef:
                key: clickhouse_password
                name: clickhousebackup-secret
          - name: CLICKHOUSE_USE_EMBEDDED_BACKUP_RESTORE
            valueFrom:
              configMapKeyRef:
                key: use_embedded_backup_restore
                name: clickhousebackup-config
          - name: ALLOW_EMPTY_BACKUPS
            valueFrom:
              configMapKeyRef:
                key: allow_empty_backups
                name: clickhousebackup-config
          - name: API_LISTEN
            valueFrom:
              configMapKeyRef:
                key: api_listen
                name: clickhousebackup-config
          - name: API_CREATE_INTEGRATION_TABLES
            valueFrom:
              configMapKeyRef:
                key: api_create_integration_tables
                name: clickhousebackup-config
          - name: BACKUPS_TO_KEEP_REMOTE
            valueFrom:
              configMapKeyRef:
                key: backups_to_keep_remote
                name: clickhousebackup-config
          - name: REMOTE_STORAGE
            valueFrom:
              configMapKeyRef:
                key: remote_storage
                name: clickhousebackup-config
          - name: GCS_EMBEDDED_ACCESS_KEY
            valueFrom:
              configMapKeyRef:
                key: gcs_embedded_access_key
                name: clickhousebackup-config
          - name: GCS_EMBEDDED_SECRET_KEY
            valueFrom:
              secretKeyRef:
                key: gcs_embedded_secret_key
                name: clickhousebackup-secret
          - name: GCS_BUCKET
            valueFrom:
              configMapKeyRef:
                key: gcs_bucket
                name: clickhousebackup-config
          - name: UPLOAD_CONCURRENCY
            valueFrom:
              configMapKeyRef:
                key: upload_concurrency
                name: clickhousebackup-config
          - name: CLICKHOUSE_TIMEOUT
            valueFrom:
              configMapKeyRef:
                key: timeout
                name: clickhousebackup-config
          - name: CLICKHOUSE_SKIP_TABLES
            valueFrom:
              configMapKeyRef:
                key: skip_tables
                name: clickhousebackup-config
          - name: S3_MAX_PARTS_COUNT
            value: "32"
          image: altinity/clickhouse-backup:2.6.15
          imagePullPolicy: Always
          name: clickhouse-backup
          ports:
          - containerPort: 7171
            name: backup-rest
          volumeMounts:
          - mountPath: /var/lib/clickhouse
            name: clickhouse-data-pvc
          - mountPath: /var/lib/clickhouse-cold
            name: clickhouse-cold-data-pvc
        tolerations:
        - effect: NoSchedule
          key: clickhouse
          operator: Exists
        - effect: NoSchedule
          key: kubernetes.io/arch
          operator: Equal
          value: arm64
        - effect: NoSchedule
          key: node.kubernetes.io/memory-pressure
          operator: Exists
    serviceTemplates:
    - generateName: '{chi}'
      name: clickhouse-svc
      spec:
        ClusterIP: ""
        ports:
        - name: http
          port: 8123
        - name: client
          port: 9000
        type: ClusterIP
    volumeClaimTemplates:
    - name: clickhouse-data-pvc
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: clickhouse-ext4fs
    - name: clickhouse-cold-data-pvc
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 130Gi
        storageClassName: clickhouse-cold-ext4fs
status:
  chop-commit: 2a4b0f2
  chop-date: 2025-03-14T11:41:00
  chop-ip: 10.96.2.10
  chop-version: 0.24.5
  clusters: 1
  endpoint: clickhouse.clickhouse.svc.cluster.local
  fqdns:
  - chi-clickhouse-clickhouse-0-0.clickhouse.svc.cluster.local
  - chi-clickhouse-clickhouse-0-1.clickhouse.svc.cluster.local
  - chi-clickhouse-clickhouse-0-2.clickhouse.svc.cluster.local
  hosts: 3
  pods:
  - chi-clickhouse-clickhouse-0-0-0
  - chi-clickhouse-clickhouse-0-1-0
  - chi-clickhouse-clickhouse-0-2-0
  shards: 1
  status: Aborted
  taskID: a209fbe8-716b-48f8-afe1-9672c6919845
  taskIDsCompleted:
  - a209fbe8-716b-48f8-afe1-9672c6919845
  taskIDsStarted:
  - a209fbe8-716b-48f8-afe1-9672c6919845
@marcio-absmartly, thank you for the heads-up. Operator 0.25.0 has not been released yet. We will re-check it against CH 25.3 before the release. Meanwhile, we are running a number of 25.3+ clusters with operator 0.24.5 with no issues.
Update: we could not reproduce it in tests. However, the code does allow an empty remote_servers section to appear, so it could be a race condition. We will make sure that an empty remote_servers is never created.
It works fine with existing clusters; the problem only occurs when creating a new one. For us it happened consistently, and the only way to get past it was to create the cluster with 24.8 and then upgrade to 25.3.
@marcio-absmartly, could you check it with operator version 0.25.0?
Should be fixed in 0.25.0.
When creating a new cluster with CH 25.3.x.x, the first host won't boot up because the setting clickhouse.remote_servers is not set up properly. My suspicion is that the operator populates this setting after each host is running, but newer ClickHouse versions validate it and expect the cluster setup beforehand.
This is verified to work on CH 24.8.x.x.
Operator version is 0.24.5.
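For anyone trying to confirm whether a host was started with an empty or incomplete remote_servers section, here is a minimal Python sketch. It assumes the generated config is XML in ClickHouse's usual remote_servers layout (cluster element, then shard, replica, host) and that you have already copied the XML out of the pod by some means. The function name `remote_servers_problems` is hypothetical, and this only inspects the text you pass in; it says nothing about when the operator writes the file.

```python
import xml.etree.ElementTree as ET


def remote_servers_problems(config_xml):
    """Return a list of human-readable problems found in a remote_servers config.

    Hypothetical helper: assumes the usual ClickHouse layout
    <remote_servers><CLUSTER><shard><replica><host>...</host></replica></shard></CLUSTER></remote_servers>.
    """
    root = ET.fromstring(config_xml)
    # The root tag differs between setups (<clickhouse> or the older <yandex>),
    # so look for the section by name anywhere below the root.
    section = root if root.tag == "remote_servers" else root.find(".//remote_servers")
    if section is None:
        return ["no <remote_servers> section"]

    clusters = list(section)
    if not clusters:
        return ["<remote_servers> is empty"]

    problems = []
    for cluster in clusters:
        # Collect non-blank host names from every replica of every shard.
        hosts = [
            h.text.strip()
            for h in cluster.findall("./shard/replica/host")
            if h.text and h.text.strip()
        ]
        if not hosts:
            problems.append(f"cluster '{cluster.tag}' has no replica hosts")
    return problems


if __name__ == "__main__":
    # An empty section, which is the situation suspected in this issue.
    print(remote_servers_problems("<clickhouse><remote_servers/></clickhouse>"))
```

An empty list means every cluster in the section lists at least one replica host. Anything else suggests the host saw a config it may refuse to start with on 25.3.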