Description
What happened + What you expected to happen
The Redis health check has no timeout detection. If a network interface fails, the failure may not be detected promptly.
To reproduce, bring the interface down, for example: ifconfig eth0 down
Ray version: 2.34.0
src/ray/gcs/gcs_server/gcs_redis_failure_detector.cc
void GcsRedisFailureDetector::DetectRedis() {
  auto redis_callback = [this](const std::shared_ptr<CallbackReply> &reply) {
    if (reply->IsNil()) {
      RAY_LOG(ERROR) << "Redis is inactive.";
      callback_();
    }
  };
  auto cxt = redis_client_->GetPrimaryContext();
  cxt->RunArgvAsync({"PING"}, redis_callback);
}
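
The snippet above only reacts when a reply arrives and is nil; nothing handles the case where no reply arrives at all. Below is a minimal, standalone sketch of one way to put a deadline on the PING. It is not Ray's code or a proposed patch: AsyncPing and PingWithTimeout are hypothetical names, and the ping is modeled as a plain function that takes a completion callback rather than the real RunArgvAsync call.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <future>
#include <memory>

// Hypothetical stand-in for an async PING. It receives a completion callback
// and invokes it with true for a non-nil reply, false for a nil reply.
// If the network is down, the callback may never be invoked.
using AsyncPing = std::function<void(std::function<void(bool)>)>;

// Returns true only if a non-nil reply arrives before the deadline.
// A missing reply is treated the same way as a nil reply: Redis is unhealthy.
bool PingWithTimeout(const AsyncPing &ping, std::chrono::milliseconds timeout) {
  auto promise = std::make_shared<std::promise<bool>>();
  auto done = std::make_shared<std::atomic<bool>>(false);
  std::future<bool> result = promise->get_future();

  ping([promise, done](bool ok) {
    // Guard against a callback that fires more than once.
    if (!done->exchange(true)) {
      promise->set_value(ok);
    }
  });

  if (result.wait_for(timeout) != std::future_status::ready) {
    return false;  // No reply within the deadline.
  }
  return result.get();
}
```

In the detector, a false result would take the same path as the nil reply (log the error and invoke callback_()). This sketch blocks the calling thread for up to the timeout to keep the example short; in an event-loop based server, a non-blocking variant that arms a timer alongside the PING and cancels it when the reply arrives would likely be a better fit.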