Driver hangs forever on unreachable IP · Issue #52 · crate/crate-pdo · GitHub

Driver hangs forever on unreachable IP #52

Closed
jeteon opened this issue Nov 9, 2016 · 4 comments

jeteon commented Nov 9, 2016

If one of the first servers specified in the list of servers to try is on an unreachable IP address, the PDO driver hangs on that first connection attempt and doesn't proceed to try the other servers as would be expected. To be fair, the connection does eventually time out after something like a minute, but by that point the upstream servers have long since given up on the request.

This might seem far-fetched, but it happened to me recently on a production deployment where an interface on the server (needed to reach the IP space of the Crate server) failed. This hung the PHP application rather than moving on to the other accessible Crate servers in the list, as I would have expected. Part of the reason for using Crate was this fail-over potential, so this was a big deal to us.

I think it may be a case of a connection timeout somewhere in the code base being set to a very high number and not being configurable via the API, but I'm not sure where. I noticed you aren't setting the connect_timeout key anywhere in the code base, and this defaults to "forever". However, changing this didn't seem to help in my case.

celaus (Contributor) commented Nov 10, 2016

Hi @jeteon - we are excited to hear that you are running CrateDB in production! However, we are less excited about the issue(s) you are having. Can you maybe provide a short code sample where the driver blocks when a node is unresponsive?

Also, could you find out why the node was unresponsive? Apart from the driver issue, there may be other things that could be done to avoid that problem in the future :)

Cheers, Claus

jeteon (Author) commented Nov 10, 2016

Hi @celaus. The server itself remains responsive, but the application server will hang on that particular request. It times out eventually, but by then our front-end server has timed out the request. There is a mitigation currently in place (basically, I test the connectivity separately) but it's not ideal. The code below is a single-file example that demonstrates the issue:

<?php

require 'vendor/autoload.php';

use Crate\PDO\PDO as PDO;

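// 192.0.2.0 is in the block reserved for documentation (TEST-NET-1, RFC 5737),
// so it stands in for an unreachable server; 127.0.0.1 is the working fallback.
$pdo = new PDO('crate:192.0.2.0:4200,127.0.0.1:4200', null, null, null);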
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->prepare('SELECT * FROM a_schema.a_table');
$stmt->execute();

echo "Query executed.";

The code assumes you have a Crate instance running on localhost, port 4200. Put it in a file test.php in a directory, then run composer require crate/crate-pdo:~0.3.1 in that directory. Executing php test.php should demonstrate the issue.

On my system (running PHP 7) the connection hangs, and the script then ends with a fatal error after more than a minute:

PHP Fatal error:  Uncaught GuzzleHttp\Ring\Exception\ConnectException: cURL error 7: Failed to connect to 192.0.2.0 port 4200: Connection timed out in /tmp/test/vendor/guzzlehttp/ringphp/src/Client/CurlFactory.php:126

My expectation would be that this would time out (in about 5 seconds going by the source) and then proceed to run the query on the next server in line.
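
For anyone stuck on the affected version, the mitigation mentioned above (testing connectivity separately) could look roughly like the sketch below. It is only an illustration of the idea, not what crate-pdo does internally: the helper names isReachable and filterReachable, the one-second probe timeout, and the fallback to port 4200 when no port is given are all assumptions made for this example.

```php
<?php

/**
 * Return true if a TCP connection to $host:$port can be opened
 * within $timeout seconds. (Hypothetical helper, not part of crate-pdo.)
 */
function isReachable(string $host, int $port, float $timeout = 1.0): bool
{
    $errno = 0;
    $errstr = '';
    // "@" suppresses the warning fsockopen() emits on failure;
    // only the success/failure result is needed here.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

/**
 * Keep only the "host:port" entries that pass the probe, preserving order.
 * The probe is injectable so the filtering can be exercised without a network.
 * Note: this simple split does not handle IPv6 literals.
 */
function filterReachable(array $servers, ?callable $probe = null, float $timeout = 1.0): array
{
    $probe = $probe ?? 'isReachable';
    $reachable = [];
    foreach ($servers as $server) {
        $parts = explode(':', $server, 2);
        $host = $parts[0];
        // Assumption: fall back to 4200 when no port is given.
        $port = (int) ($parts[1] ?? 4200);
        if ($probe($host, $port, $timeout)) {
            $reachable[] = $server;
        }
    }
    return $reachable;
}
```

The filtered list can then be joined back into a DSN, for example 'crate:' . implode(',', filterReachable(['192.0.2.0:4200', '127.0.0.1:4200'])). The trade-off is an extra connection attempt per server on every request, which is why this is a workaround rather than a fix.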

jeteon (Author) commented Nov 10, 2016

I discovered that if I leave out the version in the composer line, I get version 0.6.0 instead and the test runs as expected. I'm going to try to move the code base to that version. Is there any reason the documentation recommends version 0.3.1?

jeteon (Author) commented Nov 10, 2016

Confirmed things work properly on the current release 0.5.1 as well. Sorry about the hassle. Seems like a documentation issue more than a code issue.
