8000 Report errors when weave script talks to weave component via http by bboreham · Pull Request #2096 · weaveworks/weave · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
This repository was archived by the owner on Jun 20, 2024. It is now read-only.

Report errors when weave script talks to weave component via http #2096

Merged
merged 5 commits into from
Apr 5, 2016

Conversation

bboreham
Copy link
Contributor

Fixes #2052.

The output from curl is placed in a temporary file, which is sent to stdout if the result is 2xx and stderr otherwise, except for 404 errors which are specifically handled in various places so call_weave swallows the "404 not found" message.

The "Call to weave failed." message is removed, in favour of whatever actual error came back from curl. It may be that this makes some troubleshooting harder.

Also fixes an issue introduced by #1200 if the user has disabled IPAM with the (undocumented) --ipalloc-range="" whereby each fixed IP address specified would receive a "404" message, and this PR would otherwise have turned that into a failure of the entire command.

@bboreham bboreham added this to the 1.5.0 milestone Mar 24, 2016
@rade
Copy link
Member
rade commented Mar 29, 2016

I rebased this locally and came across the following problems:

$ weave status foo
404 page not found
$
$ weave launch --no-dns
$ weave dns-args
--dns 172.17.0.1 --dns-search=weave.local.

That last one should return nothing post #2101. The code needs adjusting since it currently matches on a "404 ..." response text.

else
res=1
call_weave GET $STATUS_URL || res=$?
if [ $res = 404 ] ; then

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

@rade
Copy link
Member
rade commented Mar 29, 2016

sometimes generates an extra "404" message on the user's screen under error conditions, e.g. if they mis-spell a weave status xxx command.

That needs fixing.

@bboreham bboreham force-pushed the issues/2052-http-errors branch from c31acbc to 0b41ae3 Compare April 1, 2016 12:01
@bboreham
Copy link
Contributor Author
bboreham commented Apr 1, 2016

Rebased and addressed comments.

@bboreham bboreham force-pushed the issues/2052-http-errors branch from 0b41ae3 to 0498e73 Compare April 1, 2016 13:12
@bboreham bboreham removed their assignment Apr 1, 2016
@rade
Copy link
Member
rade commented Apr 3, 2016

I have tweaked the code slightly. Feel free to squash.

Should ipam_reclaim be changed? I can never work out what it does in case of failures, since that also depends on the calling context

Does it ignore errors?
Does it bail on the first error?
Does it error IFF the last reclaim fails?

echo "Call to $CONTAINER_NAME failed." >&2
return 1
fi
http_call $HTTP_ADDR "$@"

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

@bboreham
Copy link
Contributor Author
bboreham commented Apr 4, 2016

I changed ipam_reclaim: it should attempt to claim as many addresses as it can rather than bailing out on the first failure.

It now never returns a failure status for the overall weave launch, which seems justifiable - weave has launched, even though some sub-parts did not complete.

@bboreham bboreham removed their assignment Apr 4, 2016
@bboreham bboreham force-pushed the issues/2052-http-errors branch from e1a1d73 to e5fd56f Compare April 4, 2016 14:10
@rade
Copy link
Member
rade commented Apr 4, 2016

This is consistently failing in CI in 640_proxy_restart_reattaches_test.sh. Running that on my machine, I have tracked the failure down to the check_attached coming after

run_on $HOST1 sudo kill -KILL $(docker_on $HOST1 inspect --format='{{.State.Pid}}' c2)

docker logs weaveproxy shows

INFO: 2016/04/04 16:00:50.609756 Attaching container 346c87f29e0e2478eef75e47275826521e30ae6f7c5f110788c1873cc79c543d with WEAVE_CIDR "" to weave network
WARN: 2016/04/04 16:00:51.187364 Attaching container 346c87f29e0e2478eef75e47275826521e30ae6f7c5f110788c1873cc79c543d to weave network failed: Failure during network configuration for container 346c87f29e0e2478eef75e47275826521e30ae6f7c5f110788c1873cc79c543d:
Error: an inet prefix is expected rather than "cancelled".

That container is the c2 container which is supposed to be restarted. Inspecting it shows...

        "State": {
            "Status": "exited",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 0,
            "ExitCode": 137,
            "Error": "",
            "StartedAt": "2016-04-04T16:08:23.745210944Z",
            "FinishedAt": "2016-04-04T16:08:24.147338761Z"
        },

@rade
Copy link
Member
rade commented Apr 4, 2016

This is consistently failing in CI in 640_proxy_restart_reattaches_test.sh.

Our current theory is that this is just #2090 in spades. @bboreham has seen that test succeed locally; I haven't so far.

@rade
Copy link
Member
rade commented Apr 4, 2016

I cannot shake off the suspicion that something else, specific to this PR, is going on here. We have had 10 successive failures in CI for this PR, which is way higher than the normal failure rate of 640_proxy_restart_reattaches_test.sh.

It could just be that the timing has shifted... call_weave no longer checks whether weave is alive before calling it. That check, involving a docker inspect, was quite expensive. Without that delay, weave attach will proceed to the IP allocation much more quickly, increasing the chances that we hit the race described in #2090.

@bboreham
Copy link
Contributor Author
bboreham commented Apr 5, 2016

When working to understand this, I found the failure incredibly sensitive - adding one more line of logging before the allocate is attempted would make it succeed every time.

I see it failing due to IsContainerNotRunning seeing the Restarting flag.

@bboreham bboreham force-pushed the issues/2052-http-errors branch from e5fd56f to 934ec24 Compare April 5, 2016 16:54
@rade rade merged commit ce431dd into master Apr 5, 2016
@rade rade deleted the issues/2052-http-errors branch April 16, 2016 09:58
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

errors returned by weave http api are ignored by weave script
2 participants
0