Tail (-t) breaks if aws spend more than 5 seconds to start the stack. · Issue #515 · cloudtools/stacker · GitHub

Tail (-t) breaks if aws spend more than 5 seconds to start the stack. #515


Closed
Evnsan opened this issue Nov 29, 2017 · 5 comments

Comments

@Evnsan
Evnsan commented Nov 29, 2017

Hey folks, I'm hitting this problem when trying to deploy a stack with an ASG and CodeDeploy. Do you think it's a big deal?

[2017-11-29T15:10:55] Tailing stack: alb-test-1-app
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 541, in tail_stack
    self.tail_stack(stack, retries=retries + 1, **kwargs)
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 541, in tail_stack
    self.tail_stack(stack, retries=retries + 1, **kwargs)
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 541, in tail_stack
    self.tail_stack(stack, retries=retries + 1, **kwargs)
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 541, in tail_stack
    self.tail_stack(stack, retries=retries + 1, **kwargs)
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 541, in tail_stack
    self.tail_stack(stack, retries=retries + 1, **kwargs)
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 535, in tail_stack
    include_initial=False)
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 577, in tail
    initial_events = self.get_events(stack_name)
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 562, in get_events
    StackName=stackname
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/evnsan/git/cobli/deploy/cloudformation/cloudform-venv/local/lib/python2.7/site-packages/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
ClientError: An error occurred (ValidationError) when calling the DescribeStackEvents operation: Stack [alb-test-1-app] does not exist

stacker/providers/aws/default.py:

 24 MAX_TAIL_RETRIES = 5
.
.
.
532         try:
533             self.tail(stack.fqn,
534                       log_func=log_func,
535                       include_initial=False)
536         except botocore.exceptions.ClientError as e:
537             if "does not exist" in e.message and retries < MAX_TAIL_RETRIES:
538                 # stack might be in the process of launching, wait for a second
539                 # and try again
540                 time.sleep(1)
541                 self.tail_stack(stack, retries=retries + 1, **kwargs)
542             else:
543                 raise
543                 raise
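
A side note on the check at line 537: `e.message` only exists on Python 2 exceptions, while botocore's `ClientError` also carries the parsed error in its `response` dict. A minimal sketch of a check built on that dict is below. The helper name `is_missing_stack_error` and the `FakeClientError` stand-in are hypothetical, used here only so the example runs without botocore installed; this is not code from stacker.

```python
class FakeClientError(Exception):
    """Hypothetical stand-in shaped like botocore.exceptions.ClientError."""

    def __init__(self, code, message):
        super().__init__(message)
        # botocore stores the parsed service error under response["Error"].
        self.response = {"Error": {"Code": code, "Message": message}}


def is_missing_stack_error(error):
    """Return True if the error looks like a 'stack does not exist' failure."""
    details = getattr(error, "response", {}).get("Error", {})
    return (details.get("Code") == "ValidationError"
            and "does not exist" in details.get("Message", ""))


err = FakeClientError("ValidationError",
                      "Stack [alb-test-1-app] does not exist")
print(is_missing_stack_error(err))
```

Matching on the error code as well as the message text keeps unrelated errors that happen to contain "does not exist" from being retried.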
@phobologic
Member

Hey @Evnsan - sorry for the delay in getting back to you; re:Invent and the holidays took up a lot of my focus. This does seem like an issue, though honestly I'm not sure what the solution is. We could increase the number of retries it allows, though I'm not sure what the right balance is.

To be honest, we rarely (hardly ever) use --tail at Remind, so I'd love a little guidance on what you (and others who do use it) believe would be the best behavior here. Thanks!

@ajk8
ajk8 commented Mar 5, 2018

I use it all the time. It is very nice to see the events as they flow past. My 2c: retry every 5 seconds instead of every 1, and keep the number of retries at 5. I think the 1-second loop is unnecessary, and there's no need to spend your API limits on that.
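
For illustration, the suggestion above (5 retries, 5 seconds apart) could be sketched as a loop instead of a recursive call. Everything here is an assumption for the sake of the example: `tail_with_retries`, its parameters, and the `StackDoesNotExist` exception are hypothetical names, not stacker's API.

```python
import time


class StackDoesNotExist(Exception):
    """Hypothetical stand-in for the 'does not exist' ClientError."""


def tail_with_retries(tail, max_retries=5, delay=5, sleep=time.sleep):
    """Call tail() and retry while the stack is still missing.

    A loop is used instead of recursion, so a failure after the last
    retry produces a single traceback frame rather than one per retry.
    """
    attempt = 0
    while True:
        try:
            return tail()
        except StackDoesNotExist:
            if attempt >= max_retries:
                raise  # out of retries, surface the original error
            attempt += 1
            sleep(delay)  # wait before asking CloudFormation again
```

With these defaults the call waits up to 25 seconds in total for the stack to appear, and makes at most 6 API calls. The `sleep` parameter is injectable only so the behavior can be exercised without real delays.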

On a related note, this exception now occurs (as of 1.2.0) when a stack is in a completed teardown state. So, if a stack is skipped because it didn't exist, the exception will show up. If you just reach the end of the tail on a destroy operation, the exception will show up. I've been in the code trying to figure out where to catch these issues, but no joy yet. Hopefully someone with deeper knowledge of the code can work it out pretty quickly. I think I'm going to have to downgrade for now.
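
One possible way to express the behavior described above is to treat a missing stack as the normal end of the tail when the operation is a destroy, and as an error otherwise. This is only a sketch under that assumption; `tail_for_action`, the `action` argument, and `StackDoesNotExist` are hypothetical and do not reflect how stacker handles it.

```python
class StackDoesNotExist(Exception):
    """Hypothetical stand-in for the 'does not exist' ClientError."""


def tail_for_action(tail, action):
    """Run tail(); on a destroy, a missing stack means teardown finished."""
    try:
        tail()
    except StackDoesNotExist:
        if action != "destroy":
            raise  # on other actions a missing stack is still an error
        return "stack gone"
    return "tail ended"
```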

@phobologic
Member

@ajk8 can you give 1.3 a try and let me know if you see the same issue? A lot of changes have been made around this code, so hopefully you won't be running into the issue as much any longer.

@ajk8
ajk8 commented May 29, 2018

This appears to still be an issue in 1.3.0. I ran a destroy operation, and got no information back. Not even summaries. Just a series of stacktraces just like the one described above.

phobologic added a commit that referenced this issue Sep 26, 2018
Also, extend the way we retry/timeout, that should work around #515.

Not sure of a great way to test this unfortunately.
phobologic added a commit that referenced this issue Oct 13, 2018
* Get rid of recursion for tail retries

Also, extend the way we retry/timeout, that should work around #515.

Not sure of a great way to test this unfortunately.

* Add some tests
@phobologic
Member

Hey @ajk8 - I know this is a long time coming, but I believe this should have been fixed in #663. Closing this out!

phrohdoh pushed a commit to phrohdoh/stacker that referenced this issue Dec 18, 2018
* Get rid of recursion for tail retries

Also, extend the way we retry/timeout, that should work around cloudtools#515.

Not sure of a great way to test this unfortunately.

* Add some tests