8000 Minor improvements to single-worker-timeout support by PeteW · Pull Request #2667 · spotify/luigi · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Minor improvements to single-worker-timeout support #2667

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 11, 2019

Conversation

PeteW
Copy link
Contributor
@PeteW PeteW commented Mar 8, 2019

Description

PR #1860 by @daveFNbuck gives us support for worker-timeout support
in single-worker configuration.

Motivation and Context

  • Currently the documentation states that worker-timeout setting
    only applies when using multiple workers. This is no longer true
    after PR Enables worker timeouts with only one worker #1860. This PR updates the documentation to clarify.
  • Add test to ensure/prove timeout feature works without
    multiple workers
  • Currently, for single-worker setups, the error message
    related to timeout is missing the actual timeout seconds
    in the error message.
Task TimeoutThing__99914b932b timed out after None seconds and was terminated.

The issue is None seconds is invalid. The PR addresses that issue
by ensuring worker_timeout is a property of TaskProcess.

Have you tested this? If so, how?

I have tested this on my own workflows in python 2/3 environments.
Also I've locally executed the tests as outlined in the contributing guideline.

Adjustments on top of PR spotify#1860
- Update documentation to clarify that worker-timeout
  configuration is supported for single workers
- Add test to ensure/prove timeout feature works without
  setting the num-workers setting
- Currently, for single-worker setups, the error message
  related to timeout is missing the actual timeout seconds
  in the error message. These changes improve the error
  message such that the logs properly display the timeout
Copy link
Collaborator
@dlstadther dlstadther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Much thanks for updating documentation and adding test to prove it!

Copy link
Contributor
@Tarrasch Tarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Nice PR! Did you check that the logging still looks okay for multiple workers?

@PeteW
Copy link
Contributor Author
PeteW commented Mar 10, 2019

@Tarrasch This is an excellent question and I should have included this in the PR.
Please inspect my proof below:

(env) [pw@localhost luigitimeouts]$ cat timeoutexp.py
import luigi
import time

class TimeoutMultipleWorkers(luigi.WrapperTask):
    def requires(self):
        yield TimeoutThingA()
        yield TimeoutThingB()


class TimeoutThingA(luigi.Task):
    def output(self):
        return luigi.LocalTarget(self.task_id)

    def run(self):
        time.sleep(5)
        print "OK! Im awake now!"
        self.output().open().write()


class TimeoutThingB(luigi.Task):
    def output(self):
        return luigi.LocalTarget(self.task_id)

    def run(self):
        time.sleep(5)
        print "OK! Im awake now!"
        self.output().open().write()

and running with multiple workers:

(env) [pw@localhost luigitimeouts]$ luigi --module timeoutexp TimeoutMultipleWorkers --worker-timeout=3 --workers=2                                                         [30/1656]

DEBUG: Checking if TimeoutMultipleWorkers() is complete
DEBUG: Checking if TimeoutThingA() is complete
DEBUG: Checking if TimeoutThingB() is complete
INFO: Informed scheduler that task   TimeoutMultipleWorkers__99914b932b   has status   PENDING
INFO: Informed scheduler that task   TimeoutThingB__99914b932b   has status   PENDING
INFO: Informed scheduler that task   TimeoutThingA__99914b932b   has status   PENDING
INFO: Done scheduling tasks
INFO: Running Worker with 2 processes
DEBUG: Asking scheduler for work...
DEBUG: Pending tasks: 3
DEBUG: Asking scheduler for work...
INFO: [pid 10893] Worker Worker(salt=576581956, workers=2, host=localhost.localdomain, username=pw, pid=10885) running   TimeoutThingA()
DEBUG: Pending tasks: 2
DEBUG: 2 running tasks, waiting for next task to finish
INFO: [pid 10894] Worker Worker(salt=576581956, workers=2, host=localhost.localdomain, username=pw, pid=10885) running   TimeoutThingB()
DEBUG: 2 running tasks, waiting for next task to finish
DEBUG: 2 running tasks, waiting for next task to finish
DEBUG: 2 running tasks, waiting for next task to finish
INFO: Task TimeoutThingA__99914b932b timed out after 3 seconds and was terminated.
INFO: Task TimeoutThingB__99914b932b timed out after 3 seconds and was terminated.
INFO: Informed scheduler that task   TimeoutThingA__99914b932b   has status   FAILED
DEBUG: Asking scheduler for work...
DEBUG: Done
DEBUG: There are no more tasks to run at this time
DEBUG: TimeoutThingB__99914b932b is currently run by worker Worker(salt=576581956, workers=2, host=localhost.localdomain, username=pw, pid=10885)
INFO: Task TimeoutThingB__99914b932b died unexpectedly with exit code -15
INFO: Informed scheduler that task   TimeoutThingB__99914b932b   has status   FAILED
DEBUG: Asking scheduler for work...
DEBUG: Done
DEBUG: There are no more tasks to run at this time
DEBUG: There are 3 pending tasks possibly being run by other workers
DEBUG: There are 3 pending tasks last scheduled by this worker
INFO: Worker Worker(salt=576581956, workers=2, host=localhost.localdomain, username=pw, pid=10885) was stopped. Shutting down Keep-Alive thread
INFO:
===== Luigi Execution Summary =====

Scheduled 3 tasks of which:
* 2 failed:
    - 1 TimeoutThingA()
    - 1 TimeoutThingB()
* 1 were left pending, among these:
    * 1 had failed dependencies:
        - 1 TimeoutMultipleWorkers()

This progress looks :( because there were failed tasks

===== Luigi Execution Summary =====

@Tarrasch Tarrasch merged commit af6cf07 into spotify:master Mar 11, 2019
@Tarrasch
Copy link
Contributor

Thanks for your contribution!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0