vine: make throttling worker transfer a tunable parameter · Issue #4101 · cooperative-computing-lab/cctools · GitHub


Open
JinZhou5042 opened this issue Apr 1, 2025 · 5 comments

Comments

@JinZhou5042
Member

In vine_current_transfers.c:set_throttle, we check whether a worker's number of consecutive transfer failures exceeds a threshold, or whether its total failures outnumber its successes. If either condition is met, we assume the worker is misbehaving and remove it.

Image
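For readers without the screenshot, the check described above can be sketched roughly as follows. This is a minimal illustration and not the actual cctools code: the struct, its field names, and the MAX_BAD_STREAK threshold are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-worker transfer counters; not the real cctools struct. */
struct transfer_stats {
	int total_good; /* transfers that completed */
	int total_bad;  /* transfers that failed */
	int streak_bad; /* failures since the last success */
};

/* Hypothetical threshold for consecutive failures. */
#define MAX_BAD_STREAK 5

/* Record the outcome of one transfer in the worker's counters. */
static void record_transfer(struct transfer_stats *s, bool success)
{
	assert(s != NULL);
	if (success) {
		s->total_good++;
		s->streak_bad = 0; /* a success ends the failure streak */
	} else {
		s->total_bad++;
		s->streak_bad++;
	}
}

/* Return true if either condition from the description holds:
 * too many consecutive failures, or more failures than successes. */
static bool should_throttle(const struct transfer_stats *s)
{
	assert(s != NULL);
	return s->streak_bad > MAX_BAD_STREAK || s->total_bad > s->total_good;
}
```

With counters like these, a worker with three successes followed by four failures would be flagged by the second condition even though its failure streak is still under the threshold.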

However, with file pruning enabled, workers that are transferring files may receive very frequent pruning requests. This causes many transfer failures to be reported back to the manager, but these failures are expected.

Therefore, I propose making the throttling behavior a tunable parameter, as it might be unnecessary in this scenario.

@btovar
Member
btovar commented Apr 1, 2025

Would it make sense to instead ignore transfer failures if they correspond to pruned files? If the correct behavior depends on finely tuning a parameter then we need to reconsider the approach.
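One way to picture this suggestion is the sketch below. It assumes the manager can distinguish a failure caused by pruning from any other failure, which is an assumption made only for illustration. The enum values, struct, and function names are hypothetical and are not part of the cctools API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transfer outcomes; assumes the reason is reported to the manager. */
enum xfer_outcome {
	XFER_OK,            /* transfer completed */
	XFER_FAILED,        /* genuine failure */
	XFER_FAILED_PRUNED  /* file was pruned mid-transfer (expected) */
};

/* Hypothetical per-worker counters used for the throttling decision. */
struct xfer_counters {
	int good;
	int bad;
	int bad_streak;
};

/* Update the counters, skipping failures that are explained by pruning.
 * Returns true if the outcome was counted, false if it was ignored. */
static bool count_outcome(struct xfer_counters *c, enum xfer_outcome outcome)
{
	assert(c != NULL);
	switch (outcome) {
	case XFER_OK:
		c->good++;
		c->bad_streak = 0;
		return true;
	case XFER_FAILED:
		c->bad++;
		c->bad_streak++;
		return true;
	case XFER_FAILED_PRUNED:
		/* Expected under pruning: do not penalize the worker. */
		return false;
	}
	return false;
}
```

The design choice here is that pruned-file failures leave both the failure total and the failure streak untouched, so no tunable parameter is needed to keep well-behaved workers from being removed.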

@JinZhou5042
Member Author

On my end, most of the pruning errors fell into this line (with fixes in #4099):

Image

The root cause was pruning, but the manager only received an 'unknown error' message, so it cannot determine the actual error type.

@JinZhou5042
Member Author

I didn't observe this error previously because the worker crashed immediately under that condition, so only one error message was sent back.

@btovar
Member
btovar commented Apr 1, 2025

Got it! In general we shouldn't add a parameter to make something work. The general rule is to add a parameter for something that works acceptably well to make it work really well once we know something about the workflow. We do not want to add a parameter that would lead us to ignore a problem.

@JinZhou5042
Member Author

That's a good rule to remember! Sounds like we need to find a proper solution, not a workaround!
