[maintenance] Use Wandalen/wretry.action to auto-retry fail in --pre tests #7986
Not sure how to preview what a fail -> re-run would look like... Edit: here's an example: it looks like the logs for both runs are combined under the job, and then it says how many attempts were made. |
This looks good to me! Great work finding it! I like this solution more than defining repeated test actions like we previously mentioned. Much easier to modify in the future!
From browsing GitHub searching for examples, the most common use case of wretry action appears to be code coverage uploads 🤣 |
Just a thought: would this be useful to add to the comprehensive tests for PRs as well? When we had flaky tests in PRs, it was really confusing for me as a new contributor. I know we've fixed much of the flakiness on comprehensive, but it wouldn't hurt to have it anyway? |
I'm hesitant to auto-rerun PR tests, because the majority of failures there are real (the vast majority, I would expect). A real failure causes every job to fail, and each of those would trigger a re-run. That would mean a lot of extra runner usage, which would slow down our CI massively and make PRs wait. This pre-release workflow runs on our "vetted" code, runs only twice per day, and real failures are rare, so I think it won't be so bad. |
oh my gosh you are 100% right. my bad |
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7986      +/-   ##
==========================================
+ Coverage   92.89%   92.94%   +0.05%
==========================================
  Files         643      647       +4
  Lines       60688    60844     +156
==========================================
+ Hits        56375    56551     +176
+ Misses       4313     4293      -20
|
Let's let @Czaki have a look at this before merging, but I'm happy with the rationale and approach. |
In my experience, quickly restarting actions often ends in another failure, but we can try this.
We need to increase the workflow timeout to 50-60 min. Otherwise, the second attempt may end with a timeout.
Co-authored-by: Grzegorz Bokota <bokota+github@gmail.com>
According to the docs, the default is 360 min, so we should be ok. |
According to the workflow, it is 40 minutes:
|
Oh, sorry I missed that. |
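For reference, the timeout discussed in this thread is a job-level `timeout-minutes` setting in GitHub Actions; when it is absent, the 360-minute default applies. The fragment below is only a sketch of raising it so that a retried run has room to finish. The job name, runner, and test command are hypothetical placeholders, not taken from this repository's workflow.

```yaml
jobs:
  test:  # hypothetical job name
    runs-on: ubuntu-latest  # hypothetical runner
    # Raised from the 40 minutes mentioned above so that a first
    # attempt plus one retry can both complete within the limit.
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - name: Run pre-release tests
        run: echo "test command goes here"  # placeholder command
```

The limit covers the whole job, so the budget has to account for every attempt combined, not just a single run.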
References and relevant issues
Closes: #7978
Our pre-release testing seems particularly sensitive to flaky tests, so a failure issue is opened frequently and is then resolved by rerunning CI. Grzegorz noted that the workflow needs to run frequently because some packages have very short pre-release windows (#7803 (comment)).
Description
This PR attempts to implement auto-rerunning of failed jobs for pre-release tests. The idea is that rerunning a flaky test typically resolves the issue, so let's bake that into the workflow 🤣
The wretry.action has a lot of other configurable settings we could use, but a simple retry-once-on-failure setup seems like a good start.
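As a rough illustration of the "retry once on fail" idea, a test step wrapped in wretry.action might look something like the fragment below. This is a sketch under assumptions, not the exact change in this PR: the step name and the test command are hypothetical, and the `command`, `attempt_limit`, and `attempt_delay` inputs are written as I understand them from the action's README, so they should be checked against the release that is actually pinned.

```yaml
steps:
  - name: Test with retry  # hypothetical step name
    # Pinning to a release tag or commit SHA is safer than a branch.
    uses: Wandalen/wretry.action@master
    with:
      # Placeholder for whatever command runs the pre-release tests.
      command: python -m tox
      # Two attempts in total: the original run plus one retry.
      attempt_limit: 2
      # Assumed to be in milliseconds; a short pause before retrying.
      attempt_delay: 2000
```

With `attempt_limit: 2`, a flaky failure gets one more chance, while a real failure costs at most one extra run, which fits a workflow that only runs twice per day.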