-
Notifications
You must be signed in to change notification settings - Fork 33
Race condition in phase 5 #141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Is there an easy fix for this? |
Not that I'm aware of. I didn't dive deeply though. |
One thing that I see in common there is the |
Maybe... Does that work reliably? |
This happened again and caused an automated test run to fail on the CRS repo the other day. So, this definitely remains an issue. |
We can't use I have an idea how to solve this though. Unfortunately, it will require replacing the current mechanism (maybe with a transition phase). The issue we are facing is that a test request will run in a thread / worker that is different from the marker requests. Hence, the marker requests may complete while processing of the test request is still ongoing. The fix thus has to be to use the same thread / worker as the test request. For this to work we will need an additional rule that logs in phase 5 of every test request. Then we know that we can look for this special log entry to find the end of the log lines for a given test request. Essentially, the end marker we currently use would be replaced with a different end marker. The start marker can still be used because there's no race condition in that case. I'm pretty sure this will work, however, I'm not sure wether a |
Ping @dune73. |
Sorry, I overlooked this so far. Unfortunately, phase 5 is indeed skipped when the drop action is used. I suspect this is because the idea with drop is to free the thread immediately from the connection. I wonder how garbage collection works in this case ... Either way, it's an unfortunate situation. But then I would not invest too much into phase 5. The only rule we use in phase 5 is 980170 (CRSv4) and it's not a very important rule. I would probably be pragmatic and test at reporting level 0 where the rule is inactive. |
I have been encountering this problem more and more frequently over the past week or so, for what it's worth. Several PRs have been marked as 'failing tests' in error. |
Thanks. I'm working on a workaround. |
|
Tests for rules in phase 5 are subject to a rare race condition, as exhibited in the following log for tests
980170-2
and980170-3
(line breaks between rules added for readability):Rule
920350
is triggered as expected for the first test but rule980170
is not. However, looking at theunique_id
field it becomes clear that rule980170
did in fact trigger but was reported as part of the second rule (themsg
field of rule999999
denotes start and stop ID of a rule).The logic for finding triggered rules in the log is:
Apparently, the log entry for rule
980170
may under some circumstances be written to the log much later and, which is worse, independently of other log entries of the same phase (both999999
and980170
run in phase 5), otherwise it would have shown up before the start marker for test980170-3
.The text was updated successfully, but these errors were encountered: