-
Notifications
You must be signed in to change notification settings - Fork 24
drivers:timer:jh7110: prevent potential race #403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I've managed to trigger a potential scenario when this could happen (I say could, cause IRQ might have been before the first read of
the print statement is added here: diff --git a/drivers/timer/jh7110/timer.c b/drivers/timer/jh7110/timer.c
index 321e8b68..80ae9a89 100644
--- a/drivers/timer/jh7110/timer.c
+++ b/drivers/timer/jh7110/timer.c
@@ -88,6 +88,7 @@ static uint64_t get_ticks_in_ns(void)
if (counter_regs->intclr == 1) {
value_h += 1;
value_l = (uint64_t)(STARFIVE_TIMER_MAX_TICKS - counter_regs->value);
+ sddf_dprintf("potential race condition avoided!\n");
}
uint64_t value_ticks = (value_h << 32) | value_l; |
Good find, thank you for the detailed write-up. It would be good to include some kind of assert/check in the example that triggers the race condition in case we have it in any future drivers. |
As mentioned when we talked, it might be difficult to trigger it intentionally. It requires a 32-bit counter to overflow, and depending on device's clock speed it could take a significant amount of time to happen (for this driver ~5min, for the zcu102 driver with 100MHz clock speed - ~42s) and there is very little chance that the race actually happens, had this happen once in ~15-20 overflows with constant 1us timeouts I suggest we either have a separate test to try and catch it, or have design documentation for this 32-bit counter style drivers? |
drivers/timer/jh7110/timer.c
Outdated
@@ -87,6 +87,7 @@ static uint64_t get_ticks_in_ns(void) | |||
/* Include unhandled interrupt in value_h */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* Include unhandled interrupt in value_h */ | |
/* Account for pending counter IRQ */ |
The following interleaving would have resulted in a wrong time reading, when processing timeouts off by up to 2^32-1 ticks in the future (potentially triggering mutiple timeouts falsely): 1. IRQ for TIMEOUT counter 2. enter `get_ticks_in_ns()` 4. store `value_l` as value from the other counter (not the TIMEOUT one) register 3. store`value_h` as read from `counter_timer_elapses` global 5. **!! IRQ of OVERFLOW for the counter !! (pending, not handled)** 6. check if IRQ pending for OVERFLOW counter 1. yes -> increment `value_h` . But `value_l` read before overflow! 7. now new `value_h` and old `value_l` give a value in the future, off by up to $2^{32}-1$ ticks! Signed-off-by: Szymon Duchniewicz <s.duchniewicz@unsw.edu.au> Signed-off-by: Ivan-Velickovic <i.velickovic@unsw.edu.au>
d12d29c
to
1989003
Compare
While writing the zcu102 timer driver (also using 2 32-bit counters) I noticed that we might have a race condition in this driver. The following interleaving would have resulted in a wrong time reading, when processing timeouts off by up to 2^32-1 ticks in the future (potentially triggering multiple timeouts falsely):
get_ticks_in_ns()
value_l
as value from the other counter (not the TIMEOUT one) registervalue_h
as read fromcounter_timer_elapses
globalvalue_h
. Butvalue_l
read before overflow!value_h
and oldvalue_l
give a value in the future, off by up toAfter the fix, the same scenario does not have an issue:
get_ticks_in_ns()
value_l
as value from the other counter (not the TIMEOUT one) registervalue_h
as read fromcounter_timer_elapses
globalvalue_h
. **UPDATEvalue_l
with most recent reading from the overflowed counter.value_h
and newvalue_l
give a correct value for timeI have not yet checked other timer drivers, but pretty sure they all use 64-bit counters so we should be fine.