Message-ID: <20240520132040.259477-1-zhuqiuer1@huawei.com>
Date: Mon, 20 May 2024 21:20:40 +0800
From: <zhuqiuer1@...wei.com>
To: <anna-maria@...utronix.de>, <frederic@...nel.org>, <tglx@...utronix.de>,
<linux-kernel@...r.kernel.org>
CC: <zhuqiuer1@...wei.com>
Subject: Question: One-jiffy latency from the checking in run_local_timers()
Hi there, the function "kernel/time/timer.c:run_local_timers" avoids raising a softirq when there are no timers set to expire at the current time.
It achieves this by comparing the current "jiffies" with "base->next_expiry".
However, on SMP systems it is possible that a few CPUs read jiffies while another CPU is incrementing it.
These CPUs may read the old jiffies value in "run_local_timers" and fail to run timers that expire at this jiffy.
This results in a one-jiffy latency for the timers. Can we simply add 1 to the "jiffies" value when we compare it with next_expiry?
This may result in an unnecessary softirq being raised if a timer expires in the next jiffy, but can remove the one-jiffy latency.
Not sure if this is a positive trade-off.
Below is an example we found in which a few CPUs read the old jiffies value while cpu-0 is updating jiffies:
<idle>-0 [000] d.h. 133.492480: do_timer: updated_jiffies: 4294950645
<idle>-0 [010] d.h. 133.492480: run_local_timers: base->next_expiry: 5368691712, jiffies: 4294950644
<idle>-0 [001] d.h. 133.492480: run_local_timers: base->next_expiry: 4294950645, jiffies: 4294950644
...
<idle>-0 [006] d.h. 133.492481: run_local_timers: base->next_expiry: 4294967808, jiffies: 4294950645
...
We found that in this case the timer on cpu-1 was invoked in the next jiffy rather than in the jiffy in which it was expected to expire.
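
To make the comparison concrete, here is a minimal userspace sketch of the check and of the proposed "+1" variant. It is only a model under stated assumptions, not the kernel code: the helper names (counter_after_eq, raise_current, raise_proposed) are hypothetical, jiffies is modelled as a plain 64-bit counter, and the comparison is written in the wrap-safe style of the kernel's time_after_eq().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model only. The helper names are hypothetical and are not
 * kernel functions. jiffies is modelled as a 64-bit counter, which
 * matches the width of the values seen in the trace.
 */

/* Wrap-safe "a >= b" for counters, in the style of time_after_eq(). */
static bool counter_after_eq(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) >= 0;
}

/*
 * Check as described in this mail: raise the softirq only when the
 * jiffies value read by this CPU has reached next_expiry.
 */
static bool raise_current(uint64_t jiffies_snapshot, uint64_t next_expiry)
{
	return counter_after_eq(jiffies_snapshot, next_expiry);
}

/*
 * Proposed variant: compare against jiffies + 1, so a CPU that read
 * the value just before the increment still raises the softirq.
 */
static bool raise_proposed(uint64_t jiffies_snapshot, uint64_t next_expiry)
{
	return counter_after_eq(jiffies_snapshot + 1, next_expiry);
}
```

With the cpu-1 values from the trace (jiffies 4294950644, next_expiry 4294950645), raise_current() returns false and raise_proposed() returns true. With the cpu-10 and cpu-6 values both return false, so no softirq is raised on those CPUs in either case. The cost shows up when jiffies is already up to date and a timer is due one jiffy later: raise_proposed() then returns true one tick early.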