Message-ID: <20190918144138.24839-1-balasubramani_vivekanandan@mentor.com>
Date: Wed, 18 Sep 2019 16:41:37 +0200
From: Balasubramani Vivekanandan <balasubramani_vivekanandan@...tor.com>
To: <fweisbec@...il.com>, <tglx@...utronix.de>, <mingo@...nel.org>
CC: <balasubramani_vivekanandan@...tor.com>, <erosca@...adit-jv.com>,
<linux-kernel@...r.kernel.org>
Subject: [PATCH V1 0/1] tick: broadcast-hrtimer: Fix a race in bc_set_next
I was investigating an RCU stall warning on an ARM64 Renesas Rcar3
platform. Analysis showed that the warning was raised because the
rcu_preempt kthread was starved of cpu time: rcu_preempt was blocked in
schedule_timeout() and never woken up. On further investigation I found
that local timer interrupts were not occurring on the cpu where the
rcu_preempt kthread was blocked, so rcu_preempt was not woken up when
its timeout expired.
I continued the analysis to find out why the timer failed on that cpu.
I found that the failure happens when the cpu goes through an idle
state cycle. When the cpu enters the idle state, it subscribes to the
tick broadcast clock and shuts down the local timer. On exit from the
idle state, the local timer is supposed to be programmed to fire
interrupts again. In the error scenario, however, the cpu fails to
program the local timer on exit from the idle state. The idle exit path
goes through the code below in __tick_broadcast_oneshot_control(), and
this is where it fails to program the timer hardware:
	now = ktime_get();
	if (dev->next_event <= now) {
		cpumask_set_cpu(cpu, tick_broadcast_force_mask);
		goto out;
	}
The value in next_event is earlier than the current time because the
tick broadcast clock did not wake up the cpu at its subscribed
timeout. The condition arises later, when the cpu is woken up by some
other event. After the cpu is woken up, any further timeout request by
a task on that cpu might fail to program the timer hardware, because
the value in next_event is still earlier than the current time.
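The effect of that check can be illustrated with a small user-space
model. This is only a sketch and not kernel code: struct model_dev,
model_force_mask and model_idle_exit() are hypothetical names invented
for the illustration, time is a plain nanosecond counter, and the model
assumes fewer than 64 cpus. It mirrors only the quoted condition, where
a next_event that is already in the past skips programming the local
timer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cpu clock event device. */
struct model_dev {
	int64_t next_event;	/* expiry the cpu subscribed to, in ns */
	bool hw_programmed;	/* true once the local timer is armed */
};

/* Hypothetical stand-in for tick_broadcast_force_mask, one bit per cpu. */
static uint64_t model_force_mask;

/*
 * Model of the idle exit check: if next_event is not in the future,
 * mark the cpu in the force mask and leave the local timer unarmed.
 * Returns true only when the local timer was programmed.
 */
static bool model_idle_exit(struct model_dev *dev, int cpu, int64_t now)
{
	if (dev->next_event <= now) {
		model_force_mask |= UINT64_C(1) << cpu;
		return false;	/* corresponds to "goto out" */
	}

	dev->hw_programmed = true;
	return true;
}
```

In this model, a cpu that wakes at now = 50 with next_event = 100 arms
its timer, while a cpu that was woken late at now = 150 with the same
next_event only ends up in the force mask and its local timer stays
unprogrammed.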
Then I focused on why the tick broadcast clock failed to wake up the
cpu. I noticed a race condition in the hrtimer based tick broadcast
clock that can leave the tick broadcast hrtimer never restarted. I have
created a patch to fix the race condition. Please review.
Balasubramani Vivekanandan (1):
  tick: broadcast-hrtimer: Fix a race in bc_set_next

 kernel/time/tick-broadcast-hrtimer.c | 58 ++++++++++++++++++++++------
 kernel/time/tick-broadcast.c         |  2 +
 2 files changed, 48 insertions(+), 12 deletions(-)
--
2.17.1