Message-ID: <xhsmhy1lwifi1.mognet@vschneid.remote.csb>
Date: Wed, 10 May 2023 12:37:42 +0100
From: Valentin Schneider <vschneid@...hat.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
linux-rt-users@...r.kernel.org,
Steven Rostedt <rostedt@...dmis.org>,
Juri Lelli <juri.lelli@...hat.com>
Subject: Re: [ANNOUNCE] v6.3.1-rt13
On 09/05/23 18:46, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v6.3.1-rt13 patch set.
>
> Changes since v6.3.1-rt12:
>
> - Two posix-timers patches picked up from the list. They are scheduled
> for upstream inclusion. One of them prevents a livelock on PREEMPT_RT in
> itimer_delete(). Patches by Thomas Gleixner.
>
> - A softirq handling patch from the list, 'revert: "softirq: Let
> ksoftirqd do its job"', from Paolo Abeni. This revert should reduce a
> lot of the trouble which starts once ksoftirqd is woken up.
> The 6.1-RT series has the ktimersd thread which mitigates some of
> the pain. This revert should render the ktimersd patch obsolete.
> Should everything work out as expected I intend to backport this
> patch to the earlier RT series and revert the ktimersd patch in the
> v6.1 series.
The ktimersd threads solved a priority inversion problem we were seeing.
IIRC it looked something like this:
- GP kthread is waiting on swait_event_idle_timeout_exclusive(...)
- p0 (CFS NICE0) did spin_lock(L) then got throttled by CFS bandwidth
- p1 (CFS NICE0) did local_bh_disable() + did spin_lock(L)
So p0 owns L, but cannot get bandwidth replenished since local softirqs are
disabled, and the GP kthread can't be woken up by timeout to initiate
boosting either.
Even if ksoftirqd has its priority tuned to ensure timers can be expired,
the above never wakes ksoftirqd due to:
  static inline bool should_wake_ksoftirqd(void)
  {
          return !this_cpu_read(softirq_ctrl.cnt);
  }
On the other hand, the ktimersd threads are woken up unconditionally, so in
this scenario ktimersd gets to run and donate its priority via

  ksoftirqd_run_begin()
  `\
    local_lock(&softirq_ctrl.lock)
(note that this only solves the CFS bandwidth issue if ktimersd are FIFO or
above, but they are spawned as FIFO1)
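
To make the difference in wakeup behaviour concrete, here is a minimal
userspace sketch of the two paths. It is a toy model, not kernel code: the
struct and every model_* name are made up for illustration, and the only
thing taken from the kernel is the shape of the should_wake_ksoftirqd()
check quoted above. It assumes "cnt" is non-zero while a task on the CPU has
softirqs disabled.

```c
#include <stdbool.h>

/* Toy stand-in for the per-CPU softirq_ctrl state (hypothetical type). */
struct softirq_ctrl_model {
	int cnt;	/* assumed non-zero while softirqs are disabled on this CPU */
};

/* Same shape as the quoted check: no wakeup while cnt is non-zero. */
static bool model_should_wake_ksoftirqd(const struct softirq_ctrl_model *ctrl)
{
	return !ctrl->cnt;
}

/* Which threads get woken when a timer softirq is raised (hypothetical). */
struct wake_result {
	bool ksoftirqd_woken;
	bool ktimersd_woken;
};

static struct wake_result
model_raise_timer_softirq(const struct softirq_ctrl_model *ctrl)
{
	struct wake_result r = {
		/* ksoftirqd is gated on the softirq-disable count */
		.ksoftirqd_woken = model_should_wake_ksoftirqd(ctrl),
		/* ktimersd is woken unconditionally, as described above */
		.ktimersd_woken = true,
	};

	return r;
}
```

With cnt set to 1 (p1 sitting in its local_bh_disable() section), the model
reports that ksoftirqd is not woken while ktimersd is, which is the
asymmetry the scenario depends on.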
TL;DR: for RT, I think we should also kill should_wake_ksoftirqd()