Message-ID: <2637dc3c-80b5-40d7-b0e1-22ccdeba848d@amd.com>
Date: Mon, 14 Apr 2025 20:48:15 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
CC: Jan Kiszka <jan.kiszka@...mens.com>, Aaron Lu <ziqianlu@...edance.com>,
Valentin Schneider <vschneid@...hat.com>, <linux-rt-users@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Thomas Gleixner <tglx@...utronix.de>, Juri
Lelli <juri.lelli@...hat.com>, Clark Williams <williams@...hat.com>, "Luis
Claudio R. Goncalves" <lgoncalv@...hat.com>, Andreas Ziegler
<ziegler.andreas@...mens.com>, Felix Moessbauer
<felix.moessbauer@...mens.com>, Florian Bezdeka <florian.bezdeka@...mens.com>
Subject: Re: [RT BUG] Stall caused by eventpoll, rwlocks and CFS bandwidth
controller
Hello Sebastian,
On 4/14/2025 8:35 PM, Sebastian Andrzej Siewior wrote:
> On 2025-04-14 20:20:04 [+0530], K Prateek Nayak wrote:
>> Note: I could not reproduce the splat with !PREEMPT_RT kernel
>> (CONFIG_PREEMPT=y) or with small loops counts that don't exhaust the
>> cfs bandwidth.
>
> Not sure what this has to do with anything.
Let me clarify a bit more:
- Fair task with cfs_bandwidth limits triggers the prctl(666, 50000000)
- The prctl() takes a read_lock_irq(), except on PREEMPT_RT this does not
disable interrupts.
- I take a dummy raw_spin_lock to stall preemption
- Within the read_lock critical section, I queue a timer that takes the
read_lock.
- I also wake up a high priority RT task that takes the
write_lock
As soon as I drop the dummy raw_spin_lock:
- High priority RT task runs, tries to take the write_lock but cannot
since the preempted fair task still holds the read end.
- Next ktimerd runs trying to grab the read_lock() but is put in the
slowpath since the RT task has tried to take the write_lock
- The fair task runs out of bandwidth and is preempted, but resuming it
requires ktimerd to run the replenish function, which is queued behind
the timer function already blocked trying to grab the read_lock()
Isn't this the scenario that Valentin's original summary describes?
If I've got something wrong please do correct me.
> On !RT the read_lock() in the timer can be acquired even with a pending
> writer. The writer keeps spinning until the main thread is gone. There
> should be no RCU boosting but the RCU still is there, too.
On !RT, the read_lock_irq() in fair task will not be preempted in the
first place, so progress is guaranteed that way, right?
>
> On RT the read_lock() in the timer block, the write blocks, too. So
> every blocker on the lock is scheduled out until the reader is gone. On
> top of that, the reader gets RCU boosted with FIFO-1 by default to get
> out.
Except there is a circular dependency now:
- fair task needs bandwidth replenishment to progress and drop lock.
- rt task needs fair task to drop the lock before it can grab the write
end.
- ktimerd requires rt task to grab and drop the lock to make progress.
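To make the three dependencies above concrete, here is a minimal
userspace sketch that models them as a wait-for graph and checks
whether an actor sits on a cycle. This is only an illustration, not
kernel code: the enum, build_rt_scenario() and in_wait_cycle() are
hypothetical names made up for this example, and each actor is assumed
to wait on at most one other actor.

```c
#include <stdbool.h>

/* Actors from the scenario; names are illustrative only. */
enum actor { FAIR_TASK, RT_TASK, KTIMERD, NR_ACTORS };

/*
 * waits_on[a] is the one actor that 'a' needs to make progress first,
 * or -1 if 'a' can run without help from anyone else.
 */
void build_rt_scenario(int waits_on[NR_ACTORS])
{
	waits_on[FAIR_TASK] = KTIMERD;   /* needs bandwidth replenishment */
	waits_on[RT_TASK]   = FAIR_TASK; /* needs the read end dropped */
	waits_on[KTIMERD]   = RT_TASK;   /* reader queued behind the writer */
}

/* Follow the wait-for chain from 'start'; true if it loops back to it. */
bool in_wait_cycle(const int waits_on[NR_ACTORS], int start)
{
	int cur = start;

	for (int steps = 0; steps < NR_ACTORS; steps++) {
		cur = waits_on[cur];
		if (cur < 0)
			return false;	/* chain ends at a runnable actor */
		if (cur == start)
			return true;	/* came back to where we started */
	}
	return false;		/* 'start' itself is not on a cycle */
}
```

With the graph from build_rt_scenario(), in_wait_cycle() returns true
for all three actors. Setting any one entry to -1 (for example the fair
task no longer needing replenishment) breaks the cycle in this model.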
I'm fairly new to the PREEMPT_RT bits so if I've missed something,
please do let me know and sorry for any noise.
>
> Sebastian
--
Thanks and Regards,
Prateek