linux-kernel - Re: [RT BUG] Stall caused by eventpoll, rwlocks and CFS bandwidth controller

Message-ID: <e2d5f785-193b-46de-9a64-5cefa5196345@siemens.com>
Date: Tue, 15 Apr 2025 12:23:53 +0200
From: Jan Kiszka <jan.kiszka@...mens.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: K Prateek Nayak <kprateek.nayak@....com>,
 Aaron Lu <ziqianlu@...edance.com>, Valentin Schneider <vschneid@...hat.com>,
 linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
 Thomas Gleixner <tglx@...utronix.de>, Juri Lelli <juri.lelli@...hat.com>,
 Clark Williams <williams@...hat.com>,
 "Luis Claudio R. Goncalves" <lgoncalv@...hat.com>,
 Andreas Ziegler <ziegler.andreas@...mens.com>,
 Felix Moessbauer <felix.moessbauer@...mens.com>,
 Florian Bezdeka <florian.bezdeka@...mens.com>
Subject: Re: [RT BUG] Stall caused by eventpoll, rwlocks and CFS bandwidth
 controller

On 15.04.25 10:00, Sebastian Andrzej Siewior wrote:
> On 2025-04-15 08:54:01 [+0200], Jan Kiszka wrote:
>> On 15.04.25 08:23, Sebastian Andrzej Siewior wrote:
>>> On 2025-04-15 07:35:50 [+0200], Jan Kiszka wrote:
>>>>> On RT the read_lock() in the timer blocks, and the writer blocks, too. So
>>>>> every blocker on the lock is scheduled out until the reader is gone. On
>>>>> top of that, the reader gets RCU boosted with FIFO-1 by default to get
>>>>> out.
>>>>
>>>> There is no boosting of the active readers on RT as there is no
>>>> information recorded about who is currently holding a read lock. This is
>>>> the whole point why rwlocks are hairy with RT, I thought.
>>>
>>> Kind of, yes. PREEMPT_RT has by default RCU boosting enabled with
>>> SCHED_FIFO 1. If you acquire a readlock you start an RCU section. If you
>>> get stuck in an RCU section for too long then this boosting will take
>>> effect by making the task, within the RCU section, the owner of the
>>> boost-lock and the boosting task will try to acquire it. This is used to
>>> get SCHED_OTHER tasks out of the RCU section.
>>> But if a SCHED_FIFO task is on the CPU then this boosting will have
>>> no effect because the scheduler will not switch to a task with lower
>>> priority.
>>
>> Does that boosting happen to need ktimersd or ksoftirqd (which both are
>> stalling in our case)? I'm still looking for the reason why it does not
>> help in the observed stall scenarios.
> 
> Your problem is that you likely have many readers which need to get out
> first. That spinlock replacement will help. I'm not sure about the CFS
> patch referenced in the thread here.

Nope, we only have two readers, one which is scheduled out by CFS and
another one - in soft IRQ context - that is getting stuck after the
writer promoted the held lock to a write lock.
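
To spell out that sequence, here is a toy user-space model in Python. It is
only a sketch, not kernel code: the class RtRwLock, the function simulate()
and the task names are made up for illustration, and it assumes that a
queued writer keeps new readers out until the existing reader has left.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RtRwLock:
    """Toy rwlock: a waiting writer blocks all new readers (assumption)."""
    readers: set = field(default_factory=set)  # tasks holding the read lock
    writer_pending: bool = False               # a writer waits for readers to drain
    writer: Optional[str] = None               # task holding the write lock

    def read_lock(self, task: str) -> bool:
        # A new reader cannot enter once a writer holds or waits for the lock.
        if self.writer is not None or self.writer_pending:
            return False
        self.readers.add(task)
        return True

    def read_unlock(self, task: str) -> None:
        self.readers.discard(task)

    def write_lock(self, task: str) -> bool:
        # The writer only gets the lock after every reader has left.
        if self.writer is None and not self.readers:
            self.writer = task
            self.writer_pending = False
            return True
        self.writer_pending = True
        return False


def simulate(reader_a_runnable: bool):
    """Replay the sequence; return the lock and a log of (task, acquired)."""
    lock = RtRwLock()
    log = [("reader_a", lock.read_lock("reader_a"))]
    # reader_a is now scheduled out (CFS bandwidth) while holding the lock.
    log.append(("writer", lock.write_lock("writer")))
    # The soft IRQ reader arrives after the writer and cannot get in.
    log.append(("softirq_reader", lock.read_lock("softirq_reader")))
    if reader_a_runnable:
        # Only when reader_a runs again can it drop the lock for the writer.
        lock.read_unlock("reader_a")
        log.append(("writer", lock.write_lock("writer")))
    return lock, log
```

With simulate(reader_a_runnable=False) both the writer and the soft IRQ
reader fail to acquire the lock and reader_a remains the only holder, which
is the stall. With reader_a_runnable=True the writer gets the lock as soon
as reader_a has unlocked.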

> 
> That boosting requires an RCU reader that starts the mechanism (on RCU
> unlock). But I don't think that it will help. You would also need to
> raise the priority up to the writer level (manually) and that will
> likely break other things. It is meant to unstick SCHED_OTHER tasks and
> not to boost stuck readers as a side effect. Also I am not sure how that
> works with multiple tasks.

OK, that is likely why we don't see it kicking in to help us out.

Jan

-- 
Siemens AG, Foundational Technologies
Linux Expert Center