Message-ID: <f320d90db90df2d9583a1af4d83880f052768a64.camel@siemens.com>
Date: Tue, 22 Apr 2025 16:54:28 +0200
From: Florian Bezdeka <florian.bezdeka@...mens.com>
To: K Prateek Nayak <kprateek.nayak@....com>, Aaron Lu
<ziqianlu@...edance.com>
Cc: Jan Kiszka <jan.kiszka@...mens.com>, Valentin Schneider
<vschneid@...hat.com>, Ben Segall <bsegall@...gle.com>, Peter Zijlstra
<peterz@...radead.org>, Josh Don <joshdon@...gle.com>, Ingo Molnar
<mingo@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, Xi Wang
<xii@...gle.com>, linux-kernel@...r.kernel.org, Juri Lelli
<juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Mel Gorman <mgorman@...e.de>,
Chengming Zhou <chengming.zhou@...ux.dev>, Chuyi Zhou
<zhouchuyi@...edance.com>, "Sebastian Andrzej Siewior,"
<bigeasy@...utronix.de>
Subject: Re: [RFC PATCH v2 0/7] Defer throttle when task exits to user
On Tue, 2025-04-22 at 08:24 +0530, K Prateek Nayak wrote:
> Hello Aaron,
>
> On 4/22/2025 7:40 AM, Aaron Lu wrote:
> > > anon_pipe_write()
> > > __wake_up_common()
> > > ep_poll_callback() {
> > > read_lock_irq(&ep->lock) /* Read lock acquired here */
> > I was confused by this function's name. I had thought irq is off but
> > then realized under PREEMPT_RT, read_lock_irq() doesn't disable irq...
>
> Yup! Most of the interrupt handlers are run by the IRQ threads on
> PREEMPT_RT and the ones that do run in the interrupt context have all
> been adapted to use non-blocking locks whose *_irq variants disables
> interrupts on PREEMPT_RT too.
>
> >
> > > __wake_up_common()
> > > ep_autoremove_wake_function()
> > > try_to_wake_up() /* Wakes up "epoll-stall" */
> > > preempt_schedule()
> > > ...
> > >
> > > # "epoll-stall-writer" has run out of bandwidth, needs replenish to run
> > Luckily in this "only throttle when ret2user" model, epoll-stall-writer
> > does not need replenish to run again(and then unblock the others).
>
> I can confirm that throttle deferral solves this issue. I have run Jan's
> reproducer for a long time without seeing any hangs on your series. I
> hope Florian can confirm the same.
>
Partially, yes.

First, let me clarify what I am testing: I'm testing with PREEMPT_RT
enabled, as that is the setup that causes problems in the field. For
those setups this is not a performance/jitter optimization, it's a
critical fix. The system locks up completely.

I ported the series to 6.14. The motivation was stability and the
possibility of replacing one of the devices in the field with a patched
version. We do not trust anything newer yet.

The test results: 6.14 + backport has been running fine for ~10 days
now on a system where the reproducer (that Jan posted already) crashed
an unpatched 6.14 in a couple of minutes. Success.

But: I also started a test with 6.14 vanilla (so unpatched) on a
different system. This one crashes within a couple of minutes. This is
a completely different story - as the series we're discussing here is
not even applied - but to be complete, below is the last message we get
from the device. The device is completely locked up afterwards. PID 34
is ktimers on CPU1.

kernel: ------------[ cut here ]------------
kernel: !se->on_rq
kernel: WARNING: CPU: 1 PID: 34 at kernel/sched/fair.c:699 update_entity_lag+0x7d/0x90
kernel: Modules linked in: veth xt_nat nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfr>
kernel: sd_mod mptspi ata_generic mptscsih mptbase psmouse scsi_transport_spi ata_piix libata scs>
kernel: CPU: 1 UID: 0 PID: 34 Comm: ktimers/1 Not tainted 6.14.0 #1
kernel: Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.242>
kernel: RIP: 0010:update_entity_lag+0x7d/0x90
kernel: Code: 0f 4d d7 48 89 53 78 5b 5d c3 cc cc cc cc 80 3d e7 f4 dd 01 00 75 a9 48 c7 c7 d0 81 >
kernel: RSP: 0018:ffffacf58012fbe8 EFLAGS: 00010082
kernel: RAX: 0000000000000000 RBX: ffff9ee43ca00080 RCX: 0000000000000027
kernel: RDX: ffff9ee6efd21988 RSI: 0000000000000001 RDI: ffff9ee6efd21980
kernel: RBP: ffff9ee421929800 R08: 00000000462951bd R09: ffffffff8e654811
kernel: R10: ffffffff8e654811 R11: ffffffff8e608a2a R12: 000000000000000e
kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 000000000000000e
kernel: FS: 0000000000000000(0000) GS:ffff9ee6efd00000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 000000c00082a000 CR3: 0000000113416002 CR4: 00000000007706f0
kernel: PKRU: 55555554
kernel: Call Trace:
kernel: <TASK>
kernel: ? __warn+0x91/0x190
kernel: ? update_entity_lag+0x7d/0x90
kernel: ? report_bug+0x164/0x190
kernel: ? handle_bug+0x58/0x90
kernel: ? exc_invalid_op+0x17/0x70
kernel: ? asm_exc_invalid_op+0x1a/0x20
kernel: ? ret_from_fork_asm+0x1a/0x30
kernel: ? ret_from_fork+0x31/0x50
kernel: ? ret_from_fork+0x31/0x50
kernel: ? update_entity_lag+0x7d/0x90
kernel: ? update_entity_lag+0x7d/0x90
kernel: dequeue_entity+0x90/0x5a0
kernel: dequeue_entities+0x121/0x640
kernel: dequeue_task_fair+0xbf/0x290
kernel: rt_mutex_setprio+0x37c/0x690
kernel: rtlock_slowlock_locked+0xca1/0x1860
kernel: ? lock_acquire+0xcb/0x2e0
kernel: ? run_ktimerd+0xe/0x80
kernel: ? __pfx_smpboot_thread_fn+0x10/0x10
kernel: rt_spin_lock+0x86/0x160
kernel: __local_bh_disable_ip+0x9d/0x190
kernel: ksoftirqd_run_begin+0xe/0x30
kernel: run_ktimerd+0xe/0x80
kernel: smpboot_thread_fn+0xda/0x1d0
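
A note on reading the trace above: entries prefixed with "?" are return
addresses found on the stack that the unwinder could not verify, so only
the unprefixed entries form the reliable call chain. The following is a
minimal sketch, not part of the original report, of how the warning site
and the reliable frames could be pulled out of such a log. The function
names parse_warning() and reliable_frames() and the regular expressions
are assumptions made for illustration; they only target the journal-style
"kernel: " prefix seen in this message.

```python
import re

# Hypothetical helpers for illustration; the patterns assume the
# "kernel: " line prefix used in the log quoted in this message.
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>\w+)\+"
)
FRAME_RE = re.compile(
    r"^kernel:\s+(?P<unreliable>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+$"
)


def parse_warning(log_lines):
    """Return CPU, PID, source location and function of the first WARNING."""
    for line in log_lines:
        m = WARN_RE.search(line)
        if m:
            return {
                "cpu": int(m.group("cpu")),
                "pid": int(m.group("pid")),
                "file": m.group("file"),
                "line": int(m.group("line")),
                "func": m.group("func"),
            }
    return None


def reliable_frames(log_lines):
    """Return function names of frames that are not marked with '?'.

    Frames are returned in log order, i.e. innermost call first.
    """
    frames = []
    for line in log_lines:
        m = FRAME_RE.match(line.strip())
        if m and not m.group("unreliable"):
            frames.append(m.group("func"))
    return frames
```

Fed with the lines of the trace above, reliable_frames() would keep only
the unprefixed entries, from dequeue_entity down to smpboot_thread_fn,
which is the path from the ktimers thread through rt_spin_lock and
rt_mutex_setprio into the fair-class dequeue.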