Message-ID: <xhsmh5xbyqas1.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Date: Wed, 29 Oct 2025 11:09:50 +0100
From: Valentin Schneider <vschneid@...hat.com>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, rcu@...r.kernel.org,
x86@...nel.org, linux-arm-kernel@...ts.infradead.org,
loongarch@...ts.linux.dev, linux-riscv@...ts.infradead.org,
linux-arch@...r.kernel.org, linux-trace-kernel@...r.kernel.org, Nicolas
Saenz Julienne <nsaenzju@...hat.com>, Thomas Gleixner
<tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov
<bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, "H. Peter
Anvin" <hpa@...or.com>, Andy Lutomirski <luto@...nel.org>, Peter Zijlstra
<peterz@...radead.org>, Arnaldo Carvalho de Melo <acme@...nel.org>, Josh
Poimboeuf <jpoimboe@...nel.org>, Paolo Bonzini <pbonzini@...hat.com>, Arnd
Bergmann <arnd@...db.de>, "Paul E. McKenney" <paulmck@...nel.org>, Jason
Baron <jbaron@...mai.com>, Steven Rostedt <rostedt@...dmis.org>, Ard
Biesheuvel <ardb@...nel.org>, Sami Tolvanen <samitolvanen@...gle.com>,
"David S. Miller" <davem@...emloft.net>, Neeraj Upadhyay
<neeraj.upadhyay@...nel.org>, Joel Fernandes <joelagnelf@...dia.com>, Josh
Triplett <josh@...htriplett.org>, Boqun Feng <boqun.feng@...il.com>,
Uladzislau Rezki <urezki@...il.com>, Mathieu Desnoyers
<mathieu.desnoyers@...icios.com>, Mel Gorman <mgorman@...e.de>, Andrew
Morton <akpm@...ux-foundation.org>, Masahiro Yamada
<masahiroy@...nel.org>, Han Shen <shenhan@...gle.com>, Rik van Riel
<riel@...riel.com>, Jann Horn <jannh@...gle.com>, Dan Carpenter
<dan.carpenter@...aro.org>, Oleg Nesterov <oleg@...hat.com>, Juri Lelli
<juri.lelli@...hat.com>, Clark Williams <williams@...hat.com>, Yair
Podemsky <ypodemsk@...hat.com>, Marcelo Tosatti <mtosatti@...hat.com>,
Daniel Wagner <dwagner@...e.de>, Petr Tesarik <ptesarik@...e.com>
Subject: Re: [PATCH v6 23/29] context-tracking: Introduce work deferral
infrastructure

On 28/10/25 15:00, Frederic Weisbecker wrote:
> On Fri, Oct 10, 2025 at 05:38:33PM +0200, Valentin Schneider wrote:
>> +	old = atomic_read(&ct->state);
>> +
>> +	/*
>> +	 * The work bit must only be set if the target CPU is not executing
>> +	 * in kernelspace.
>> +	 * CT_RCU_WATCHING is used as a proxy for that - if the bit is set, we
>> +	 * know for sure the CPU is executing in the kernel whether that be in
>> +	 * NMI, IRQ or process context.
>> +	 * Set CT_RCU_WATCHING here and let the cmpxchg do the check for us;
>> +	 * the state could change between the atomic_read() and the cmpxchg().
>> +	 */
>> +	old |= CT_RCU_WATCHING;
>
> Most of the time, the task should be either idle or in userspace. I'm still not
> sure why you start with a bet that the CPU is in the kernel with RCU watching.
>

Right, I think I got that the wrong way around when I switched to using
CT_RCU_WATCHING instead of CT_STATE_KERNEL. That wants to be

  old &= ~CT_RCU_WATCHING;

i.e. bet that the CPU is NOHZ-idle; if it's not, the cmpxchg fails and we
don't store the work bit.

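To make the corrected bet concrete, here is a rough userspace model using C11
atomics rather than the kernel's atomic_t API. The bit positions and the name
model_set_cpu_work() are made up for illustration and are not taken from the
patch; only the shape of the loop mirrors the quoted hunk.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit layout, for illustration only (not the kernel's values). */
#define CT_RCU_WATCHING	(1 << 8)
#define CT_WORK_START	16

/*
 * Userspace model of the corrected logic: the expected value has
 * CT_RCU_WATCHING cleared, so the compare-exchange can only succeed
 * while the target is not being watched by RCU.
 */
static bool model_set_cpu_work(atomic_int *state, int work)
{
	int old = atomic_load(state);
	bool ret;

	/* Bet that the target CPU is not in the kernel. */
	old &= ~CT_RCU_WATCHING;

	/*
	 * On failure, 'old' is refreshed with the current value. Retry only
	 * while the refreshed value still has CT_RCU_WATCHING clear, i.e.
	 * the failure was caused by some other bit changing.
	 */
	do {
		ret = atomic_compare_exchange_strong(state, &old,
						     old | (work << CT_WORK_START));
	} while (!ret && !(old & CT_RCU_WATCHING));

	return ret;
}
```

With the bit set in the target's state, the first compare-exchange fails,
'old' picks up CT_RCU_WATCHING, and the loop exits without storing anything.
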
>> +	/*
>> +	 * Try setting the work until either
>> +	 * - the target CPU has entered kernelspace
>> +	 * - the work has been set
>> +	 */
>> +	do {
>> +		ret = atomic_try_cmpxchg(&ct->state, &old, old | (work << CT_WORK_START));
>> +	} while (!ret && !(old & CT_RCU_WATCHING));
>
> So this applies blindly to idle as well, right? It should work but note that
> idle entry code before RCU watches is also fragile.
>

Yeah, I remember losing some hair trying to grok the idle entry situation.
We could keep this purely NOHZ_FULL and have the deferral condition be:

  (ct->state & CT_STATE_USER) && !(ct->state & CT_RCU_WATCHING)

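As a quick sanity check of that condition, a standalone predicate could look
like the sketch below. It treats CT_STATE_USER as a single flag bit, the way
the expression above is written; the constant values and the name
model_can_defer() are assumptions made for the example, not the kernel's
actual encoding.

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define CT_STATE_USER	(1 << 1)
#define CT_RCU_WATCHING	(1 << 8)

/*
 * Defer only when the target is accounted as running in userspace and
 * RCU is not watching it. Idle (no CT_STATE_USER) is excluded, which
 * sidesteps the fragile idle entry path.
 */
static bool model_can_defer(int state)
{
	return (state & CT_STATE_USER) && !(state & CT_RCU_WATCHING);
}
```

Under this model an idle CPU never qualifies, even with RCU not watching,
so anything targeting it would not be deferred.
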