Message-ID: <20260107105137.4cf9a67e@mordecai>
Date: Wed, 7 Jan 2026 10:51:37 +0100
From: Petr Tesarik <ptesarik@...e.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Masami Hiramatsu <mhiramat@...nel.org>, Mathieu Desnoyers
 <mathieu.desnoyers@...icios.com>, Sebastian Andrzej Siewior
 <bigeasy@...utronix.de>, Clark Williams <clrkwllms@...nel.org>,
 linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 linux-rt-devel@...ts.linux.dev
Subject: Re: [PATCH] ring-buffer: Use a housekeeping CPU to wake up waiters

On Wed, 7 Jan 2026 08:50:09 +0100
Petr Tesarik <ptesarik@...e.com> wrote:

> On Tue, 6 Jan 2026 17:04:05 -0500
> Steven Rostedt <rostedt@...dmis.org> wrote:
> 
> > On Tue,  6 Jan 2026 10:10:39 +0100
> > Petr Tesarik <ptesarik@...e.com> wrote:
> >   
> > > Avoid running the wakeup irq_work on an isolated CPU. Since the wakeup can
> > > run on any CPU, let's pick a housekeeping CPU to do the job.
> > > 
> > > This change reduces additional noise when tracing isolated CPUs. For
> > > example, the following ipi_send_cpu stack trace was captured with
> > > nohz_full=2 on the isolated CPU:
> > > 
> > >           <idle>-0       [002] d.h4.  1255.379293: ipi_send_cpu: cpu=2 callsite=irq_work_queue+0x2d/0x50 callback=rb_wake_up_waiters+0x0/0x80
> > >           <idle>-0       [002] d.h4.  1255.379329: <stack trace>    
> > >  => trace_event_raw_event_ipi_send_cpu
> > >  => __irq_work_queue_local
> > >  => irq_work_queue
> > >  => ring_buffer_unlock_commit
> > >  => trace_buffer_unlock_commit_regs
> > >  => trace_event_buffer_commit
> > >  => trace_event_raw_event_x86_irq_vector
> > >  => __sysvec_apic_timer_interrupt
> > >  => sysvec_apic_timer_interrupt
> > >  => asm_sysvec_apic_timer_interrupt
> > >  => pv_native_safe_halt
> > >  => default_idle
> > >  => default_idle_call
> > >  => do_idle
> > >  => cpu_startup_entry
> > >  => start_secondary
> > >  => common_startup_64      
> > 
> > I take it that even with this patch you would still get the above events.
> > The only difference would be that the "cpu=" in the event info no longer
> > matches the CPU it executed on, right?
> 
> Yes, this is a trace of a similar event after applying the patch:
> 
>           <idle>-0       [002] d.h4.   313.334367: ipi_send_cpu: cpu=1 callsite=irq_work_queue_on+0x55/0x90 callback=generic_smp_call_function_single_interrupt+0x0/0x20
>           <idle>-0       [002] d.h4.   313.334390: <stack trace>
>  => trace_event_raw_event_ipi_send_cpu
>  => __smp_call_single_queue
>  => irq_work_queue_on
>  => ring_buffer_unlock_commit
>  => trace_buffer_unlock_commit_regs
>  => trace_event_buffer_commit
>  => trace_event_raw_event_x86_irq_vector
>  => __sysvec_apic_timer_interrupt
>  => sysvec_apic_timer_interrupt
>  => asm_sysvec_apic_timer_interrupt
>  => pv_native_safe_halt
>  => default_idle
>  => default_idle_call
>  => do_idle
>  => cpu_startup_entry
>  => start_secondary
>  => common_startup_64  
> 
> The callback function in the trace event is different. That's because
> send_call_function_single_ipi() always reports this callback. Maybe it
> can be improved, and I can look into it, but that's clearly a separate
> issue.
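
For reference, the callback is hard-coded at the tracepoint itself.
Paraphrasing kernel/smp.c from my tree (details may differ between
kernel versions):

	static void send_call_function_single_ipi(int cpu)
	{
		if (call_function_single_prep_ipi(cpu)) {
			/* The reported callback is fixed here; it is not
			 * the irq_work function that was actually queued. */
			trace_ipi_send_cpu(cpu, _RET_IP_,
					   generic_smp_call_function_single_interrupt);
			arch_send_call_function_single_ipi(cpu);
		}
	}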

Erm. It's actually good I had a look. :-(

A helpful comment in irq_work_queue_on() explains that "arch remote IPI
send/receive backend aren't NMI safe". That's something I wasn't aware
of, and I'm afraid it's the end of the story. The comment is followed by a
WARN_ON_ONCE(in_nmi()), and I can easily trigger it with "perf top"
while nmi:nmi_handler is traced.
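
For completeness, the relevant part of irq_work_queue_on() in
kernel/irq_work.c looks roughly like this in my tree (again, details
may vary between versions):

	preempt_disable();
	if (cpu != smp_processor_id()) {
		/* Arch remote IPI send/receive backend aren't NMI safe */
		WARN_ON_ONCE(in_nmi());
		__smp_call_single_queue(cpu, &work->node.llist);
	} else {
		__irq_work_queue_local(work);
	}
	preempt_enable();

So any path that commits to the ring buffer from NMI context (such as
the nmi:nmi_handler tracepoint) can end up queueing the wakeup irq_work
on a remote housekeeping CPU and trip that warning.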

Please remove the patch again. I'm sorry.

Petr T
