Message-ID: <a4ed5a8cc35f34a3cb11aded76b0f289c658c1a7.camel@redhat.com>
Date:   Tue, 04 Jul 2023 12:29:33 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Wander Lairson Costa <wander@...hat.com>
Cc:     linux-kernel@...r.kernel.org, linux-rt-users@...r.kernel.org,
        juri.lelli@...hat.com
Subject: Re: Splat in kernel RT while processing incoming network packets

On Tue, 2023-07-04 at 12:05 +0200, Sebastian Andrzej Siewior wrote:
> On 2023-07-03 18:15:58 [-0300], Wander Lairson Costa wrote:
> > > Not sure how to proceed. One thing you could do is a hack similar to
> > > net-Avoid-the-IPI-to-free-the.patch, which does it for defer_csd.
> > 
> > At first sight it seems straightforward to implement.
> > 
> > > On the other hand we could drop net-Avoid-the-IPI-to-free-the.patch and
> > > remove the warning because we have now commit
> > >    d15121be74856 ("Revert "softirq: Let ksoftirqd do its job"")
> > 
> > But I am more in favor of a solution that removes code than one that
> > adds more :)
> 
> Raising the softirq from an anonymous (hardirq) context is not ideal for
> the reasons I stated below.
> 
> > > Prior to that, raising a softirq from hardirq would wake ksoftirqd, which in
> > > turn would collect all pending softirqs. As a consequence all following
> > > softirqs (networking, …) would run as SCHED_OTHER and compete with
> > > SCHED_OTHER tasks for resources. Not good, because the networking work is
> > > no longer processed within the networking interrupt thread. This is also not
> > > a DDoS kind of situation where one would want to delay processing.
> > > 
> > > With that change, this isn't the case anymore. Only an "unrelated" IRQ
> > > thread could pick up the networking work, which is less than ideal. That
> > > is because the softirq is set in the global pending state, ksoftirqd is
> > > marked for a wakeup and could be delayed because other tasks are busy.
> > > Then the disk interrupt (for instance) could pick it up as part of its
> > > threaded interrupt.
> > > 
> > > Now that I think about it, we could make the backlog pseudo device a
> > > thread. NAPI threading enables one thread, but here we would need one
> > > thread per CPU. So it would remain kind of special. But we would avoid
> > > clobbering the global state and delaying everything to ksoftirqd.
> > > Processing it in ksoftirqd might not be ideal from a performance point
> > > of view.
> > 
> > Before sending this to the ML, I talked to Paolo about using NAPI
> > threading. He explained that it is implemented per interface. For example,
> > for this specific case, it happened on the loopback interface, which
> > doesn't implement NAPI. I am cc'ing him, so he can correct me if I am
> > saying something wrong.
> 
> It is per NAPI-queue/instance and you could have multiple instances per
> interface. However loopback has one and you need per-CPU threads if you
> want to RPS your skbs to any CPU.

Just to hopefully clarify the networking side of it: napi instances !=
network backlog (used by RPS). The network backlog (RPS) is available
for all network devices, including the loopback and all the virtual
ones.

The napi instances (and the threaded mode) are available only on
network device drivers implementing the napi model. The loopback driver
does not implement the napi model, as is the case for most virtual
devices and even some H/W NICs (mostly low-end ones).

The network backlog can't run in threaded mode: there is no API/sysctl
nor infrastructure for that. A threaded mode for backlog processing
could be implemented, even if it would not be completely trivial, and
it sounds a bit weird to me.


Just for the record, I mentioned the following in the bz:

It looks like flush_smp_call_function_queue() has only 2 callers:
migration and do_idle().

What about moving softirq processing from
flush_smp_call_function_queue() into cpu_stopper_thread(), outside the
unpreemptable critical section?

I *think* (wild guess) the call from do_idle() could just be removed (at
least for RT builds), as, according to:

commit b2a02fc43a1f40ef4eb2fb2b06357382608d4d84
Author: Peter Zijlstra <peterz@...radead.org>
Date:   Tue May 26 18:11:01 2020 +0200

	smp: Optimize send_call_function_single_ipi()

is just an optimization.

Cheers,

Paolo
