linux-kernel - Re: Splat in kernel RT while processing incoming network packets

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230704100527.75hMNZ35@linutronix.de>
Date:   Tue, 4 Jul 2023 12:05:27 +0200
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     Wander Lairson Costa <wander@...hat.com>
Cc:     Paolo Abeni <pabeni@...hat.com>, linux-kernel@...r.kernel.org,
        linux-rt-users@...r.kernel.org, juri.lelli@...hat.com
Subject: Re: Splat in kernel RT while processing incoming network packets

On 2023-07-03 18:15:58 [-0300], Wander Lairson Costa wrote:
> > Not sure how to proceed. One thing you could do is a hack similar like
> > net-Avoid-the-IPI-to-free-the.patch which does it for defer_csd.
> 
> At first sight it seems straightforward to implement.
> 
> > On the other hand we could drop net-Avoid-the-IPI-to-free-the.patch and
> > remove the warning because we have now commit
> >    d15121be74856 ("Revert "softirq: Let ksoftirqd do its job"")
> 
> But I am more in favor of a solution that removes code than one that
> adds more :)

Raising the softirq from anonymous (hardirq context) is not ideal for
the reasons I stated below.

> > Prior that, raising softirq from hardirq would wake ksoftirqd which in
> > turn would collect all pending softirqs. As a consequence all following
> > softirqs (networking, …) would run as SCHED_OTHER and compete with
> > SCHED_OTHER tasks for resources. Not good because the networking work is
> > no longer processed within the networking interrupt thread. Also not a
> > DDoS kind of situation where one could want to delay processing.
> > 
> > With that change, this isn't the case anymore. Only an "unrelated" IRQ
> > thread could pick up the networking work which is less then ideal. That
> > is because the global softirq set is added, ksoftirq is marked for a
> > wakeup and could be delayed because other tasks are busy. Then the disk
> > interrupt (for instance) could pick it up as part of its threaded
> > interrupt.
> > 
> > Now that I think about, we could make the backlog pseudo device a
> > thread. NAPI threading enables one thread but here we would need one
> > thread per-CPU. So it would remain kind of special. But we would avoid
> > clobbering the global state and delay everything to ksoftird. Processing
> > it in ksoftirqd might not be ideal from performance point of view.
> 
> Before sending this to the ML, I talked to Paolo about using NAPI
> thread. He explained that it is implemented per interface. For example,
> for this specific case, it happened on the loopback interface, which
> doesn't implement NAPI. I am cc'ing him, so the can correct me if I am
> saying something wrong.

It is per NAPI-queue/instance and you could have multiple instances per
interface. However loopback has one and you need per-CPU threads if you
want to RPS your skbs to any CPU.

We could just remove the warning but then your RPS processes the skbs in
SCHED_OTHER. This might not be what you want. Maybe Paolo has a better
idea.

> > > Cheers,
> > > Wander

Sebastian