Message-ID: <c37223527d5b6bcf0ffce69c81f16fd0781fa2d6.camel@redhat.com>
Date: Tue, 05 Mar 2024 11:08:35 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>, netdev@...r.kernel.org
Cc: "David S. Miller" <davem@...emloft.net>, Eric Dumazet
 <edumazet@...gle.com>,  Jakub Kicinski <kuba@...nel.org>, Jesper Dangaard
 Brouer <hawk@...nel.org>, Thomas Gleixner <tglx@...utronix.de>, Wander
 Lairson Costa <wander@...hat.com>, Yan Zhai <yan@...udflare.com>
Subject: Re: [PATCH v3 net-next 2/4] net: Allow to use SMP threads for
 backlog NAPI.

On Wed, 2024-02-28 at 13:05 +0100, Sebastian Andrzej Siewior wrote:
> Backlog NAPI is a per-CPU NAPI struct only (with no device behind it)
> used by drivers which don't do NAPI themselves, RPS and parts of the
> stack which need to avoid recursive deadlocks while processing a packet.
> 
> Non-NAPI drivers use the CPU-local backlog NAPI. If RPS is enabled
> then a flow for the skb is computed and based on the flow the skb can be
> enqueued on a remote CPU. Scheduling/ raising the softirq (for backlog's
> NAPI) on the remote CPU isn't trivial because the softirq is only
> scheduled on the local CPU and performed after the hardirq is done.
> In order to schedule a softirq on the remote CPU, an IPI is sent to the
> remote CPU which schedules the backlog-NAPI on the then local CPU.
> 
> On PREEMPT_RT interrupts are force-threaded. The soft interrupts are
> raised within the interrupt thread and processed after the interrupt
> handler completed still within the context of the interrupt thread. The
> softirq is handled in the context where it originated.
> 
> With force-threaded interrupts enabled, ksoftirqd is woken up if a
> softirq is raised from hardirq context. This is the case if it is raised
> from an IPI. Additionally there is a warning on PREEMPT_RT if the
> softirq is raised from the idle thread.
> This was done for two reasons:
> - With threaded interrupts the processing should happen in thread
>   context (where it originated) and ksoftirqd is the only thread for
>   this context if raised from hardirq. Using the currently running task
>   instead would "punish" a random task.
> - Once ksoftirqd is active it consumes all further softirqs until it
>   stops running. This changed recently and is no longer the case.
> 
> Instead of keeping the backlog NAPI in ksoftirqd (in force-threaded/
> PREEMPT_RT setups) I am proposing NAPI-threads for backlog.
> The "proper" setup with threaded-NAPI is not doable because the threads
> are not pinned to an individual CPU and can be modified by the user.
> Additionally a dummy network device would have to be assigned. Also
> CPU-hotplug has to be considered if additional CPUs show up.
> All this can be probably done/ solved but the smpboot-threads already
> provide this infrastructure.
> 
> Sending UDP packets over loopback expects that the packet is processed
> within the call. Delaying it by handing it over to the thread hurts
> performance. It is not beneficial to the outcome if the context switch
> happens immediately after enqueue or after a while to process a few
> packets in a batch.
> There is no need to always use the thread if the backlog NAPI is
> requested on the local CPU. This restores the loopback throughput. The
> performance drops mostly to the same value after enabling RPS on the
> loopback comparing the IPI and the thread result.
> 
> Create NAPI-threads for backlog if requested during boot. The thread runs
> the inner loop from napi_threaded_poll(), the wait part is different. It
> checks for NAPI_STATE_SCHED (the backlog NAPI can not be disabled).
> 
> The NAPI threads for backlog are optional and have to be enabled via the
> boot argument "thread_backlog_napi". They are mandatory for PREEMPT_RT to
> avoid the wakeup of ksoftirqd from the IPI.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>

Does not apply cleanly after commit 1200097fa8f0d, please rebase and
repost. Note that we are pretty close to the net-next PR, this is at
risk for this cycle.

Side note: it is not 100% clear to me why an admin would want to enable
the threaded backlog in the non-RT case. I read that the main
difference would be a small perf regression; could you clarify?

Thanks!

Paolo

