lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0a842574fd0acc113ef925c48d2ad9e67aa0e101.camel@redhat.com>
Date: Wed, 23 Aug 2023 15:35:41 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>, 
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Cc: "David S. Miller" <davem@...emloft.net>, Eric Dumazet
 <edumazet@...gle.com>,  Jakub Kicinski <kuba@...nel.org>, Peter Zijlstra
 <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>, Wander
 Lairson Costa <wander@...hat.com>
Subject: Re: [RFC PATCH net-next 1/2] net: Use SMP threads for backlog NAPI.

On Mon, 2023-08-14 at 11:35 +0200, Sebastian Andrzej Siewior wrote:
> @@ -4781,7 +4733,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
>  		 * We can use non atomic operation since we own the queue lock
>  		 */
>  		if (!__test_and_set_bit(NAPI_STATE_SCHED, &sd->backlog.state))
> -			napi_schedule_rps(sd);
> +			__napi_schedule_irqoff(&sd->backlog);
>  		goto enqueue;
>  	}
>  	reason = SKB_DROP_REASON_CPU_BACKLOG;

I *think* that the above could be quite dangerous when cpu ==
smp_processor_id() - that is, with plain veth usage.

Currently, each packet runs into the rx path just after
enqueue_to_backlog()/tx completes.

With this patch there will be a burst effect, where the backlog thread
will run after a few (several) packets will be enqueued, when the
process scheduler will decide - note that the current CPU is already
hosting a running process, the tx thread.

The above can cause packet drops (due to limited buffering) or very
high latency (due to long burst), even in non overload situation, quite
hard to debug.

I think the above needs to be an opt-in, but I guess that even RT
deployments doing some packet forwarding will not be happy with this
on.

Cheers,

Paolo


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ