Date:   Fri, 24 Mar 2023 20:19:51 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Felix Fietkau <nbd@....name>
Cc:     netdev@...r.kernel.org, Jonathan Corbet <corbet@....net>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Paolo Abeni <pabeni@...hat.com>, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next] net/core: add optional threading for backlog
 processing

On Fri, 24 Mar 2023 18:57:03 +0100 Felix Fietkau wrote:
> >> It can basically be used to make RPS a bit more dynamic and 
> >> configurable, because you can assign multiple backlog threads to a set 
> >> of CPUs and selectively steer packets from specific devices / rx queues   
> > 
> > Can you give an example?
> > 
> > With the 4 CPU example, in case 2 queues are very busy - you're trying
> > to make sure that the RPS does not end up landing on the same CPU as
> > the other busy queue?  
> 
> In this part I'm thinking about bigger systems where you want to have a
> group of CPUs dedicated to dealing with network traffic without
> assigning a fixed function (e.g. NAPI processing or RPS target) to each
> one, allowing for more dynamic processing.

I tried threaded NAPI on larger systems and helped others try it,
and so far it's not been beneficial :( Even the load balancing
improvements are not significant enough to justify using it, and
there is a large risk of the scheduler making the wrong decision.

Hence my questioning - I'm trying to understand what you're doing
differently.

> >> to them and allow the scheduler to take care of the rest.  
> > 
> > You trust the scheduler much more than I do, I think :)  
> 
> In my tests it brings down latency (both avg and p99) considerably in
> some cases. I posted some numbers here:
> https://lore.kernel.org/netdev/e317d5bc-cc26-8b1b-ca4b-66b5328683c4@nbd.name/

Could you provide the full configuration for this test?
In non-threaded mode, is RPS enabled to spread over the remaining
3 cores?
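
For reference, a minimal sketch of what "spread over the remaining 3
cores" could look like as an rps_cpus value on the 4 CPU example. It
assumes NAPI processing stays on CPU 0 and RPS targets CPUs 1-3; that
CPU assignment, the device name "eth0", and the helper name
rps_cpu_mask are illustrative assumptions, not details from the test
being discussed. The script only prints the value and does not touch
sysfs.

```python
def rps_cpu_mask(cpus):
    """Build a hex CPU bitmap string for the given CPU ids (hypothetical helper)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    # CPU bitmaps in sysfs are written as comma-separated 32-bit hex
    # groups, most significant group first.
    groups = []
    while True:
        groups.append(f"{mask & 0xffffffff:08x}")
        mask >>= 32
        if mask == 0:
            break
    groups.reverse()
    # Drop leading zeros from the most significant group only.
    groups[0] = groups[0].lstrip("0") or "0"
    return ",".join(groups)


if __name__ == "__main__":
    # Assumption: 4 CPUs, CPU 0 handles NAPI, RPS spreads over CPUs 1-3.
    dev, queue = "eth0", 0  # hypothetical device and rx queue
    path = f"/sys/class/net/{dev}/queues/rx-{queue}/rps_cpus"
    print(f"{path} <- {rps_cpu_mask([1, 2, 3])}")
```

Under those assumptions the mask is "e" (binary 1110), i.e. every CPU
except CPU 0.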
