netdev - Re: [PATCH net-next v2 0/3] net: introduce rps_default_mask

Message-ID: <6758c48d926845ae323a68fb4649fb982e2321c4.camel@redhat.com>
Date:   Mon, 30 Jan 2023 10:25:34 +0100
From:   Paolo Abeni <pabeni@...hat.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Saeed Mahameed <saeed@...nel.org>, netdev@...r.kernel.org,
        Jonathan Corbet <corbet@....net>,
        "David S. Miller" <davem@...emloft.net>,
        Shuah Khan <shuah@...nel.org>, linux-doc@...r.kernel.org,
        linux-kselftest@...r.kernel.org,
        Marcelo Tosatti <mtosatti@...hat.com>,
        Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH net-next v2 0/3] net: introduce rps_default_mask

Hi all,

On Wed, 2020-11-04 at 12:42 -0700, Jakub Kicinski wrote:
> On Wed, 04 Nov 2020 18:36:08 +0100 Paolo Abeni wrote:
> > On Tue, 2020-11-03 at 08:52 -0800, Jakub Kicinski wrote:
> > > On Tue, 03 Nov 2020 16:22:07 +0100 Paolo Abeni wrote:  
> > > > The relevant use case is a host running containers (with the related
> > > > orchestration tools) in a RT environment. Virtual devices (veths, ovs
> > > > ports, etc.) are created by the orchestration tools at run-time.
> > > > Critical processes are allowed to send packets/generate outgoing
> > > > network traffic - but any interrupt is moved away from the related
> > > > cores, so that usual incoming network traffic processing does not
> > > > happen there.
> > > > 
> > > > Still an xmit operation on a virtual device may be transmitted via ovs
> > > > or veth, with the relevant forwarding operation happening in a softirq
> > > > on the same CPU originating the packet. 
> > > > 
> > > > RPS is configured (even) on such virtual devices to move away the
> > > > forwarding from the relevant CPUs.
> > > > 
> > > > As Saeed noted, such configuration could be possibly performed via some
> > > > user-space daemon monitoring network devices and network namespaces
> > > > creation. That will be anyway prone to some race: the orchestration tool
> > > > may create and enable the netns and virtual devices before the daemon
> > > > has properly set the RPS mask.
> > > > 
> > > > In the latter scenario some packet forwarding could still slip in the
> > > > relevant CPU, causing measurable latency. In all non RT scenarios the
> > > > above will be likely irrelevant, but in the RT context that is not
> > > > acceptable - e.g. it causes in real environments latency above the
> > > > defined limits, while the proposed patches avoid the issue.
> > > > 
> > > > Do you see any other simple way to avoid the above race?
> > > > 
> > > > Please let me know if the above answers your doubts,  
> > > 
> > > Thanks, that makes it clearer now.
> > > 
> > > Depending on how RT-aware your container management is it may or may not
> > > be the right place to configure this, as it creates the veth interface.
> > > Presumably it's the container management which does the placement of
> > > the tasks to cores, why is it not setting other attributes, like RPS?  
> > 
> > The container orchestration is quite complex, and I'm unsure isolation
> > and networking configuration are performed (or can be performed) by the
> > same process (without a heavy refactor).
> > 
> > On the flip side, the global rps mask knob looked quite
> > straightforward to me.
> 
> I understand, but I can't shake the feeling this is a hack.
> 
> Whatever sets the CPU isolation should take care of the RPS settings.

Let me try for a moment to revive this old thread.

The series proposed a new sysctl knob to implement a global/default rps
mask applying to all the network devices, as a way to simplify some RT
setups. It was rejected because the required task is doable in
user-space.

Currently the orchestration infrastructure does that, setting the per
device, per queue rps mask and CPU isolation.
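
To make the per device, per queue step concrete, here is a minimal
sketch of what such a user-space tool has to do for a single device. It
assumes the usual sysfs layout (/sys/class/net/<dev>/queues/rx-<n>/rps_cpus)
and the hex bitmap format in comma-separated 32-bit groups; the helper
names cpus_to_mask() and set_device_rps() are made up for illustration
and are not taken from any orchestration tool.

```python
from pathlib import Path


def cpus_to_mask(cpus):
    """Format CPU ids as a hex bitmap in comma-separated 32-bit groups."""
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{bits & 0xFFFFFFFF:08x}")
        bits >>= 32
        if bits == 0:
            break
    # Most significant group goes first.
    return ",".join(reversed(groups))


def set_device_rps(dev, cpus, sysfs_net=Path("/sys/class/net")):
    """Write the same RPS mask to every rx queue of one device.

    sysfs_net is a parameter only so the sketch can be exercised on a
    scratch directory; returns the number of queues written.
    """
    mask = cpus_to_mask(cpus)
    queues = sorted((Path(sysfs_net) / dev / "queues").glob("rx-*"))
    for queue in queues:
        (queue / "rps_cpus").write_text(mask + "\n")
    return len(queues)
```

Something like set_device_rps("veth0", [0, 1]) would then steer RPS for
all rx queues of veth0 to CPUs 0 and 1. It needs root, and it has to run
after the device exists and before traffic hits it, which is where the
race discussed earlier in the thread comes from.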

The above leads to a side problem: when there are a lot of netns/devices
with several queues, even a reasonably optimized user-space solution
takes a significant amount of time to traverse the relevant sysfs dirs
and do I/O on them. Overall the additional time required is clearly
measurable, easily in the range of seconds.
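
A rough way to see where that time goes is to count and time the writes.
The sketch below walks every device under one sysfs net directory and
writes the mask to each rx queue; apply_rps_everywhere() is a
hypothetical helper, and it only covers the devices visible in the
current netns, so a real tool would have to repeat the walk for each
namespace.

```python
import time
from pathlib import Path


def apply_rps_everywhere(mask, sysfs_net=Path("/sys/class/net")):
    """Write mask to every rx queue of every device found under sysfs_net.

    Returns (number_of_writes, elapsed_seconds). One open/write/close is
    issued per queue, so the cost grows with devices * queues.
    """
    start = time.perf_counter()
    writes = 0
    for dev in sorted(Path(sysfs_net).iterdir()):
        queues_dir = dev / "queues"
        if not queues_dir.is_dir():
            # Skip entries that are not devices with queues.
            continue
        for queue in sorted(queues_dir.glob("rx-*")):
            (queue / "rps_cpus").write_text(mask + "\n")
            writes += 1
    return writes, time.perf_counter() - start
```

The returned write count is devices times queues for a single netns; the
elapsed time is whatever the walk plus the sysfs I/O costs on the system
at hand, and no particular figure is implied here.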

The rps_default_mask would basically kill that overhead.
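
For comparison, with a default mask the user-space side shrinks to one
write done before any device is created. The sketch below assumes the
knob is exposed as /proc/sys/net/core/rps_default_mask; that path is an
assumption based on the name in the subject line, and
set_default_rps_mask() is a hypothetical helper.

```python
from pathlib import Path

# Assumed location of the proposed knob; not confirmed by this thread.
DEFAULT_KNOB = Path("/proc/sys/net/core/rps_default_mask")


def set_default_rps_mask(mask, knob=DEFAULT_KNOB):
    """Write the default RPS mask once.

    Under the proposal, devices created afterwards would start with this
    mask on their rx queues, so no per-queue sysfs walk is needed.
    """
    Path(knob).write_text(mask + "\n")
```
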

Is the above a suitable use case?

Thanks,

Paolo