netdev - Re: [PATCH net-next v2 0/3] net: introduce rps_default


Message-ID: <79c58e6cf23196b73887b20802daebd59fe89476.camel@redhat.com>
Date:   Wed, 04 Nov 2020 18:36:08 +0100
From:   Paolo Abeni <pabeni@...hat.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Saeed Mahameed <saeed@...nel.org>, netdev@...r.kernel.org,
        Jonathan Corbet <corbet@....net>,
        "David S. Miller" <davem@...emloft.net>,
        Shuah Khan <shuah@...nel.org>, linux-doc@...r.kernel.org,
        linux-kselftest@...r.kernel.org,
        Marcelo Tosatti <mtosatti@...hat.com>,
        Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH net-next v2 0/3] net: introduce rps_default_mask

On Tue, 2020-11-03 at 08:52 -0800, Jakub Kicinski wrote:
> On Tue, 03 Nov 2020 16:22:07 +0100 Paolo Abeni wrote:
> > The relevant use case is a host running containers (with the related
> > orchestration tools) in a RT environment. Virtual devices (veths, ovs
> > ports, etc.) are created by the orchestration tools at run-time.
> > Critical processes are allowed to send packets/generate outgoing
> > network traffic - but any interrupt is moved away from the related
> > cores, so that usual incoming network traffic processing does not
> > happen there.
> > 
> > Still an xmit operation on a virtual device may be transmitted via ovs
> > or veth, with the relevant forwarding operation happening in a softirq
> > on the same CPU originating the packet. 
> > 
> > RPS is configured (even) on such virtual devices to move away the
> > forwarding from the relevant CPUs.
> > 
> > As Saeed noted, such configuration could be possibly performed via some
> > user-space daemon monitoring network devices and network namespaces
> > creation. That will be anyway prone to some race: the orchestration tool
> > may create and enable the netns and virtual devices before the daemon
> > has properly set the RPS mask.
> > 
> > In the latter scenario some packet forwarding could still slip in the
> > relevant CPU, causing measurable latency. In all non RT scenarios the
> > above will be likely irrelevant, but in the RT context that is not
> > acceptable - e.g. it causes in real environments latency above the
> > defined limits, while the proposed patches avoid the issue.
> > 
> > Do you see any other simple way to avoid the above race?
> > 
> > Please let me know if the above answers your doubts,
> 
> Thanks, that makes it clearer now.
> 
> Depending on how RT-aware your container management is it may or may not
> be the right place to configure this, as it creates the veth interface.
> Presumably it's the container management which does the placement of
> the tasks to cores, why is it not setting other attributes, like RPS?

The container orchestration is quite complex, and I'm unsure isolation
and networking configuration are performed (or can be performed) by the
same process (without a heavy refactor).

On the flip side, the global rps mask knob looked quite
straightforward to me.
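
For comparison, this is roughly what the per-device alternative has to
do for every newly created device. It is only a sketch in Python, not
code from the patchset: the helper names cpus_to_rps_mask() and
set_rps_mask() are made up for illustration, and it assumes the usual
per-queue sysfs file /sys/class/net/<dev>/queues/rx-<n>/rps_cpus, which
takes a CPU bitmap as comma-separated 32-bit hex words. Whatever runs
this does so after the device already exists, which is the window the
race above is about.

```python
from pathlib import Path


def cpus_to_rps_mask(cpus):
    """Format a list of CPU ids as a CPU bitmap string:
    comma-separated 32-bit hex words, most significant word first."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU ids must be non-negative")
        mask |= 1 << cpu

    # Split the bitmap into 32-bit words, least significant first.
    words = []
    while True:
        words.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break
    words.reverse()

    # The leading word is printed unpadded, the rest zero-padded to 8 digits.
    head = format(words[0], "x")
    tail = [format(w, "08x") for w in words[1:]]
    return ",".join([head] + tail)


def set_rps_mask(dev, cpus, sysfs_root="/sys/class/net"):
    """Write the same RPS mask to every rx queue of one device.
    sysfs_root is a parameter only so the sketch can be tried on a
    scratch directory instead of the real sysfs."""
    mask = cpus_to_rps_mask(cpus)
    queues = Path(sysfs_root) / dev / "queues"
    for rx in sorted(queues.glob("rx-*")):
        (rx / "rps_cpus").write_text(mask)
    return mask
```

A daemon would call something like set_rps_mask("veth0", [2, 3]) for
each device it notices, where "veth0" and the CPU list are placeholders.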

Possibly I can reduce the amount of new code introduced by this
patchset by removing some code duplication between
rps_default_mask_sysctl() and flow_limit_cpu_sysctl(). Would that make
this change more acceptable? Or should I drop this altogether?

> Also I wonder if it would make sense to turn this knob into something
> more generic. When we arrive at the threaded NAPIs - could it make
> sense for the threads to inherit your mask as the CPUs they are allowed
> to run on?

I personally *think* this would be fine - and good. But isn't it a bit
premature to discuss the integration of 2 missing pieces? :)

Thanks,

Paolo