Date:   Fri, 9 Sep 2022 11:35:07 -0700
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     "Nambiar, Amritha" <amritha.nambiar@...el.com>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "jhs@...atatu.com" <jhs@...atatu.com>,
        "jiri@...nulli.us" <jiri@...nulli.us>,
        "xiyou.wangcong@...il.com" <xiyou.wangcong@...il.com>,
        "Gomes, Vinicius" <vinicius.gomes@...el.com>,
        "Samudrala, Sridhar" <sridhar.samudrala@...el.com>
Subject: Re: [net-next PATCH v2 0/4] Extend action skbedit to RX queue mapping

On Fri, Sep 9, 2022 at 2:18 AM Nambiar, Amritha
<amritha.nambiar@...el.com> wrote:
>
> > -----Original Message-----
> > From: Alexander Duyck <alexander.duyck@...il.com>
> > Sent: Thursday, September 8, 2022 8:28 AM
> > To: Nambiar, Amritha <amritha.nambiar@...el.com>
> > Cc: netdev@...r.kernel.org; kuba@...nel.org; jhs@...atatu.com;
> > jiri@...nulli.us; xiyou.wangcong@...il.com; Gomes, Vinicius
> > <vinicius.gomes@...el.com>; Samudrala, Sridhar
> > <sridhar.samudrala@...el.com>
> > Subject: Re: [net-next PATCH v2 0/4] Extend action skbedit to RX queue
> > mapping
> >
> > On Wed, Sep 7, 2022 at 6:14 PM Amritha Nambiar
> > <amritha.nambiar@...el.com> wrote:
> > >
> > > Based on the discussion on
> > > https://lore.kernel.org/netdev/20220429171717.5b0b2a81@kernel.org/,
> > > the following series extends skbedit tc action to RX queue mapping.
> > > Currently, skbedit action in tc allows overriding of transmit queue.
> > > Extending this ability of skbedit action supports the selection of receive
> > > queue for incoming packets. Offloading this action is added for receive
> > > side. Enabled ice driver to offload this type of filter into the
> > > hardware for accepting packets to the device's receive queue.
> > >
> > > v2: Added documentation in Documentation/networking
> > >
> > > ---
> > >
> > > Amritha Nambiar (4):
> > >       act_skbedit: Add support for action skbedit RX queue mapping
> > >       act_skbedit: Offload skbedit queue mapping for receive queue
> > >       ice: Enable RX queue selection using skbedit action
> > >       Documentation: networking: TC queue based filtering
> >
> > I don't think skbedit is the right thing to be updating for this. In
> > the case of Tx we were using it because at the time we stored the
> > socket's Tx queue in the skb, so it made sense to edit it there if we
> > wanted to tweak things before it got to the qdisc layer. However it
> > didn't have a direct impact on the hardware and only really affected
> > the software routing in the device, which eventually resulted in which
> > hardware queue and qdisc was selected.
> >
> > The problem with editing the receive queue is that the hardware
> > offloaded case and the software case can have very different
> > behaviors. I wonder if this wouldn't be better served by being an
>
> Could you please explain how the hardware offload and software cases
> behave differently in the skbedit case? From Jakub's suggestion on
> https://lore.kernel.org/netdev/20220503084732.363b89cc@kernel.org/,
> it looked like the skbedit action fits better to align the hardware and
> software description of RX queue offload (considering the skb metadata
> remains same in offload vs no-offload case).

Specifically, my concern is RPS. In the software case RPS takes place
before your TC rule is applied, but in the hardware case it takes place
after the rule has been applied. As a result the behavior differs
between the two. The redirect action pulls the packet out of the Rx
pipeline and reinserts it, so that RPS is applied to the reinserted
packet and it is received on the expected CPUs.
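
To make the ordering concrete, here is a toy Python model of the three
paths. It is only a sketch, not kernel code: the per-queue CPU maps,
the function names, and the modulo-based CPU pick are all assumptions
made for illustration (the kernel does not select the CPU this way).

```python
# Toy model of the ordering problem, not kernel code.
# Hypothetical per-queue RPS maps: queue 0 -> CPUs 0-3, queue 1 -> CPUs 4-7.
RPS_MAP = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}


def rps_cpu(queue, flow_hash):
    """Pick a CPU from the RPS map of the queue the packet arrived on."""
    cpus = RPS_MAP[queue]
    return cpus[flow_hash % len(cpus)]  # simplified stand-in for the real pick


def software_skbedit(rss_queue, target_queue, flow_hash):
    """Software path: RPS runs first, the TC rule rewrites the queue later."""
    cpu = rps_cpu(rss_queue, flow_hash)  # decided from the original queue
    queue = target_queue                 # skbedit changes only the metadata
    return queue, cpu


def hardware_offload(rss_queue, target_queue, flow_hash):
    """Offloaded path: the NIC picks the queue, then RPS sees that queue.

    rss_queue is unused here; it is kept so all three paths share a signature.
    """
    queue = target_queue
    cpu = rps_cpu(queue, flow_hash)
    return queue, cpu


def software_redirect(rss_queue, target_queue, flow_hash):
    """Redirect path: the packet is reinserted, so RPS runs on the new queue."""
    rps_cpu(rss_queue, flow_hash)        # first pass, result is discarded
    queue = target_queue
    cpu = rps_cpu(queue, flow_hash)      # second pass after reinsertion
    return queue, cpu
```

In this model, a packet that RSS put on queue 0, with the rule
targeting queue 1, ends up on a queue 0 CPU under software_skbedit()
while hardware_offload() and software_redirect() both land it on a
queue 1 CPU.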

> > extension of the mirred ingress redirect action which is already used
> > for multiple hardware offloads as I recall.
> >
> > In this case you would want to be redirecting packets received on a
> > port to being received on a specific queue on that port. By using the
> > redirect action it would take the packet out of the receive path and
> > reinsert it, being able to account for anything such as the RPS
> > configuration on the device so the behavior would be closer to what
> > the hardware offloaded behavior would be.
>
> Wouldn't this be overkill as we only want to accept packets into a
> predetermined queue? IIUC, the mirred redirect action typically moves
> packets from one interface to another, the filter is added on an interface
> different from the destination interface. In our case, with the
> destination interface being the same, I am not understanding the need
> for a loopback. Also, WRT RPS, not sure I understand the impact
> here. In hardware, once the offloaded filter executes to select the queue,
> RSS does not run. In software, if RPS executes before
> sch_handle_ingress(), wouldn't any tc-actions (mirred redirect or skbedit
> overriding the queue) behave in a similar way?

The problem is that RPS != RSS. You can use the two together to spread
work out over a greater set of queues. For example, in a NUMA system
with multiple sockets/nodes you might use RSS to split the work up
into per-node queues, and then use RPS to split the work across the
CPUs within each node. If you pick a packet up from one device and
redirect it via the mirred action, RPS is applied as though the
packet had been received on the device, so the RPS queue would be
correct, assuming you updated the queue. That is what I am looking for.
This patch instead creates a situation where the effect is very
different between the hardware and software versions of things, which
would likely break a use case such as this.
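
As a rough illustration of the NUMA setup I have in mind, the sketch
below does the two-stage spread in Python. The topology (two nodes,
one RSS queue per node, four CPUs per node) and all of the names are
hypothetical, and the modulo arithmetic is only a stand-in for the
real hashing.

```python
# Hypothetical topology: 2 NUMA nodes, one RSS queue per node, 4 CPUs per node.
NODE_CPUS = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
QUEUE_NODE = {0: 0, 1: 1}  # RSS queue -> NUMA node that services it


def rss_queue(flow_hash):
    """Stage 1 (RSS, in the NIC): spread flows across the per-node queues."""
    return flow_hash % len(QUEUE_NODE)


def rps_cpu(queue, flow_hash):
    """Stage 2 (RPS, in software): spread across CPUs local to the queue's node."""
    cpus = NODE_CPUS[QUEUE_NODE[queue]]
    # Use the hash bits not consumed by stage 1 so both stages spread evenly.
    return cpus[(flow_hash // len(QUEUE_NODE)) % len(cpus)]


def steer(flow_hash):
    """Return (queue, cpu) for a flow; the CPU is always on the queue's node."""
    queue = rss_queue(flow_hash)
    return queue, rps_cpu(queue, flow_hash)
```

The property that matters is that the CPU returned by steer() always
belongs to the node of the queue it returns. Changing the queue after
stage 2 has already run, without rerunning it, breaks that property.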
