[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <MWHPR11MB1293DE4CD2757714FC469DECF1499@MWHPR11MB1293.namprd11.prod.outlook.com>
Date: Thu, 15 Sep 2022 22:09:42 +0000
From: "Nambiar, Amritha" <amritha.nambiar@...el.com>
To: Alexander Duyck <alexander.duyck@...il.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"kuba@...nel.org" <kuba@...nel.org>,
"jhs@...atatu.com" <jhs@...atatu.com>,
"jiri@...nulli.us" <jiri@...nulli.us>,
"xiyou.wangcong@...il.com" <xiyou.wangcong@...il.com>,
"Gomes, Vinicius" <vinicius.gomes@...el.com>,
"Samudrala, Sridhar" <sridhar.samudrala@...el.com>
Subject: RE: [net-next PATCH v2 0/4] Extend action skbedit to RX queue mapping
> -----Original Message-----
> From: Alexander Duyck <alexander.duyck@...il.com>
> Sent: Thursday, September 15, 2022 8:21 AM
> To: Nambiar, Amritha <amritha.nambiar@...el.com>
> Cc: netdev@...r.kernel.org; kuba@...nel.org; jhs@...atatu.com;
> jiri@...nulli.us; xiyou.wangcong@...il.com; Gomes, Vinicius
> <vinicius.gomes@...el.com>; Samudrala, Sridhar
> <sridhar.samudrala@...el.com>
> Subject: Re: [net-next PATCH v2 0/4] Extend action skbedit to RX queue
> mapping
>
> On Thu, Sep 15, 2022 at 1:45 AM Nambiar, Amritha
> <amritha.nambiar@...el.com> wrote:
> >
> > > -----Original Message-----
> > > From: Alexander Duyck <alexander.duyck@...il.com>
> > > Sent: Friday, September 9, 2022 11:35 AM
> > > To: Nambiar, Amritha <amritha.nambiar@...el.com>
> > > Cc: netdev@...r.kernel.org; kuba@...nel.org; jhs@...atatu.com;
> > > jiri@...nulli.us; xiyou.wangcong@...il.com; Gomes, Vinicius
> > > <vinicius.gomes@...el.com>; Samudrala, Sridhar
> > > <sridhar.samudrala@...el.com>
> > > Subject: Re: [net-next PATCH v2 0/4] Extend action skbedit to RX queue
> > > mapping
> > >
> > > On Fri, Sep 9, 2022 at 2:18 AM Nambiar, Amritha
> > > <amritha.nambiar@...el.com> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Alexander Duyck <alexander.duyck@...il.com>
> > > > > Sent: Thursday, September 8, 2022 8:28 AM
> > > > > To: Nambiar, Amritha <amritha.nambiar@...el.com>
> > > > > Cc: netdev@...r.kernel.org; kuba@...nel.org; jhs@...atatu.com;
> > > > > jiri@...nulli.us; xiyou.wangcong@...il.com; Gomes, Vinicius
> > > > > <vinicius.gomes@...el.com>; Samudrala, Sridhar
> > > > > <sridhar.samudrala@...el.com>
> > > > > Subject: Re: [net-next PATCH v2 0/4] Extend action skbedit to RX
> queue
> > > > > mapping
> > > > >
> > > > > On Wed, Sep 7, 2022 at 6:14 PM Amritha Nambiar
> > > > > <amritha.nambiar@...el.com> wrote:
> > > > > >
> > > > > > Based on the discussion on
> > > > > >
> > > https://lore.kernel.org/netdev/20220429171717.5b0b2a81@kernel.org/,
> > > > > > the following series extends skbedit tc action to RX queue mapping.
> > > > > > Currently, skbedit action in tc allows overriding of transmit queue.
> > > > > > Extending this ability of skedit action supports the selection of
> receive
> > > > > > queue for incoming packets. Offloading this action is added for
> receive
> > > > > > side. Enabled ice driver to offload this type of filter into the
> > > > > > hardware for accepting packets to the device's receive queue.
> > > > > >
> > > > > > v2: Added documentation in Documentation/networking
> > > > > >
> > > > > > ---
> > > > > >
> > > > > > Amritha Nambiar (4):
> > > > > > act_skbedit: Add support for action skbedit RX queue mapping
> > > > > > act_skbedit: Offload skbedit queue mapping for receive queue
> > > > > > ice: Enable RX queue selection using skbedit action
> > > > > > Documentation: networking: TC queue based filtering
> > > > >
> > > > > I don't think skbedit is the right thing to be updating for this. In
> > > > > the case of Tx we were using it because at the time we stored the
> > > > > sockets Tx queue in the skb, so it made sense to edit it there if we
> > > > > wanted to tweak things before it got to the qdisc layer. However it
> > > > > didn't have a direct impact on the hardware and only really affected
> > > > > the software routing in the device, which eventually resulted in which
> > > > > hardware queue and qdisc was selected.
> > > > >
> > > > > The problem with editing the receive queue is that the hardware
> > > > > offloaded case versus the software offloaded can have very different
> > > > > behaviors. I wonder if this wouldn't be better served by being an
> > > >
> > > > Could you please explain how the hardware offload and software cases
> > > > behave differently in the skbedit case. From Jakub's suggestion on
> > > >
> https://lore.kernel.org/netdev/20220503084732.363b89cc@kernel.org/,
> > > > it looked like the skbedit action fits better to align the hardware and
> > > > software description of RX queue offload (considering the skb metadata
> > > > remains same in offload vs no-offload case).
> > >
> > > So specifically my concern is RPS. The problem is RPS takes place
> > > before your TC rule would be applied in the software case, but after
> > > it has been applied in the hardware case. As a result the behavior
> > > will be different for one versus the other. With the redirect action
> > > it will pull the packet out of the Rx pipeline and reinsert it so that
> > > RPS will be applied to the packet and it would be received on the CPUs
> > > expected.
> > >
> >
> > Okay, so I understand that without HW offload, the SW behavior would
> > not align for RPS, i.e., RPS CPU would be from a queue (already selected
> > by HW, RSS etc.), and may not align with the queue selected from
> > the SW TC rule. And I see your point, the solution to this would be
> > reinserting the packet after updating the queue. But, as I look more into
> > this, there are still some more concerns I have.
> >
> > IIUC, we may be looking at a potential TC rule as below:
> > tc filter add dev ethX ingress ... \
> > action mirred ingress redirect dev ethX rxqueue <rx_qid>
> >
> > It looks to me that this configuration could possibly result in loops
> > recursively calling act_mirred. Since the redirection is from ingress
> > to ingress on the same device, when the packet is reinserted into the
> > RX pipeline of the same device, RPS and tc classification happens again,
> > the tc filter with act mirred executes redirecting and reinserting the
> > packet again. act_mirred keeps a CPU counter of recursive calls for the
> > action and drops the packet when the limit is reached.
> > If this is a valid configuration, I cannot find any code that perhaps uses
> > a combination of skb->redirect and skb->from_ingress to check and
> > prevent recursive classification (further execution of TC mirred redirect
> > action).
>
> The recursion problem is easily solved by simply not requeueing again
> if the packet is on the queue it is supposed to be. If you have rules
> that are bouncing the traffic between two queues it wouldn't be any
> different than the potential issue of bouncing traffic between two
> netdevs which is why the recursion counters were added.
>
Okay, makes sense. So, redirecting (ingress to ingress) on the same
device with the queue_mapping extension would make it possible to
update the queue. Without the queue_mapping extension, ingress to
ingress redirect on the same device would simply bounce the packets and
eventually drop them once the recursion limit is reached.
> > Also, since reinserting the packet after updating the queue would fix
> > the RPS inconsistency, can this be done from the skbedit action instead
> > of mirred redirect ? So, if skbedit action is used for Rx queue selection,
> > maybe this sequence helps:
> >
> > RPS on RX q1 -> TC action skbedit RX q2 ->
> > always reinsert if action skbedit is on RX -> RPS on RX q2 ->
> > stop further execution of TC action RX skbedit
>
> That is changing the function of skbedit. In addition the skbedit
> action isn't meant to be offloaded. The skbedit action is only really
> supposed to edit skb medatada, it isn't supposed to take other actions
> on the frame. What we want to avoid is making skbedit into another
> mirred.
Okay, so skbedit changing the skb metadata and doing the additional
action here (reinserting the packet) is not right. But in terms of skbedit
action not meant to be offloaded, I do find that skbedit action allows
offload of certain parameters like mark, ptype and priority.
Powered by blists - more mailing lists