[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<PAXPR04MB85103ACE868025F45EC172AC887F2@PAXPR04MB8510.eurprd04.prod.outlook.com>
Date: Wed, 9 Oct 2024 12:52:00 +0000
From: Wei Fang <wei.fang@....com>
To: Vladimir Oltean <vladimir.oltean@....com>
CC: "davem@...emloft.net" <davem@...emloft.net>, "edumazet@...gle.com"
<edumazet@...gle.com>, "kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>, Claudiu Manoil
<claudiu.manoil@....com>, "ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>, "hawk@...nel.org"
<hawk@...nel.org>, "john.fastabend@...il.com" <john.fastabend@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "bpf@...r.kernel.org"
<bpf@...r.kernel.org>, "stable@...r.kernel.org" <stable@...r.kernel.org>,
"imx@...ts.linux.dev" <imx@...ts.linux.dev>, "rkannoth@...vell.com"
<rkannoth@...vell.com>, "maciej.fijalkowski@...el.com"
<maciej.fijalkowski@...el.com>, "sbhatta@...vell.com" <sbhatta@...vell.com>
Subject: RE: [PATCH v3 net 2/3] net: enetc: fix the issues of XDP_REDIRECT
feature
> -----Original Message-----
> From: Vladimir Oltean <vladimir.oltean@....com>
> Sent: 2024年10月9日 19:35
> To: Wei Fang <wei.fang@....com>
> Cc: davem@...emloft.net; edumazet@...gle.com; kuba@...nel.org;
> pabeni@...hat.com; Claudiu Manoil <claudiu.manoil@....com>;
> ast@...nel.org; daniel@...earbox.net; hawk@...nel.org;
> john.fastabend@...il.com; linux-kernel@...r.kernel.org;
> netdev@...r.kernel.org; bpf@...r.kernel.org; stable@...r.kernel.org;
> imx@...ts.linux.dev; rkannoth@...vell.com; maciej.fijalkowski@...el.com;
> sbhatta@...vell.com
> Subject: Re: [PATCH v3 net 2/3] net: enetc: fix the issues of XDP_REDIRECT
> feature
>
> Commit title still mentions only XDP_REDIRECT, whereas implementation also
> touches XDP_TX (and only makes a very minor mention of it).
>
> Wouldn't it be better to have "net: enetc: block concurrent XDP transmissions
> during ring reconfiguration" for a commit title?
>
> On Wed, Oct 09, 2024 at 05:03:26PM +0800, Wei Fang wrote:
> > When testing the XDP_REDIRECT function on the LS1028A platform, we
> > found a very reproducible issue that the Tx frames can no longer be
> > sent out even if XDP_REDIRECT is turned off. Specifically, if there is
> > a lot of traffic on Rx direction, when XDP_REDIRECT is turned on, the
> > console may display some warnings like "timeout for tx ring #6 clear",
> > and all redirected frames will be dropped, the detaild log
>
> detailed
>
> > is as follows.
> >
> > root@...028ardb:~# ./xdp-bench redirect eno0 eno2 Redirecting from
> > eno0 (ifindex 3; driver fsl_enetc) to eno2 (ifindex 4; driver
> > fsl_enetc) [203.849809] fsl_enetc 0000:00:00.2 eno2: timeout for tx
> > ring #5 clear [204.006051] fsl_enetc 0000:00:00.2 eno2: timeout for tx
> > ring #6 clear [204.161944] fsl_enetc 0000:00:00.2 eno2: timeout for tx
> > ring #7 clear
> > eno0->eno2 1420505 rx/s 1420590 err,drop/s 0 xmit/s
> > xmit eno0->eno2 0 xmit/s 1420590 drop/s 0 drv_err/s
> 15.71 bulk-avg
> > eno0->eno2 1420484 rx/s 1420485 err,drop/s 0 xmit/s
> > xmit eno0->eno2 0 xmit/s 1420485 drop/s 0 drv_err/s
> 15.71 bulk-avg
> >
> > By analyzing the XDP_REDIRECT implementation of enetc driver, we found
> > two problems. First, enetc driver will reconfigure Tx and Rx BD rings
> > when a bpf program is installed or uninstalled, but there is no
> > mechanisms to block the redirected frames when enetc driver
> > reconfigures BD rings. So introduce ENETC_TX_DOWN flag to
>
> (.. driver reconfigures BD rings.) Similarly, XDP_TX verdicts on received frames
> can also lead to frames being enqueued in the TX rings.
> Because XDP ignores the state set by the netif_tx_wake_queue() API, we also
> have to introduce the ENETC_TX_DOWN flag to suppress transmission of XDP
> frames.
>
> > prevent the redirected frames to be attached to Tx BD rings. This is
> > not only used to block XDP_REDIRECT frames, but also to block XDP_TX
> > frames.
> >
> > Second, Tx BD rings are disabled first in enetc_stop() and then wait
> > for empty. This operation is not safe while the Tx BD ring
>
> the driver waits for them to become empty.
>
> > is actively transmitting frames, and will cause the ring to not be
> > empty and hardware exception. As described in the block guide of
> > LS1028A NETC, software should only disable an active ring after all
> > pending ring entries have been consumed (i.e. when PI = CI).
> > Disabling a transmit ring that is actively processing BDs risks a
> > HW-SW race hazard whereby a hardware resource becomes assigned to work
> > on one or more ring entries only to have those entries be removed due
> > to the ring becoming disabled. So the correct behavior is that the
> > software stops putting frames on the Tx BD rings (this is what
> > ENETC_TX_DOWN does), then waits for the Tx BD rings to be empty, and
> > finally disables the Tx BD rings.
>
> It feels like this separate part (refactoring of enetc_start() and
> enetc_stop() operation ordering) should be its own patch? It is logically
> different than the introduction and checking of the ENETC_TX_DOWN
> condition.
Okay, I will separate this patch into two patches, one is for ENETC_TX_DOWN,
the other is for disabling Tx BDRs after the rings are empty.
Thanks.
Powered by blists - more mailing lists