Message-ID: <3a11f1e2-ee5d-676f-2666-0cee8bcbed6b@kernel.org>
Date: Fri, 4 Aug 2023 14:09:49 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Wei Fang <wei.fang@....com>,
Jesper Dangaard Brouer <jbrouer@...hat.com>
Cc: "brouer@...hat.com" <brouer@...hat.com>,
dl-linux-imx <linux-imx@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
Andrew Lunn <andrew@...n.ch>,
"davem@...emloft.net" <davem@...emloft.net>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
Shenwei Wang <shenwei.wang@....com>,
Clark Wang <xiaoning.wang@....com>,
"ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"john.fastabend@...il.com" <john.fastabend@...il.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH V3 net-next] net: fec: add XDP_TX feature support

On 02/08/2023 14.33, Wei Fang wrote:
>>> + struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);
>> XDP_TX can avoid this conversion to xdp_frame.
>> It would require some refactoring of fec_enet_txq_xmit_frame().
>>
> Yes, but I don't intend to change it; using the existing interface is enough.
>
>>> + struct fec_enet_private *fep = netdev_priv(ndev);
>>> + struct fec_enet_priv_tx_q *txq;
>>> + int cpu = smp_processor_id();
>>> + struct netdev_queue *nq;
>>> + int queue, ret;
>>> +
>>> + queue = fec_enet_xdp_get_tx_queue(fep, cpu);
>>> + txq = fep->tx_queue[queue];
Notice how the TXQ gets selected based on the CPU.
Thus, it will be the same for all the frames handled on this CPU.
>>> + nq = netdev_get_tx_queue(fep->netdev, queue);
>>> +
>>> + __netif_tx_lock(nq, cpu);
>>
>> It is sad that XDP_TX takes a lock for each frame.
>>
> Yes, but the XDP path shares the queue with the kernel network stack, so
> we need a lock here, unless there is a dedicated queue for XDP path. Do
> you have a better solution?
>
Yes, the solution would be to keep a stack-local (or per-CPU) queue for
all the XDP_TX frames, and send them at the xdp_do_flush_map() call
site. That way the TXQ lock is taken once per bulk instead of once per
frame. This is basically what happens for xdp_do_redirect() in the
cpumap.c and devmap.c code, which keep a per-CPU bulk queue and send a
bulk of packets into fec_enet_xdp_xmit / ndo_xdp_xmit.
I understand if you don't want to add this complexity to the driver.
And I guess it should be a follow-up patch, to make sure this actually
improves performance.
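
To illustrate the bulking idea, here is a rough userspace sketch, not
driver code. Everything in it is hypothetical: struct sim_txq stands in
for the shared TX queue, lock_count only counts how often the lock would
be taken, and the names xdp_tx_bulk_*, as well as the bulk size of 16,
are assumptions made for illustration. In a real driver the flush would
presumably sit next to the xdp_do_flush_map() call at the end of the
NAPI poll.

```c
#include <assert.h>
#include <stddef.h>

#define XDP_TX_BULK_SIZE 16	/* hypothetical bulk size */

/* Stand-in for the TX queue shared with the network stack. */
struct sim_txq {
	unsigned int lock_count;	/* times the TXQ lock was taken */
	unsigned int sent;		/* frames placed on the TX ring */
};

/* Stack-local (or per-CPU) queue of pending XDP_TX frames. */
struct xdp_tx_bulk {
	void *frames[XDP_TX_BULK_SIZE];
	unsigned int count;
};

/* Send all pending frames under a single lock/unlock pair. */
static void xdp_tx_bulk_flush(struct sim_txq *txq, struct xdp_tx_bulk *bq)
{
	unsigned int i;

	if (!bq->count)
		return;

	txq->lock_count++;		/* models __netif_tx_lock() */
	for (i = 0; i < bq->count; i++)
		txq->sent++;		/* models queuing bq->frames[i] */
	/* the matching unlock would go here */
	bq->count = 0;
}

/* Queue one frame; only flush (and lock) when the bulk is full. */
static void xdp_tx_bulk_enqueue(struct sim_txq *txq, struct xdp_tx_bulk *bq,
				void *frame)
{
	if (bq->count == XDP_TX_BULK_SIZE)
		xdp_tx_bulk_flush(txq, bq);

	assert(bq->count < XDP_TX_BULK_SIZE);
	bq->frames[bq->count++] = frame;
}
```

With this shape, the per-frame XDP_TX path only calls
xdp_tx_bulk_enqueue(), and one final xdp_tx_bulk_flush() runs at the end
of the poll, so the lock cost is paid per bulk rather than per frame.
Whether that is a measurable win on this hardware would still need
benchmarking.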
--Jesper