Message-ID: <0792956c-c2d3-0102-5d41-8fccc5091b08@engleder-embedded.com>
Date: Fri, 21 Apr 2023 20:54:06 +0200
From: Gerhard Engleder <gerhard@...leder-embedded.com>
To: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org, davem@...emloft.net,
kuba@...nel.org, edumazet@...gle.com, pabeni@...hat.com,
bjorn@...nel.org, magnus.karlsson@...el.com,
jonathan.lemon@...il.com
Subject: Re: [PATCH net-next v3 5/6] tsnep: Add XDP socket zero-copy RX
support
On 20.04.23 21:46, Maciej Fijalkowski wrote:
> On Tue, Apr 18, 2023 at 09:04:58PM +0200, Gerhard Engleder wrote:
>> Add support for XSK zero-copy to RX path. The setup of the XSK pool can
>> be done at runtime. If the netdev is running, then the queue must be
>> disabled and enabled during reconfiguration. This can be done easily
>> with functions introduced in previous commits.
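A minimal sketch of that runtime flow (the helper and field names, e.g.
tsnep_queue_disable()/tsnep_queue_enable(), are assumptions for
illustration, not necessarily the exact names from those commits):

	/* Sketch only: attach an XSK pool to a running queue. Everything
	 * that can fail (the DMA mapping) happens before the queue is
	 * disabled, so an error cannot leave the netdev broken.
	 */
	static int tsnep_xsk_enable_sketch(struct tsnep_adapter *adapter,
					   struct xsk_buff_pool *pool,
					   u16 queue_id)
	{
		bool running = netif_running(adapter->netdev);
		int retval;

		retval = xsk_pool_dma_map(pool, adapter->dmadev,
					  DMA_ATTR_SKIP_CPU_SYNC);
		if (retval)
			return retval;

		if (running)
			tsnep_queue_disable(adapter, queue_id);

		adapter->queue[queue_id].rx->xsk_pool = pool;

		if (running)
			tsnep_queue_enable(adapter, queue_id);

		return 0;
	}
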
>>
>> A more important property is that, if the netdev is running, then the
>> setup of the XSK pool shall not stop the netdev in case of errors. A
>> broken netdev after a failed XSK pool setup is bad behavior. Therefore,
>> the allocation and setup of resources during XSK pool setup is done only
>> before any queue is disabled. Additionally, freeing and later allocation
>> of resources is eliminated in some cases. Page pool entries are kept for
>> later use. Two memory models are registered in parallel. As a result,
>> the XSK pool setup cannot fail during queue reconfiguration.
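The "two memory models" point maps directly onto the core XDP API: one
xdp_rxq_info is registered per memory model. A rough sketch (the second
xdp_rxq_info for the zero-copy path, xdp_rxq_zc, and the rx fields are
assumptions about the driver layout):

	/* Sketch: register both memory models up front, so attaching an
	 * XSK pool later has no registration step left that could fail.
	 */
	retval = xdp_rxq_info_reg(&rx->xdp_rxq, adapter->netdev,
				  rx->queue_index, napi_id);
	if (retval)
		return retval;
	retval = xdp_rxq_info_reg_mem_model(&rx->xdp_rxq,
					    MEM_TYPE_PAGE_POOL,
					    rx->page_pool);
	if (retval)
		return retval;

	retval = xdp_rxq_info_reg(&rx->xdp_rxq_zc, adapter->netdev,
				  rx->queue_index, napi_id);
	if (retval)
		return retval;
	/* MEM_TYPE_XSK_BUFF_POOL needs no allocator argument */
	retval = xdp_rxq_info_reg_mem_model(&rx->xdp_rxq_zc,
					    MEM_TYPE_XSK_BUFF_POOL, NULL);
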
>>
>> In contrast to other drivers, XSK pool setup and XDP BPF program setup
>> are separate actions. XSK pool setup can be done without any XDP BPF
>> program. The XDP BPF program can be added, removed or changed without
>> any reconfiguration of the XSK pool.
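Concretely, the separation falls out of dispatching the two .ndo_bpf
commands independently; a sketch along these lines (the two setup helper
names are assumptions):

	static int tsnep_netdev_bpf(struct net_device *dev,
				    struct netdev_bpf *bpf)
	{
		struct tsnep_adapter *adapter = netdev_priv(dev);

		switch (bpf->command) {
		case XDP_SETUP_PROG:
			/* BPF program change, XSK pool untouched */
			return tsnep_xdp_setup_prog(adapter, bpf->prog,
						    bpf->extack);
		case XDP_SETUP_XSK_POOL:
			/* pool attach/detach, works with or without prog */
			return tsnep_xdp_setup_pool(adapter, bpf->xsk.pool,
						    bpf->xsk.queue_id);
		default:
			return -EOPNOTSUPP;
		}
	}
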
>>
>> Test results with A53 1.2GHz:
>>
>> xdpsock rxdrop copy mode:
>> pps pkts 1.00
>> rx 856,054 10,625,775
>> Two CPUs with both 100% utilization.
>>
>> xdpsock rxdrop zero-copy mode:
>> pps pkts 1.00
>> rx 889,388 4,615,284
>> Two CPUs with 100% and 20% utilization.
>>
>> xdpsock l2fwd copy mode:
>> pps pkts 1.00
>> rx 248,985 7,315,885
>> tx 248,921 7,315,885
>> Two CPUs with 100% and 10% utilization.
>>
>> xdpsock l2fwd zero-copy mode:
>> pps pkts 1.00
>> rx 254,735 3,039,456
>> tx 254,735 3,039,456
>> Two CPUs with 100% and 4% utilization.
>
> Thanks for sharing the numbers. This is for 64-byte frames?
Yes. I will add that information.
>>
>> Packet rate increases and CPU utilization is reduced in both cases.
>> 100% CPU load seems to be the base load. This load is consumed by ksoftirqd
>> just for dropping the generated packets without xdpsock running.
>>
>> Using batch API reduced CPU utilization slightly, but measurements are
>> not stable enough to provide meaningful numbers.
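For reference, the batch API replaces one xsk_buff_alloc() call per
descriptor with a single xsk_buff_alloc_batch() call when refilling the
RX ring; a rough sketch (the batch size and the descriptor helper are
illustrative):

	/* count: number of descriptors to refill */
	struct xdp_buff *batch[64];
	u32 i, allocated;

	allocated = xsk_buff_alloc_batch(rx->xsk_pool, batch,
					 min(count, 64u));
	for (i = 0; i < allocated; i++) {
		dma_addr_t dma = xsk_buff_xdp_get_dma(batch[i]);

		/* assumed helper that fills one RX descriptor */
		tsnep_rx_set_xsk_buf(rx, i, batch[i], dma);
	}
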
>>
>> Signed-off-by: Gerhard Engleder <gerhard@...leder-embedded.com>
>> ---
>> drivers/net/ethernet/engleder/tsnep.h | 13 +-
>> drivers/net/ethernet/engleder/tsnep_main.c | 494 ++++++++++++++++++++-
>> drivers/net/ethernet/engleder/tsnep_xdp.c | 66 +++
>> 3 files changed, 558 insertions(+), 15 deletions(-)
>>
>
> (...)
>
>> static const struct net_device_ops tsnep_netdev_ops = {
>> .ndo_open = tsnep_netdev_open,
>> .ndo_stop = tsnep_netdev_close,
>> @@ -1713,6 +2177,7 @@ static const struct net_device_ops tsnep_netdev_ops = {
>> .ndo_setup_tc = tsnep_tc_setup,
>> .ndo_bpf = tsnep_netdev_bpf,
>> .ndo_xdp_xmit = tsnep_netdev_xdp_xmit,
>> + .ndo_xsk_wakeup = tsnep_netdev_xsk_wakeup,
>> };
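The wakeup hook itself is typically just a NAPI kick; a minimal sketch
(the tsnep queue and NAPI fields are assumed):

	static int tsnep_netdev_xsk_wakeup(struct net_device *dev,
					   u32 queue_id, u32 flags)
	{
		struct tsnep_adapter *adapter = netdev_priv(dev);
		struct tsnep_queue *queue;

		if (queue_id >= adapter->num_queues)
			return -EINVAL;

		queue = &adapter->queue[queue_id];

		/* let an already running NAPI poll pick up the work,
		 * otherwise schedule it
		 */
		if (!napi_if_scheduled_mark_missed(&queue->napi))
			napi_schedule(&queue->napi);

		return 0;
	}
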
>>
>> static int tsnep_mac_init(struct tsnep_adapter *adapter)
>> @@ -1973,7 +2438,8 @@ static int tsnep_probe(struct platform_device *pdev)
>>
>> netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
>> NETDEV_XDP_ACT_NDO_XMIT |
>> - NETDEV_XDP_ACT_NDO_XMIT_SG;
>> + NETDEV_XDP_ACT_NDO_XMIT_SG |
>> + NETDEV_XDP_ACT_XSK_ZEROCOPY;
>
> In theory, enabling this feature here before implementing Tx ZC can expose
> you to some broken behavior, so just for the sake of completeness, I would
> move this to the Tx ZC patch.
Will be done.