Message-ID: <f782b460-38fc-4c2b-b886-870760a96ece@kernel.org>
Date: Thu, 22 Feb 2024 11:10:44 +0100
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Toke Høiland-Jørgensen <toke@...hat.com>,
bpf@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH RFC net-next 1/2] net: Reference bpf_redirect_info via
task_struct on PREEMPT_RT.
On 22/02/2024 10.22, Sebastian Andrzej Siewior wrote:
> On 2024-02-20 16:32:08 [+0100], To Jesper Dangaard Brouer wrote:
>>>
>>> Ethtool(i40e2) stat: 15028585 ( 15,028,585) <= tx-0.packets /sec
>>> Ethtool(i40e2) stat: 15028589 ( 15,028,589) <= tx_packets /sec
>>
>> -t1 in ixgbe
>> Show adapter(s) (eth1) statistics (ONLY that changed!)
>> Ethtool(eth1 ) stat: 107857263 ( 107,857,263) <= tx_bytes /sec
>> Ethtool(eth1 ) stat: 115047684 ( 115,047,684) <= tx_bytes_nic /sec
>> Ethtool(eth1 ) stat: 1797621 ( 1,797,621) <= tx_packets /sec
>> Ethtool(eth1 ) stat: 1797636 ( 1,797,636) <= tx_pkts_nic /sec
>> Ethtool(eth1 ) stat: 107857263 ( 107,857,263) <= tx_queue_0_bytes /sec
>> Ethtool(eth1 ) stat: 1797621 ( 1,797,621) <= tx_queue_0_packets /sec
> …
>> while sending with ixgbe while running perf top on the box:
>> | Samples: 621K of event 'cycles', 4000 Hz, Event count (approx.): 49979376685 lost: 0/0 drop: 0/0
>> | Overhead CPU Command Shared Object Symbol
>> | 31.98% 000 kpktgend_0 [kernel] [k] xas_find
>> | 6.72% 000 kpktgend_0 [kernel] [k] pfn_to_dma_pte
>> | 5.63% 000 kpktgend_0 [kernel] [k] ixgbe_xmit_frame_ring
>> | 4.78% 000 kpktgend_0 [kernel] [k] dma_pte_clear_level
>> | 3.16% 000 kpktgend_0 [kernel] [k] __iommu_dma_unmap
>
> I disabled the iommu and I get to
Yes, it is clearly the IOMMU code that causes the performance issue for you.
This driver doesn't use page_pool, so I want to point out (for people
finding this post in the future) that page_pool keeps DMA mappings for
recycled frames, which should address the IOMMU overhead issue seen here.
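For illustration, a minimal sketch of such a setup (setup_pool() and the
parameter values are made up for this example; the page_pool API itself
is the real one on recent kernels):

  /* A page_pool that keeps the DMA mapping alive across recycles
   * (PP_FLAG_DMA_MAP), so the IOMMU map/unmap cost is paid once
   * per page instead of once per packet.
   */
  #include <net/page_pool/types.h>   /* struct page_pool_params */
  #include <net/page_pool/helpers.h> /* alloc + DMA-addr helpers */

  static struct page_pool *setup_pool(struct device *dev)
  {
          struct page_pool_params pp_params = {
                  .flags     = PP_FLAG_DMA_MAP, /* pool owns the mapping */
                  .order     = 0,               /* order-0 (4K) pages */
                  .pool_size = 256,
                  .nid       = NUMA_NO_NODE,
                  .dev       = dev,             /* device doing the DMA */
                  .dma_dir   = DMA_FROM_DEVICE, /* RX direction */
          };

          return page_pool_create(&pp_params);  /* ERR_PTR() on failure */
  }

  /* RX refill then becomes roughly:
   *   page = page_pool_dev_alloc_pages(pool);
   *   dma  = page_pool_get_dma_addr(page);  // already mapped
   */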
>
> Ethtool(eth1 ) stat: 14158562 ( 14,158,562) <= tx_packets /sec
> Ethtool(eth1 ) stat: 14158685 ( 14,158,685) <= tx_pkts_nic /sec
>
> looks like a small improvement… It is not your 15 but close. -t2 does
> improve the situation.
You cannot reach 15 Mpps on 10Gbit/s, as wirespeed for 10G is 14.88 Mpps.
Congratulations, I think this 14.15 Mpps is as close to wirespeed as is
possible on your hardware.
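The 14.88 Mpps number falls out of the Ethernet framing overhead; a
quick back-of-the-envelope calc (plain userspace C, just to show the
arithmetic):

  #include <stdio.h>

  int main(void)
  {
          /* Minimum-size frame on the wire:
           *   64B frame + 7B preamble + 1B SFD + 12B inter-frame gap
           *   = 84 bytes = 672 bits
           */
          double link_bps     = 10e9;          /* 10 Gbit/s */
          double bits_per_pkt = (64 + 20) * 8; /* 672 bits */

          printf("wirespeed: %.0f pps\n", link_bps / bits_per_pkt);
          /* -> wirespeed: 14880952 pps, i.e. 14.88 Mpps */
          return 0;
  }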
BTW what CPU are you using?
> There is a warning from DMA mapping code but ;)
Is it a warning from the IOMMU code?
It usually means there is a real DMA unmap bug (which we should fix).
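If you want to chase it down: a kernel built with CONFIG_DMA_API_DEBUG=y
makes the DMA debug layer track every map/unmap pair and splat with the
call site on mismatches (it is active by default when compiled in;
dma_debug=off on the kernel cmdline disables it). That usually points
straight at the offending driver path.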
--Jesper