Message-ID: <83b8bae8-d524-36a1-302e-59198410d9a9@gmail.com>
Date: Wed, 18 Aug 2021 16:05:52 -0600
From: David Ahern <dsahern@...il.com>
To: Yunsheng Lin <linyunsheng@...wei.com>, davem@...emloft.net,
kuba@...nel.org
Cc: alexander.duyck@...il.com, linux@...linux.org.uk, mw@...ihalf.com,
linuxarm@...neuler.org, yisen.zhuang@...wei.com,
salil.mehta@...wei.com, thomas.petazzoni@...tlin.com,
hawk@...nel.org, ilias.apalodimas@...aro.org, ast@...nel.org,
daniel@...earbox.net, john.fastabend@...il.com,
akpm@...ux-foundation.org, peterz@...radead.org, will@...nel.org,
willy@...radead.org, vbabka@...e.cz, fenghua.yu@...el.com,
guro@...com, peterx@...hat.com, feng.tang@...el.com, jgg@...pe.ca,
mcroce@...rosoft.com, hughd@...gle.com, jonathan.lemon@...il.com,
alobakin@...me, willemb@...gle.com, wenxu@...oud.cn,
cong.wang@...edance.com, haokexin@...il.com, nogikh@...gle.com,
elver@...gle.com, yhs@...com, kpsingh@...nel.org,
andrii@...nel.org, kafai@...com, songliubraving@...com,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
bpf@...r.kernel.org, chenhao288@...ilicon.com, edumazet@...gle.com,
yoshfuji@...ux-ipv6.org, dsahern@...nel.org, memxor@...il.com,
linux@...pel-privat.de, atenart@...nel.org, weiwan@...gle.com,
ap420073@...il.com, arnd@...db.de,
mathew.j.martineau@...ux.intel.com, aahringo@...hat.com,
ceggers@...i.de, yangbo.lu@....com, fw@...len.de,
xiangxia.m.yue@...il.com, linmiaohe@...wei.com
Subject: Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support
On 8/17/21 9:32 PM, Yunsheng Lin wrote:
> This patchset adds the socket to netdev page frag recycling
> support based on the busy polling and page pool infrastructure.
>
> The performance improves from 30Gbit to 41Gbit for a one-thread iperf
> tcp flow, and the CPU usage decreases about 20% for a four-thread
> iperf flow at 100Gb line speed in IOMMU strict mode.
>
> The performance improves about 2.5% for a one-thread iperf tcp flow
> in IOMMU passthrough mode.
>
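For context, the recycling described above keys off NAPI busy polling on
the consuming socket. A minimal sketch of opting a socket in, assuming
SO_PREFER_BUSY_POLL / SO_BUSY_POLL_BUDGET are available (v5.11+) and with
purely illustrative values:

    #include <sys/socket.h>

    static void enable_busy_poll(int fd)
    {
            int usecs = 50;     /* busy-poll the NAPI context for up to 50us */
            int prefer = 1;     /* prefer busy polling over the softirq path */
            int budget = 8;     /* packets processed per poll */

            setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL, &usecs, sizeof(usecs));
            setsockopt(fd, SOL_SOCKET, SO_PREFER_BUSY_POLL, &prefer, sizeof(prefer));
            setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL_BUDGET, &budget, sizeof(budget));
    }
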
Details about the test setup? CPU model, MTU, any other relevant changes
/ settings.
How does that performance improvement compare with using the Tx ZC API?
At 1500 MTU I see a CPU drop on the Tx side from 80% to 20% with the ZC
API and a ~10% increase in throughput. Bumping the MTU to 3300,
performance with the ZC API is 2x the current model with half the CPU.
Epyc 7502, ConnectX-6, IOMMU off.
In short, it seems like improving the Tx ZC API is a better path
forward than per-socket page pools.
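
For reference, the Tx ZC API here is the MSG_ZEROCOPY send path
(Documentation/networking/msg_zerocopy.rst). A minimal sketch of its use,
with error handling and the completion loop abbreviated, and "fd" assumed
to be an already connected TCP socket:

    #include <errno.h>
    #include <sys/socket.h>
    #include <linux/errqueue.h>

    static int send_zerocopy(int fd, const void *buf, size_t len)
    {
            int one = 1;
            char control[128];
            struct msghdr msg = {
                    .msg_control    = control,
                    .msg_controllen = sizeof(control),
            };

            /* opt in once per socket */
            if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)))
                    return -errno;

            /* pages backing buf stay pinned until the completion arrives */
            if (send(fd, buf, len, MSG_ZEROCOPY) < 0)
                    return -errno;

            /* completions show up asynchronously on the error queue; a real
             * sender polls for them and parses SO_EE_ORIGIN_ZEROCOPY
             * notifications before reusing buf
             */
            if (recvmsg(fd, &msg, MSG_ERRQUEUE) < 0)
                    return -errno;

            return 0;
    }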