Message-ID: <77945d26-2961-df8f-d0b5-8ecafafefd7e@redhat.com>
Date: Wed, 8 Sep 2021 18:47:09 +0200
From: Jesper Dangaard Brouer <jbrouer@...hat.com>
To: Jakub Kicinski <kuba@...nel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>
Cc: brouer@...hat.com, moyufeng <moyufeng@...wei.com>,
Yunsheng Lin <linyunsheng@...wei.com>, davem@...emloft.net,
alexander.duyck@...il.com, linux@...linux.org.uk, mw@...ihalf.com,
linuxarm@...neuler.org, yisen.zhuang@...wei.com,
salil.mehta@...wei.com, thomas.petazzoni@...tlin.com,
hawk@...nel.org, ast@...nel.org, daniel@...earbox.net,
john.fastabend@...il.com, akpm@...ux-foundation.org,
peterz@...radead.org, will@...nel.org, willy@...radead.org,
vbabka@...e.cz, fenghua.yu@...el.com, guro@...com,
peterx@...hat.com, feng.tang@...el.com, jgg@...pe.ca,
mcroce@...rosoft.com, hughd@...gle.com, jonathan.lemon@...il.com,
alobakin@...me, willemb@...gle.com, wenxu@...oud.cn,
cong.wang@...edance.com, haokexin@...il.com, nogikh@...gle.com,
elver@...gle.com, yhs@...com, kpsingh@...nel.org,
andrii@...nel.org, kafai@...com, songliubraving@...com,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
bpf@...r.kernel.org, chenhao288@...ilicon.com
Subject: Re: [PATCH net-next v2 4/4] net: hns3: support skb's frag page
recycling based on page pool
On 08/09/2021 17.57, Jakub Kicinski wrote:
> On Wed, 8 Sep 2021 18:26:35 +0300 Ilias Apalodimas wrote:
>>> Normally I'd say put the stats in ethtool -S and the rest in debugfs
>>> but I'm not sure about exposing pages_state_hold_cnt and
>>> pages_state_release_cnt directly. Those are short counters, and will
>>> very likely wrap. They are primarily meaningful for calculating
>>> page_pool_inflight(). Given this I think their semantics may be too
>>> confusing for an average ethtool -S user.
>>>
>>> Putting all the information in debugfs seems like a better idea.
>>
>> I can't really disagree on the aforementioned stats being confusing.
>> However at some point we'll want to add more useful page_pool stats (e.g. the
>> percentage of pages/page fragments that are hitting the recycling path).
>> Would it still be 'ok' to have info split across ethtool and debugfs?
>
> Possibly. We'll also see what Alex L comes up with for XDP stats. Maybe
> we can arrive at a netlink API for standard things (broken record).
>
> You said percentage - even though I personally don't like it - there is a
> small precedent of ethtool -S containing non-counter information (IOW
> not monotonically increasing event counters), e.g. some vendors rammed
> PCI link quality in there. So if all else fails ethtool -S should be
> fine.
I agree with Ilias that we ought to add some page_pool stats.
*BUT* ONLY if this doesn't hurt performance!

We have explained before how this is possible, e.g. by keeping consumer
vs. producer counters on separate cache-lines (internally in the
page_pool struct, and likely per-CPU for returning pages). Then the
driver's ethtool functions can request the page_pool to fill out a
driver-provided stats area, such that the collection and aggregation of
counters stay off the fast-path.
I definitely don't want to see pages_state_hold_cnt and
pages_state_release_cnt being exposed directly. These were carefully
designed to not hurt performance. An inflight counter can instead be
derived in the ethtool-driver step above and presented to userspace.
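The derivation is just a wrap-safe distance between the two wrapping u32
counters, which is what page_pool_inflight() already computes
internally. Sketch (helper name made up):

#include <linux/types.h>

/* Wrap-safe distance between the wrapping u32 hold/release counters;
 * mirrors what page_pool_inflight() computes internally.  Only this
 * derived (small, signed) value would be reported to userspace.
 */
static inline s32 pp_inflight(u32 hold_cnt, u32 release_cnt)
{
        return (s32)(hold_cnt - release_cnt);
}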
Notice that while developing page_pool, I've been using tracepoints and
bpftrace scripts to inspect the behavior and internals of page_pool,
see [1], and I've even written a page-leak detector [2].
In principle you could write a bpftrace tool that extracts stats the
same way, but I would only recommend doing this during the development
phase, because these tracepoints do add some overhead. Originally I
wanted to push people to use this approach for stats, but I've realized
that not having these stats easily available is annoying ;-)
-Jesper
[1]
https://github.com/xdp-project/xdp-project/tree/master/areas/mem/bpftrace
[2]
https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/page_pool_track_leaks02.bt