Message-ID: <20211209163828.223815bd@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Thu, 9 Dec 2021 16:38:28 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Justin Iurman <justin.iurman@...ege.be>
Cc: netdev@...r.kernel.org, davem@...emloft.net, dsahern@...nel.org,
yoshfuji@...ux-ipv6.org, linux-mm@...ck.org, cl@...ux.com,
penberg@...nel.org, rientjes@...gle.com,
iamjoonsoo kim <iamjoonsoo.kim@....com>,
akpm@...ux-foundation.org, vbabka@...e.cz,
Roopa Prabhu <roopa@...dia.com>,
Nikolay Aleksandrov <nikolay@...dia.com>,
Andrew Lunn <andrew@...n.ch>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Vladimir Oltean <olteanv@...il.com>,
Florian Fainelli <f.fainelli@...il.com>,
Florian Westphal <fw@...len.de>,
Paolo Abeni <pabeni@...hat.com>
Subject: Re: [RFC net-next 2/2] ipv6: ioam: Support for Buffer occupancy data field

On Thu, 9 Dec 2021 15:10:24 +0100 (CET) Justin Iurman wrote:
> > because Linux routers can run a full telemetry stack and all sort
> > of advanced SW instrumentation. The use case for reporting kernel
> > memory use via IOAM's constrained interface does not seem particularly
> > practical since it's not providing a very strong signal on what's
> > going on.
>
> I agree and disagree. I disagree because this value definitely tells you
> that something (potentially bad) is going on, when it increases
> significantly enough to reach a critical threshold. Basically, we need
> more skb's, but oh, the pool is exhausted. OK, not a problem, expand the
> pool. Oh wait, no memory left. Why? Is it only due to too much
> (temporary?) load? Should I put the blame on the NIC? Is it a memory
> issue? Is it something else? Or maybe several issues combined? Well, you
> might not know exactly why (though you know there is a problem), which is
> also why I agree with you. But, this is also why you have other data
> fields available (i.e., detecting a problem might require 2+ symptoms
> instead of just one).
>
> > For switches running Linux the switch ASIC buffer occupancy can be read
> > via devlink-sb that'd seem like a better fit for me, but unfortunately
> > the devlink calls can sleep so we can't read such device info from the
> > datapath.
>
> Indeed, would be a better fit. I didn't know about this one, thanks for
> that. It's a shame it can't be used in this context, though. But, at the
> end of the day, we're left with nothing regarding buffer occupancy. So
> I'm wondering if "something" is not better than "nothing" in this case.
> And, for that, we're back to my previous answer on why I agree and
> disagree with what you said about its utility.

I think we're on the same page. The main problem is that I've not seen
anyone use the skbuff_head_cache occupancy as a signal in practice.
I'm adding a bunch of people to the CC list; hopefully someone has
an opinion one way or the other.
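
For anyone who wants to eyeball that occupancy from userspace, here is a
minimal sketch. It assumes /proc/slabinfo is readable (usually root only)
and uses the slabinfo 2.1 column order (name, active_objs, num_objs, ...).
The helper name slab_occupancy is made up for illustration, and this is
not necessarily how the RFC patch derives its value.

```python
from typing import Optional


def slab_occupancy(slabinfo_text: str,
                   cache: str = "skbuff_head_cache") -> Optional[float]:
    """Return active_objs / num_objs for `cache`.

    Returns None if the cache is not listed or has no objects allocated.
    Assumes slabinfo 2.1 rows: name active_objs num_objs objsize ...
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Header lines start with "slabinfo" or "#", so they never match.
        if len(fields) >= 3 and fields[0] == cache:
            active, total = int(fields[1]), int(fields[2])
            return active / total if total else None
    return None


# Typical use (needs permission to read the file):
#   with open("/proc/slabinfo") as f:
#       print(slab_occupancy(f.read()))
```

This only gives a coarse, polled ratio, and it obviously cannot be called
from the datapath.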

Lore link to the full thread, FWIW:
https://lore.kernel.org/all/20211206211758.19057-1-justin.iurman@uliege.be/