[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211208141825.3091923c@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Wed, 8 Dec 2021 14:18:25 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Justin Iurman <justin.iurman@...ege.be>
Cc: netdev@...r.kernel.org, davem@...emloft.net, dsahern@...nel.org,
yoshfuji@...ux-ipv6.org, linux-mm@...ck.org, cl@...ux.com,
penberg@...nel.org, rientjes@...gle.com,
iamjoonsoo kim <iamjoonsoo.kim@....com>,
akpm@...ux-foundation.org, vbabka@...e.cz
Subject: Re: [RFC net-next 2/2] ipv6: ioam: Support for Buffer occupancy
data field
On Tue, 7 Dec 2021 19:05:13 +0100 (CET) Justin Iurman wrote:
> On Dec 7, 2021, at 6:07 PM, Jakub Kicinski kuba@...nel.org wrote:
> > Hm, reading thru the quoted portion of the standard from the commit
> > message the semantics of the field are indeed pretty disappointing.
> > What's the value of defining a field in a standard if it's entirely
> > implementation specific? Eh.
>
> True. But keep also in mind the scope of IOAM which is not to be
> deployed widely on the Internet. It is deployed on limited (aka private)
> domains where each node is therefore managed by the operator. So, I'm
> not really sure why you think that the implementation specific thing is
> a problem here. Context of "unit" is provided by the IOAM Namespace-ID
> attached to the trace, as well as each Node-ID if included. Again, it's
> up to the operator to interpret values accordingly, depending on each
> node (i.e., the operator has a large and detailed view of his domain; he
> knows if the buffer occupancy value "X" is abnormal or not for a
> specific node, he knows which unit is used for a specific node, etc).
It's quite likely I'm missing the point.
> >> We probably want the metadata included for accuracy as well (e.g.,
> >> kmem_cache_size vs new function kmem_cache_full_size).
> >
> > Does the standard support carrying arbitrary metadata?
>
> It says:
>
> "This field indicates the current status of the occupancy of the
> common buffer pool used by a set of queues."
>
> So, as long as metadata are part of it, I'd say yes it does, since bytes
> are allocated for that too. Does it make sense?
Indeed, but see below.
> > Anyway, in general I personally don't have a good feeling about
> > implementing this field. Would be good to have a clear user who
> > can justify the choice of slab vs something else. Wouldn't modern
> > deployments use some form of streaming telemetry for nodes within
> > the same domain of control? I'm not sure I understand the value
> > of limited slab info in OAM when there's probably a more powerful
> > metric collection going on.
>
> Do you believe this patch does not provide what is defined in the spec?
> If so, I'm open to any suggestions.
The opposite, in a sense. I think the patch does implement behavior
within a reasonable interpretation of the standard. But the feature
itself seems more useful for forwarding ASICs than Linux routers,
because Linux routers can run a full telemetry stack and all sort
of advanced SW instrumentation. The use case for reporting kernel
memory use via IOAM's constrained interface does not seem particularly
practical since it's not providing a very strong signal on what's
going on.
For switches running Linux the switch ASIC buffer occupancy can be read
via devlink-sb that'd seem like a better fit for me, but unfortunately
the devlink calls can sleep so we can't read such device info from the
datapath.
> > Patch 1 makes perfect sense, FWIW.
>
> Thanks for (all) the feedback, Jakub, I appreciate it.
Powered by blists - more mailing lists