Message-ID: <20211209163828.223815bd@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Thu, 9 Dec 2021 16:38:28 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     Justin Iurman <justin.iurman@...ege.be>
Cc:     netdev@...r.kernel.org, davem@...emloft.net, dsahern@...nel.org,
        yoshfuji@...ux-ipv6.org, linux-mm@...ck.org, cl@...ux.com,
        penberg@...nel.org, rientjes@...gle.com,
        iamjoonsoo kim <iamjoonsoo.kim@....com>,
        akpm@...ux-foundation.org, vbabka@...e.cz,
        Roopa Prabhu <roopa@...dia.com>,
        Nikolay Aleksandrov <nikolay@...dia.com>,
        Andrew Lunn <andrew@...n.ch>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Vladimir Oltean <olteanv@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Florian Westphal <fw@...len.de>,
        Paolo Abeni <pabeni@...hat.com>
Subject: Re: [RFC net-next 2/2] ipv6: ioam: Support for Buffer occupancy
 data field

On Thu, 9 Dec 2021 15:10:24 +0100 (CET) Justin Iurman wrote:
> > because Linux routers can run a full telemetry stack and all sorts
> > of advanced SW instrumentation. The use case for reporting kernel
> > memory use via IOAM's constrained interface does not seem particularly
> > practical since it's not providing a very strong signal on what's
> > going on.  
> 
> I agree and disagree. I disagree because this value definitely tells you
> that something (potentially bad) is going on, when it increases
> significantly enough to reach a critical threshold. Basically, we need
> more skb's, but oh, the pool is exhausted. OK, not a problem, expand the
> pool. Oh wait, no memory left. Why? Is it only due to too much
> (temporary?) load? Should I put the blame on the NIC? Is it a memory
> issue? Is it something else? Or maybe several issues combined? Well, you
> might not know exactly why (though you know there is a problem), which is
> also why I agree with you. But, this is also why you have other data
> fields available (i.e., detecting a problem might require 2+ symptoms
> instead of just one).
> 
> > For switches running Linux the switch ASIC buffer occupancy can be read
> > via devlink-sb, which seems like a better fit to me, but unfortunately
> > the devlink calls can sleep so we can't read such device info from the
> > datapath.  
> 
> Indeed, would be a better fit. I didn't know about this one, thanks for
> that. It's a shame it can't be used in this context, though. But, at the
> end of the day, we're left with nothing regarding buffer occupancy. So
> I'm wondering if "something" is not better than "nothing" in this case.
> And, for that, we're back to my previous answer on why I agree and
> disagree with what you said about its utility.

I think we're on the same page; the main problem is that I've not seen
anyone use the skbuff_head_cache occupancy as a signal in practice.

I'm adding a bunch of people to the CC list, hopefully someone has
an opinion one way or the other.

Lore link to the full thread, FWIW:

https://lore.kernel.org/all/20211206211758.19057-1-justin.iurman@uliege.be/
