lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 20 Feb 2024 12:59:23 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Kent Overstreet <kent.overstreet@...ux.dev>, Steven Rostedt <rostedt@...dmis.org>, 
	Michal Hocko <mhocko@...e.com>, akpm@...ux-foundation.org, hannes@...xchg.org, 
	roman.gushchin@...ux.dev, mgorman@...e.de, dave@...olabs.net, 
	willy@...radead.org, liam.howlett@...cle.com, corbet@....net, 
	void@...ifault.com, peterz@...radead.org, juri.lelli@...hat.com, 
	catalin.marinas@....com, will@...nel.org, arnd@...db.de, tglx@...utronix.de, 
	mingo@...hat.com, dave.hansen@...ux.intel.com, x86@...nel.org, 
	peterx@...hat.com, david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, 
	masahiroy@...nel.org, nathan@...nel.org, dennis@...nel.org, tj@...nel.org, 
	muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org, 
	pasha.tatashin@...een.com, yosryahmed@...gle.com, yuzhao@...gle.com, 
	dhowells@...hat.com, hughd@...gle.com, andreyknvl@...il.com, 
	keescook@...omium.org, ndesaulniers@...gle.com, vvvvvv@...gle.com, 
	gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com, 
	vincent.guittot@...aro.org, dietmar.eggemann@....com, bsegall@...gle.com, 
	bristot@...hat.com, vschneid@...hat.com, cl@...ux.com, penberg@...nel.org, 
	iamjoonsoo.kim@....com, 42.hyeyoo@...il.com, glider@...gle.com, 
	elver@...gle.com, dvyukov@...gle.com, shakeelb@...gle.com, 
	songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com, 
	minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com, 
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, 
	iommu@...ts.linux.dev, linux-arch@...r.kernel.org, 
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, 
	linux-modules@...r.kernel.org, kasan-dev@...glegroups.com, 
	cgroups@...r.kernel.org, Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Subject: Re: [PATCH v3 31/35] lib: add memory allocations report in show_mem()

On Tue, Feb 20, 2024 at 10:27 AM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 2/19/24 18:17, Suren Baghdasaryan wrote:
> > On Thu, Feb 15, 2024 at 3:56 PM Kent Overstreet
> > <kent.overstreet@...ux.dev> wrote:
> >>
> >> On Thu, Feb 15, 2024 at 06:27:29PM -0500, Steven Rostedt wrote:
> >> > All this, and we are still worried about 4k for useful debugging :-/
> >
> > I was planning to refactor this function to print one record at a time
> > with a smaller buffer but after discussing with Kent, he has plans to
> > reuse this function and having the report in one buffer is needed for
> > that.
>
> We are printing to console, AFAICS all the code involved uses plain printk()
> I think it would be way easier to have a function using printk() for this
> use case than the seq_buf which is more suitable for /proc and friends. Then
> all concerns about buffers would be gone. It wouldn't be that much of a code
> duplication?

Ok, after discussing this with Kent, I'll change this patch to provide
a function returning N top consumers (the array and N will be provided
by the caller) and then we can print one record at a time with much
less memory needed. That should address reusability concerns, will use
memory more efficiently and will allow for more flexibility (more/less
than 10 records if needed).
Thanks for the feedback, everyone!

>
> >> Every additional 4k still needs justification. And whether we burn a
> >> reserve on this will have no observable effect on user output in
> >> remotely normal situations; if this allocation ever fails, we've already
> >> been in an OOM situation for awhile and we've already printed out this
> >> report many times, with less memory pressure where the allocation would
> >> have succeeded.
> >
> > I'm not sure this claim will always be true, specifically in the case
> > of low-end devices with relatively low amounts of reserves and in the
>
> That's right, GFP_ATOMIC failures can easily happen without prior OOMs.
> Consider a system where userspace allocations fill the memory as they
> usually do, up to high watermark. Then a burst of packets is received and
> handled by GFP_ATOMIC allocations that deplete the reserves and can't cause
> OOMs (OOM is when we fail to reclaim anything, but we are allocating from a
> context that can't reclaim), so the very first report would be an GFP_ATOMIC
> failure and now it can't allocate that buffer for printing.
>
> I'm sure more such scenarios exist, Cc: Tetsuo who I recall was an expert on
> this topic.
>
> > presence of a possible quick memory usage spike. We should also
> > consider a case when panic_on_oom is set. All we get is one OOM
> > report, so we get only one chance to capture this report. In any case,
> > I don't yet have data to prove or disprove this claim but it will be
> > interesting to test it with data from the field once the feature is
> > deployed.
> >
> > For now I think with Vlastimil's __GFP_NOWARN suggestion the code
> > becomes safe and the only risk is to lose this report. If we get cases
> > with reports missing this data, we can easily change to reserved
> > memory.
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ