linux-kernel - Re: [PATCH v3 31/35] lib: add memory allocations report in show

Message-ID: <e017b7bc-d747-46e6-a89d-4ce558ed79b0@suse.cz>
Date: Tue, 20 Feb 2024 19:27:26 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Suren Baghdasaryan <surenb@...gle.com>,
 Kent Overstreet <kent.overstreet@...ux.dev>
Cc: Steven Rostedt <rostedt@...dmis.org>, Michal Hocko <mhocko@...e.com>,
 akpm@...ux-foundation.org, hannes@...xchg.org, roman.gushchin@...ux.dev,
 mgorman@...e.de, dave@...olabs.net, willy@...radead.org,
 liam.howlett@...cle.com, corbet@....net, void@...ifault.com,
 peterz@...radead.org, juri.lelli@...hat.com, catalin.marinas@....com,
 will@...nel.org, arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
 dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com,
 david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
 nathan@...nel.org, dennis@...nel.org, tj@...nel.org, muchun.song@...ux.dev,
 rppt@...nel.org, paulmck@...nel.org, pasha.tatashin@...een.com,
 yosryahmed@...gle.com, yuzhao@...gle.com, dhowells@...hat.com,
 hughd@...gle.com, andreyknvl@...il.com, keescook@...omium.org,
 ndesaulniers@...gle.com, vvvvvv@...gle.com, gregkh@...uxfoundation.org,
 ebiggers@...gle.com, ytcoode@...il.com, vincent.guittot@...aro.org,
 dietmar.eggemann@....com, bsegall@...gle.com, bristot@...hat.com,
 vschneid@...hat.com, cl@...ux.com, penberg@...nel.org,
 iamjoonsoo.kim@....com, 42.hyeyoo@...il.com, glider@...gle.com,
 elver@...gle.com, dvyukov@...gle.com, shakeelb@...gle.com,
 songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com,
 minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 iommu@...ts.linux.dev, linux-arch@...r.kernel.org,
 linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
 linux-modules@...r.kernel.org, kasan-dev@...glegroups.com,
 cgroups@...r.kernel.org, Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Subject: Re: [PATCH v3 31/35] lib: add memory allocations report in show_mem()

On 2/19/24 18:17, Suren Baghdasaryan wrote:
> On Thu, Feb 15, 2024 at 3:56 PM Kent Overstreet
> <kent.overstreet@...ux.dev> wrote:
>>
>> On Thu, Feb 15, 2024 at 06:27:29PM -0500, Steven Rostedt wrote:
>> > All this, and we are still worried about 4k for useful debugging :-/
> 
> I was planning to refactor this function to print one record at a time
> with a smaller buffer but after discussing with Kent, he has plans to
> reuse this function and having the report in one buffer is needed for
> that.

We are printing to the console, and AFAICS all the code involved uses plain
printk(). I think it would be way easier to have a function using printk() for
this use case than seq_buf, which is more suitable for /proc and friends. Then
all concerns about buffers would be gone. It wouldn't be that much code
duplication, would it?
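
To illustrate the per-record idea, here is a minimal userspace sketch, not
kernel code. It assumes a hypothetical struct alloc_record and a hypothetical
helper report_allocs(); fprintf() merely stands in for printk(). The point is
only that each line is emitted as it is produced, so no report-sized buffer
has to be allocated at failure time.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical record type for illustration; the patch series has its own. */
struct alloc_record {
	const char *site;	/* allocation site, e.g. "file.c:123 func:foo" */
	unsigned long bytes;	/* bytes currently allocated from this site */
	unsigned long calls;	/* number of live allocations from this site */
};

/*
 * Emit one line per record directly to the output stream.
 * fprintf() stands in for printk(); nothing is accumulated in a buffer.
 * Returns the number of records successfully printed.
 */
static size_t report_allocs(FILE *out, const struct alloc_record *recs,
			    size_t n)
{
	size_t printed = 0;

	for (size_t i = 0; i < n; i++) {
		if (fprintf(out, "%12lu %8lu %s\n",
			    recs[i].bytes, recs[i].calls, recs[i].site) < 0)
			break;	/* stop on the first output error */
		printed++;
	}
	return printed;
}
```

Calling report_allocs(stdout, recs, n) on an array of n records prints n
lines and needs no allocation of its own, which is the property that matters
when the report is triggered by an allocation failure.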

>> Every additional 4k still needs justification. And whether we burn a
>> reserve on this will have no observable effect on user output in
>> remotely normal situations; if this allocation ever fails, we've already
>> been in an OOM situation for awhile and we've already printed out this
>> report many times, with less memory pressure where the allocation would
>> have succeeded.
> 
> I'm not sure this claim will always be true, specifically in the case
> of low-end devices with relatively low amounts of reserves and in the

That's right, GFP_ATOMIC failures can easily happen without prior OOMs.
Consider a system where userspace allocations fill the memory as they
usually do, up to the high watermark. Then a burst of packets is received and
handled by GFP_ATOMIC allocations that deplete the reserves and can't cause
OOMs (OOM is when we fail to reclaim anything, but we are allocating from a
context that can't reclaim), so the very first report would be a GFP_ATOMIC
failure, and at that point the buffer for printing can't be allocated either.
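
The scenario can be sketched with a toy userspace model. Everything here is
an assumption made for illustration: struct toy_zone, toy_alloc(),
burst_then_report() and the two thresholds are hypothetical and do not
reproduce the kernel's actual watermark arithmetic. The model only captures
the two rules stated above: atomic allocations may dip into reserves, and
only a context that can reclaim can ever reach an OOM.

```c
#include <stdbool.h>

/* Toy model only; fields and rules are illustrative assumptions. */
struct toy_zone {
	long free_pages;
	long min_wmark;		/* reclaim-capable allocations stop here */
	long atomic_floor;	/* atomic allocations may dip down to here */
	int oom_events;		/* counted only for reclaim-capable contexts */
};

enum toy_ctx { TOY_CAN_RECLAIM, TOY_ATOMIC };

static bool toy_alloc(struct toy_zone *z, long pages, enum toy_ctx ctx)
{
	long floor = (ctx == TOY_ATOMIC) ? z->atomic_floor : z->min_wmark;

	if (z->free_pages - pages >= floor) {
		z->free_pages -= pages;
		return true;
	}
	/*
	 * Only a context that can reclaim may declare OOM. The model
	 * pessimistically assumes reclaim finds nothing to free.
	 */
	if (ctx == TOY_CAN_RECLAIM)
		z->oom_events++;
	return false;
}

/*
 * Handle a burst of one-page atomic allocations. On the first failure,
 * try to allocate a one-page report buffer, also atomically.
 * Returns true if that buffer allocation failed with no OOM ever recorded.
 */
static bool burst_then_report(struct toy_zone *z, int packets)
{
	for (int i = 0; i < packets; i++) {
		if (!toy_alloc(z, 1, TOY_ATOMIC))
			return !toy_alloc(z, 1, TOY_ATOMIC) &&
			       z->oom_events == 0;
	}
	return false;	/* burst was absorbed, nothing to report */
}
```

With a zone that starts slightly above min_wmark, a long enough burst drains
free_pages to atomic_floor, the first failing allocation is atomic, and the
report buffer allocation fails too, all with oom_events still at zero.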

I'm sure more such scenarios exist. Cc: Tetsuo, who I recall was an expert on
this topic.

> presence of a possible quick memory usage spike. We should also
> consider a case when panic_on_oom is set. All we get is one OOM
> report, so we get only one chance to capture this report. In any case,
> I don't yet have data to prove or disprove this claim but it will be
> interesting to test it with data from the field once the feature is
> deployed.
> 
> For now I think with Vlastimil's __GFP_NOWARN suggestion the code
> becomes safe and the only risk is to lose this report. If we get cases
> with reports missing this data, we can easily change to reserved
> memory.