linux-kernel - Re: [PATCH] mm, percpu: do not consider sleepable allocations atomic

Message-ID: <Z60VE9SJHXEtfIbw@snowbird>
Date: Wed, 12 Feb 2025 13:39:31 -0800
From: Dennis Zhou <dennis@...nel.org>
To: Tejun Heo <tj@...nel.org>
Cc: Michal Hocko <mhocko@...e.com>, Filipe Manana <fdmanana@...e.com>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm, percpu: do not consider sleepable allocations atomic

Hello,

On Wed, Feb 12, 2025 at 11:30:08AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Wed, Feb 12, 2025 at 09:53:20PM +0100, Michal Hocko wrote:
> ...
> > > Hmm... you'd be a better judge on whether that'd be okay or not but it does
> > > bother me that we might be increasing the chance of allocation failures for
> > > GFP_KERNEL users at least under memory pressure.
> > 
> > Nope, this will not change the allocation failure mode. Reclaim
> > constraints do not change the failure mode; they just change how much the
> > allocation might struggle to reclaim in order to succeed.
> >
> > My undocumented assumption (another debt on my end) is that pcp
> > allocations are not hot paths. So the worst case is that a GFP_KERNEL
> > pcp allocation could have been satisfied _easier_ (i.e. faster) because
> > it could have reclaimed fs/io caches and now it needs to rely on kswapd
> > to do that in memory-tight situations. On the other hand we have a
> > situation where NOIO/FS allocations fail prematurely, so there are
> > certainly some pros and cons.
> 
> I'm having a hard time following. Are you saying that it won't increase the
> likelihood of allocation failures even under memory pressure but that it
> might just make allocations take longer to succeed?
> 
> NOFS/IO prevents an allocation attempt from entering fs/io reclaim paths,
> right? It would still trigger kswapd for reclaim but can the allocation
> attempt wait for that to finish? If so, wouldn't that constitute a
> dependency cycle all the same?
> 
> All in all, percpu allocations taking longer under memory pressure is fine.
> Becoming more prone to allocation failures, especially for GFP_KERNEL
> callers, probably isn't great.
> 

Wait, I think I'm interpreting this change differently. This is
preventing the worker from allocating backing pages via GFP_KERNEL. It
isn't preventing an allocation via alloc_percpu() from being GFP_KERNEL
and providing those flags down to the backing page code. alloc_percpu()
for GFP_KERNEL allocations will populate the pages before returning.
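
A minimal userspace sketch of the distinction in the subject line, for
readers following along. It assumes (the diff is not quoted in this
message) that the older classification treated any request lacking the
full GFP_KERNEL mask as atomic, and that the proposal instead asks whether
the caller may sleep. The SK_* names and bit values are hypothetical and
do not match the kernel's constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real kernel constants differ. */
#define SK_DIRECT_RECLAIM 0x1u /* caller may sleep in direct reclaim */
#define SK_KSWAPD_RECLAIM 0x2u /* caller may wake kswapd */
#define SK_IO             0x4u /* reclaim may start I/O */
#define SK_FS             0x8u /* reclaim may call into the filesystem */

#define SK_RECLAIM (SK_DIRECT_RECLAIM | SK_KSWAPD_RECLAIM)
#define SK_NOWAIT  (SK_KSWAPD_RECLAIM)            /* cannot sleep */
#define SK_NOIO    (SK_RECLAIM)                   /* sleepable, no I/O */
#define SK_NOFS    (SK_RECLAIM | SK_IO)           /* sleepable, no FS */
#define SK_KERNEL  (SK_RECLAIM | SK_IO | SK_FS)   /* fully unconstrained */

/* Assumed older rule: anything short of the full mask counts as atomic. */
static bool is_atomic_old(unsigned int gfp)
{
	return (gfp & SK_KERNEL) != SK_KERNEL;
}

/* Assumed proposed rule: atomic only if the caller cannot block. */
static bool is_atomic_new(unsigned int gfp)
{
	return !(gfp & SK_DIRECT_RECLAIM);
}
```

Under these assumptions a NOFS or NOIO request is "atomic" by the older
rule even though it can sleep, while the proposed rule treats only the
NOWAIT-style request as atomic.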

I'm reading this as potentially making atomic percpu allocations fail as
we might be low on backing pages. This change makes the worker now need
to wait for kswapd to give it pages. Consequently, if there are a lot of
allocations coming in when it's low, we might burn a bit of cpu from the
worker now.
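
The concern above can be pictured with a toy model. This is only a
sketch: struct pool, its fields, and all three helpers are hypothetical
and heavily simplified, not the percpu allocator's actual data structures.
Atomic requests may only consume pages that are already populated, while a
background worker tops the reserve up with whatever pages reclaim has made
available.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the reserve of already-populated backing pages. */
struct pool {
	int populated; /* pages ready for atomic allocations */
	int low_mark;  /* worker refills until this many are populated */
};

/* Atomic path: cannot populate, so it fails once the reserve is empty. */
static bool alloc_atomic(struct pool *p)
{
	if (p->populated == 0)
		return false;
	p->populated--;
	return true;
}

/* Sleepable path: populates its own page on demand from *free_pages. */
static bool alloc_sleepable(int *free_pages)
{
	if (*free_pages == 0)
		return false;
	(*free_pages)--;
	return true;
}

/*
 * Worker: refill the reserve up to low_mark. free_pages models what
 * reclaim has handed back so far; with none available the worker makes
 * no progress and has to try again later.
 */
static int worker_refill(struct pool *p, int *free_pages)
{
	int added = 0;

	while (p->populated < p->low_mark && *free_pages > 0) {
		p->populated++;
		(*free_pages)--;
		added++;
	}
	return added;
}
```

In this model, a burst of atomic requests that drains the reserve keeps
failing until the worker obtains pages, which is the window being
described here.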

We could take the time to split out pcpu_alloc_mutex and pcpu_lock more
to provide finer-grained / concurrent allocations. But I don't currently
have a justification for it.

> > As I've said I am no pcp allocator expert so I cannot really make proper
> > judgment calls. I can improve the changelog or move from scope to
> > specific gfp flags but I do not feel like I am positioned to make deeper
> > changes to the subsystem.
> 
> I don't think deciding whether always using NOIO/FS is a good idea requires
> knowing the percpu allocator that well. It's just depending on the
> underlying page allocator for that part.
> 
> Thanks.
> 
> -- 
> tejun

Thanks,
Dennis