lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <630b4379-751a-4bf1-a249-f2e051ec77d6@suse.cz>
Date: Wed, 2 Jul 2025 09:30:30 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>, Zi Yan
 <ziy@...dia.com>, Barry Song <baohua@...nel.org>,
 Carlos Maiolino <cem@...nel.org>, linux-xfs@...r.kernel.org,
 Dave Chinner <david@...morbit.com>
Cc: syzbot <syzbot+359a67b608de1ef72f65@...kaller.appspotmail.com>,
 akpm@...ux-foundation.org, apopple@...dia.com, byungchul@...com,
 david@...hat.com, gourry@...rry.net, joshua.hahnjy@...il.com,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, matthew.brost@...el.com,
 rakie.kim@...com, syzkaller-bugs@...glegroups.com,
 ying.huang@...ux.alibaba.com, Harry Yoo <harry.yoo@...cle.com>,
 Michal Hocko <mhocko@...e.com>, Matthew Wilcox <willy@...radead.org>
Subject: Re: [syzbot] [mm?] WARNING in xfs_init_fs_context

+CC xfs and few more

On 7/2/25 3:41 AM, Tetsuo Handa wrote:
> On 2025/07/02 0:01, Zi Yan wrote:
>>>  __alloc_frozen_pages_noprof+0x319/0x370 mm/page_alloc.c:4972
>>>  alloc_pages_mpol+0x232/0x4a0 mm/mempolicy.c:2419
>>>  alloc_slab_page mm/slub.c:2451 [inline]
>>>  allocate_slab+0xe2/0x3b0 mm/slub.c:2627
>>>  new_slab mm/slub.c:2673 [inline]
>>
>> new_slab() allows __GFP_NOFAIL, since GFP_RECLAIM_MASK has it.
>> In allocate_slab(), the first allocation without __GFP_NOFAIL
>> failed, the retry used __GFP_NOFAIL but kmem_cache order
>> was greater than 1, which led to the warning above.
>>
>> Maybe allocate_slab() should just fail when kmem_cache
>> order is too big and first trial fails? I am no expert,
>> so add Vlastimil for help.

Thanks Zi. Slab shouldn't fail with __GFP_NOFAIL, that would only lead
to subsystems like xfs to reintroduce their own forever retrying
wrappers again. I think it's going the best it can for the fallback
attempt by using the minimum order, so the warning will never happen due
to the calculated optimal order being too large, but only if the
kmalloc()/kmem_cache_alloc() requested/object size is too large itself.

Hm but perhaps enabling slab_debug can inflate it over the threshold, is
it the case here? I think in that rare case we could convert such
fallback allocations to large kmalloc to avoid adding the debugging
overhead - we can't easily create an individual slab page without the
debugging layout for a kmalloc cache with debugging enabled.

>> Barry, who added the nofail
>> warning is cc’d.

Barry's commit 903edea6c53f0 reorganized the warnings, but it existed
already long before.

> Indeed. In allocate_slab(struct kmem_cache *s, gfp_t flags, int node),
> 
> 	/*
> 	 * Let the initial higher-order allocation fail under memory pressure
> 	 * so we fall-back to the minimum order allocation.
> 	 */
> 	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> 	if ((alloc_gfp & __GFP_DIRECT_RECLAIM) && oo_order(oo) > oo_order(s->min))
> 		alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_RECLAIM;
> 
> 	slab = alloc_slab_page(alloc_gfp, node, oo);
> 	if (unlikely(!slab)) {
> 		oo = s->min;
> 		alloc_gfp = flags;
> 		/*
> 		 * Allocation may have failed due to fragmentation.
> 		 * Try a lower order alloc if possible
> 		 */
> 		slab = alloc_slab_page(alloc_gfp, node, oo);
> 
> __GFP_NOFAIL needs to be dropped unless s->min is either 0 or 1.

No, that would violate __GFP_NOFAIL semantics.

> 
> 		if (unlikely(!slab))
> 			return NULL;
> 		stat(s, ORDER_FALLBACK);
> 	}
> 
> 
> 
> By the way, why is xfs_init_fs_context() using __GFP_NOFAIL ?
> 
> 	mp = kzalloc(sizeof(struct xfs_mount), GFP_KERNEL | __GFP_NOFAIL);
> 	if (!mp)
> 		return -ENOMEM;
> 
> This looks an allocation attempt which can fail safely.

Indeed. Dave Chinner's commit f078d4ea82760 ("xfs: convert kmem_alloc()
to kmalloc()") dropped the xfs wrapper. This allocation didn't use
KM_MAYFAIL so it got __GFP_NOFAIL. The commit mentions this high-order
nofail issue for another allocation site that had to use xlog_kvmalloc().

I think either this allocation really can fail as the code (return
-ENOMEM) suggests and thus can drop __GFP_NOFAIL, or it can use
kvmalloc() - I think the wrapper for that can be removed now too after
the discussion in [1] resulted in commit 46459154f997 ("mm: kvmalloc:
make kmalloc fast path real fast path").

[1] https://lore.kernel.org/all/Z_XI6vBE8v_cIhjZ@dread.disaster.area/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ