Message-ID: <2a6fa9d6-53b8-93cd-16c8-309ce2b8e3ac@suse.cz>
Date:   Wed, 7 Jun 2023 10:58:40 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Lorenzo Stoakes <lstoakes@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Baoquan He <bhe@...hat.com>,
        Uladzislau Rezki <urezki@...il.com>,
        Christoph Hellwig <hch@...radead.org>,
        Bagas Sanjaya <bagasdotme@...il.com>,
        Linux btrfs <linux-btrfs@...r.kernel.org>,
        Linux Regressions <regressions@...ts.linux.dev>,
        Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>, a1bert@...as.cz,
        Forza <forza@...nline.net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Song Liu <song@...nel.org>,
        Nicholas Piggin <npiggin@...il.com>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH] mm/vmalloc: do not output a spurious warning when huge
 vmalloc() fails


On 6/6/23 09:40, Lorenzo Stoakes wrote:
> On Tue, Jun 06, 2023 at 09:13:24AM +0200, Vlastimil Babka wrote:
>>
>> On 6/5/23 22:11, Lorenzo Stoakes wrote:
>>> In __vmalloc_area_node() we always warn_alloc() when an allocation
>>> performed by vm_area_alloc_pages() fails unless it was due to a pending
>>> fatal signal.
>>>
>>> However, huge page allocations instigated either by vmalloc_huge() or
>>> __vmalloc_node_range() (or a caller that invokes this like kvmalloc() or
>>> kvmalloc_node()) always fall back to order-0 allocations if the huge page
>>> allocation fails.
>>>
>>> This renders the warning useless and noisy, especially as all callers
>>> appear to be aware that this may fall back. This has already resulted in at
>>> least one bug report from a user who was confused by this (see link).
>>>
>>> Therefore, simply update the code to only output this warning for order-0
>>> pages when no fatal signal is pending.
>>>
>>> Link: https://bugzilla.suse.com/show_bug.cgi?id=1211410
>>> Signed-off-by: Lorenzo Stoakes <lstoakes@...il.com>
>>
>> I think there are more reports of the same thing from the btrfs context,
>> which appears to be a 6.3 regression:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=217466
>> Link: https://lore.kernel.org/all/efa04d56-cd7f-6620-bca7-1df89f49bf4b@gmail.com/
>>
>> If this indeed helps, it would make sense to Cc: stable here. Although I
>> don't see what caused the regression. The warning itself is not new, so is
>> it a new source of order-9 attempts in vmalloc(), or new reasons why order-9
>> pages would not be possible to allocate?
> 
> Linus updated kvmalloc() to use huge vmalloc() allocations in 9becb6889130
> ("kvmalloc: use vmalloc_huge for vmalloc allocations") and Song updated
> alloc_large_system_hash() to do so as well in f2edd118d02d ("page_alloc: use
> vmalloc_huge for large system hash"), both of which are ~1y old. However,
> these would impact ~5.18, so it's weird to see reports citing 6.2 -> 6.3.
> 
> Will dig to see if something else changed that would increase the
> prevalence of this.

I think I found the commit from 6.3 that effectively exposed this warning.
As this is a tracked regression I would really suggest moving the fix to
mm-hotfixes instead of mm-unstable, and adding:

Fixes: 80b1d8fdfad1 ("mm: vmalloc: correct use of __GFP_NOWARN mask in __vmalloc_area_node()")
Cc: <stable@...r.kernel.org>

> Also while we're here, ugh at us immediately splitting the non-compound
> (also ugh) huge page. Nicholas explains why in the patch that introduces it
> - 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split
> rather than compound") - but it'd be nice if we could find a way to avoid
> this.
> 
> If only there were a data type (perhaps beginning with 'f') that abstracted
> the order of the page entirely and could be guaranteed to always be the one
> with which you manipulated ref count, etc... ;)
> 
>>
>>> ---
>>>  mm/vmalloc.c | 17 +++++++++++++----
>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index ab606a80f475..e563f40ad379 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -3149,11 +3149,20 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>>>  	 * allocation request, free them via vfree() if any.
>>>  	 */
>>>  	if (area->nr_pages != nr_small_pages) {
>>> -		/* vm_area_alloc_pages() can also fail due to a fatal signal */
>>> -		if (!fatal_signal_pending(current))
>>> +		/*
>>> +		 * vm_area_alloc_pages() can fail due to insufficient memory but
>>> +		 * also:-
>>> +		 *
>>> +		 * - a pending fatal signal
>>> +		 * - insufficient huge page-order pages
>>> +		 *
>>> +		 * Since we always retry allocations at order-0 in the huge page
>>> +		 * case a warning for either is spurious.
>>> +		 */
>>> +		if (!fatal_signal_pending(current) && page_order == 0)
>>>  			warn_alloc(gfp_mask, NULL,
>>> -				"vmalloc error: size %lu, page order %u, failed to allocate pages",
>>> -				area->nr_pages * PAGE_SIZE, page_order);
>>> +				"vmalloc error: size %lu, failed to allocate pages",
>>> +				area->nr_pages * PAGE_SIZE);
>>>  		goto fail;
>>>  	}
>>>
>>
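
For readers following along, the change in the warning condition can be
modelled in userspace as below. This is only a sketch: struct alloc_attempt
and both helper names are hypothetical and are not kernel APIs. The
fatal_signal field stands in for fatal_signal_pending(current), and
page_order for the order of the attempt that failed.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a failed vm_area_alloc_pages() attempt. */
struct alloc_attempt {
	bool fatal_signal;       /* stands in for fatal_signal_pending(current) */
	unsigned int page_order; /* order of the attempt that failed */
};

/* Before the patch: warn on any failure not caused by a fatal signal. */
static bool warns_before_patch(struct alloc_attempt a)
{
	return !a.fatal_signal;
}

/*
 * After the patch: additionally require page_order == 0, since a failed
 * huge-order attempt is retried at order-0 and is not a real failure yet.
 */
static bool warns_after_patch(struct alloc_attempt a)
{
	return !a.fatal_signal && a.page_order == 0;
}
```

With this model, a failed order-9 attempt with no signal pending warns under
the old condition but stays silent under the new one, while an order-0
failure with no signal pending warns in both cases.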
