Message-ID: <2a6fa9d6-53b8-93cd-16c8-309ce2b8e3ac@suse.cz>
Date: Wed, 7 Jun 2023 10:58:40 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Lorenzo Stoakes <lstoakes@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Baoquan He <bhe@...hat.com>,
Uladzislau Rezki <urezki@...il.com>,
Christoph Hellwig <hch@...radead.org>,
Bagas Sanjaya <bagasdotme@...il.com>,
Linux btrfs <linux-btrfs@...r.kernel.org>,
Linux Regressions <regressions@...ts.linux.dev>,
Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
David Sterba <dsterba@...e.com>, a1bert@...as.cz,
Forza <forza@...nline.net>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Song Liu <song@...nel.org>,
Nicholas Piggin <npiggin@...il.com>,
Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH] mm/vmalloc: do not output a spurious warning when huge
vmalloc() fails
On 6/6/23 09:40, Lorenzo Stoakes wrote:
> On Tue, Jun 06, 2023 at 09:13:24AM +0200, Vlastimil Babka wrote:
>>
>> On 6/5/23 22:11, Lorenzo Stoakes wrote:
>>> In __vmalloc_area_node() we always warn_alloc() when an allocation
>>> performed by vm_area_alloc_pages() fails unless it was due to a pending
>>> fatal signal.
>>>
>>> However, huge page allocations instigated either by vmalloc_huge() or
>>> __vmalloc_node_range() (or a caller that invokes this like kvmalloc() or
>>> kvmalloc_node()) always fall back to order-0 allocations if the huge page
>>> allocation fails.
>>>
>>> This renders the warning useless and noisy, especially as all callers
>>> appear to be aware that this may fall back. This has already resulted in at
>>> least one bug report from a user who was confused by this (see link).
>>>
>>> Therefore, simply update the code to only output this warning for order-0
>>> pages when no fatal signal is pending.
>>>
>>> Link: https://bugzilla.suse.com/show_bug.cgi?id=1211410
>>> Signed-off-by: Lorenzo Stoakes <lstoakes@...il.com>
>>
>> I think there are more reports of the same thing from the btrfs context that
>> appear to be a 6.3 regression:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=217466
>> Link: https://lore.kernel.org/all/efa04d56-cd7f-6620-bca7-1df89f49bf4b@gmail.com/
>>
>> If this indeed helps, it would make sense to Cc: stable here. Although I
>> don't see what caused the regression, the warning itself is not new, so is
>> it a new source of order-9 attempts in vmalloc() or new reasons why order-9
>> pages would not be possible to allocate?
>
> Linus updated kvmalloc() to use huge vmalloc() allocations in 9becb6889130
> ("kvmalloc: use vmalloc_huge for vmalloc allocations") and Song updated
> alloc_large_system_hash() to do so as well in f2edd118d02d ("page_alloc: use
> vmalloc_huge for large system hash") both of which are ~1y old, however
> these would impact ~5.18, so it's weird to see reports citing 6.2 -> 6.3.
>
> Will dig to see if something else changed that would increase the
> prevalence of this.

I think I found the commit from 6.3 that effectively exposed this warning.
As this is a tracked regression I would really suggest moving the fix to
mm-hotfixes instead of mm-unstable, and adding:

Fixes: 80b1d8fdfad1 ("mm: vmalloc: correct use of __GFP_NOWARN mask in __vmalloc_area_node()")
Cc: <stable@...r.kernel.org>

> Also while we're here, ugh at us immediately splitting the non-compound
> (also ugh) huge page. Nicholas explains why in the patch that introduces it
> - 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split
> rather than compound") - but it'd be nice if we could find a way to avoid
> this.
>
> If only there were a data type (perhaps beginning with 'f') that abstracted
> the order of the page entirely and could be guaranteed to always be the one
> with which you manipulated ref count, etc... ;)
>
>>
>>> ---
>>> mm/vmalloc.c | 17 +++++++++++++----
>>> 1 file changed, 13 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index ab606a80f475..e563f40ad379 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -3149,11 +3149,20 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>>>  	 * allocation request, free them via vfree() if any.
>>>  	 */
>>>  	if (area->nr_pages != nr_small_pages) {
>>> -		/* vm_area_alloc_pages() can also fail due to a fatal signal */
>>> -		if (!fatal_signal_pending(current))
>>> +		/*
>>> +		 * vm_area_alloc_pages() can fail due to insufficient memory but
>>> +		 * also:-
>>> +		 *
>>> +		 * - a pending fatal signal
>>> +		 * - insufficient huge page-order pages
>>> +		 *
>>> +		 * Since we always retry allocations at order-0 in the huge page
>>> +		 * case a warning for either is spurious.
>>> +		 */
>>> +		if (!fatal_signal_pending(current) && page_order == 0)
>>>  			warn_alloc(gfp_mask, NULL,
>>> -				"vmalloc error: size %lu, page order %u, failed to allocate pages",
>>> -				area->nr_pages * PAGE_SIZE, page_order);
>>> +				"vmalloc error: size %lu, failed to allocate pages",
>>> +				area->nr_pages * PAGE_SIZE);
>>>  		goto fail;
>>>  	}
>>>
>>
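
For readers following along, here is a rough userspace sketch of the warning
decision the patch introduces. It is not kernel code: struct alloc_attempt,
should_warn(), alloc_with_fallback() and self_check() are hypothetical names
made up for illustration, the booleans stand in for fatal_signal_pending(current)
and the area->nr_pages != nr_small_pages check, and order 9 is only an assumed
huge page order taken from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one pass through __vmalloc_area_node(). */
struct alloc_attempt {
	unsigned int page_order;	/* 0 means base pages */
	bool fatal_signal_pending;	/* models fatal_signal_pending(current) */
	bool got_all_pages;		/* models area->nr_pages == nr_small_pages */
};

/* Patched condition: warn only for an order-0 failure with no fatal signal. */
bool should_warn(const struct alloc_attempt *a)
{
	if (a->got_all_pages)
		return false;
	return !a->fatal_signal_pending && a->page_order == 0;
}

/*
 * Models the fallback described in the commit message: try the huge order
 * first, then retry at order-0. Returns the number of warnings emitted.
 */
int alloc_with_fallback(bool huge_ok, bool small_ok, bool signal_pending)
{
	struct alloc_attempt huge = { 9, signal_pending, huge_ok };
	struct alloc_attempt small = { 0, signal_pending, small_ok };
	int warnings = 0;

	if (huge.got_all_pages)
		return 0;
	warnings += should_warn(&huge);		/* never warns: order != 0 */
	warnings += should_warn(&small);	/* warns only on a real failure */
	return warnings;
}

/* Small usage example exercising the four interesting cases. */
void self_check(void)
{
	assert(alloc_with_fallback(true, false, false) == 0);	/* huge succeeded */
	assert(alloc_with_fallback(false, true, false) == 0);	/* fallback succeeded */
	assert(alloc_with_fallback(false, false, false) == 1);	/* genuine failure */
	assert(alloc_with_fallback(false, false, true) == 0);	/* fatal signal */
}
```

In this model a failed huge attempt followed by a successful order-0 retry
produces no warning at all, which is the noise the patch aims to remove, while
a genuine order-0 failure still warns exactly once.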