Message-ID: <36f1c727-846b-0b81-192c-d2ecfce1fbf8@suse.com>
Date: Mon, 2 May 2022 09:08:16 +0200
From: Juergen Gross <jgross@...e.com>
To: Liam Howlett <liam.howlett@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: Guenter Roeck <linux@...ck-us.net>,
"maple-tree@...ts.infradead.org" <maple-tree@...ts.infradead.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Yu Zhao <yuzhao@...gle.com>
Subject: Re: [PATCH v8 23/70] mm/mmap: change do_brk_flags() to expand
existing VMA and add do_brk_munmap()
On 02.05.22 02:14, Liam Howlett wrote:
> * Andrew Morton <akpm@...ux-foundation.org> [220428 21:16]:
>> On Fri, 29 Apr 2022 00:38:50 +0000 Liam Howlett <liam.howlett@...cle.com> wrote:
>>
>>>> mm/mmap.c: In function 'do_brk_flags':
>>>> mm/mmap.c:2908:17: error: implicit declaration of function
>>>> 'khugepaged_enter_vma_merge'; did you mean 'khugepaged_enter_vma'?
>>>>
>>>> It appears that this is later fixed, but it hurts bisectability
>>>> (and prevents me from finding the actual build failure in linux-next
>>>> when trying to build corenet64_smp_defconfig).
>>>
>>> Yeah, that khugepaged_enter_vma_merge was renamed in another patch set.
>>> Andrew made the correction but kept the patch as it was. I think the
>>> suggested change is right.. if you read the commit that introduced
>>> khugepaged_enter_vma(), it seems right at least.
>>
>> Things are a bit crazy lately. Merge issues with mapletree, merge
>> issues with mglru on mapletree, me doing a bunch of retooling to start
>> publishing/merging via git, mapletree runtime issues, etc.
>>
>> I've dropped the mapletree patches again. Please scoop up all known
>> fixes and redo against the (non-rebasing) mm-stable branch at
>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>
> Okay, sounds good.
>
> I have been porting my patches over and hit a bit of a snag. It looked
> like my patches were not booting on the s390 - but not all the time. So
> I reverted back to mm-stable (059342d1dd4e) and found that also failed
> to boot sometimes on my qemu setup. When it fails it's ~4-5sec into
> booting. The last thing I see is:
>
> "[ 4.668916] Spectre V2 mitigation: execute trampolines"
>
> I've bisected back to commit e553f62f10d9 (mm, page_alloc: fix
> build_zonerefs_node())
>
> With this commit, I am unable to boot one out of three times. When
> using the previous commit I was not able to get it to hang after trying
> 10+ times. This is a qemu s390 install with KASAN on and I see no error
> messages. I think it's likely it is this patch, but not guaranteed.
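
As a rough plausibility check of the numbers quoted above: assuming
(hypothetically) that boot attempts are independent, and that the parent
commit would hang at the same one-in-three rate if it had the same
problem, the chance of seeing ten clean boots in a row would be:

```latex
P(\text{10 clean boots}) = \left(1 - \tfrac{1}{3}\right)^{10}
                         = \frac{1024}{59049} \approx 0.017
```

So ten clean boots on the parent commit would be a roughly 1.7% event
under that assumption, which fits "likely, but not guaranteed".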

This sounds like a race condition during the setup of memory zones.
I could imagine my patch is triggering this problem, but it should
not be the real root cause.

I'm no expert regarding zone setup, but I think it might help to
print some zone data when the problem happens. I have no real idea
which data is needed, but maybe someone else can help here. The
following diff should recognize the problematic case (it might show
false positives, though):

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e42038382c1..23f029f39985 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6132,6 +6132,9 @@ static int build_zonerefs_node(pg_data_t *pgdat, struct zoneref *zonerefs)
 		zone_type--;
 		zone = pgdat->node_zones + zone_type;
 		if (populated_zone(zone)) {
+			if (!managed_zone(zone)) {
+				/* Print some data regarding the zone. */
+			}
 			zoneref_set_zone(zone, &zonerefs[nr_zones++]);
 			check_highest_zone(zone_type);
 		}
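
To illustrate what the placeholder comment in the diff might print, here
is a minimal userspace sketch, not kernel code. struct zone_model and
report_unmanaged_zones() are hypothetical names made up for this
example. The fields mirror struct zone members of the same names
(managed_pages is an atomic_long_t in the kernel), and which of them are
actually useful for debugging this hang is an assumption.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the struct zone fields used here. */
struct zone_model {
	const char *name;
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
	unsigned long present_pages;
	unsigned long managed_pages;	/* atomic_long_t in the kernel */
};

/* Same meaning as the kernel helpers: nonzero present / managed pages. */
static bool populated_zone(const struct zone_model *zone)
{
	return zone->present_pages != 0;
}

static bool managed_zone(const struct zone_model *zone)
{
	return zone->managed_pages != 0;
}

/*
 * Walk the zones from highest to lowest, as build_zonerefs_node() does,
 * and print every zone that is populated but has no managed pages.
 * Returns the number of such zones.
 */
static int report_unmanaged_zones(const struct zone_model *zones,
				  int nr_zones, FILE *out)
{
	int hits = 0;

	for (int i = nr_zones - 1; i >= 0; i--) {
		const struct zone_model *zone = &zones[i];

		if (populated_zone(zone) && !managed_zone(zone)) {
			fprintf(out,
				"zone %s: start_pfn=%lu spanned=%lu present=%lu managed=%lu\n",
				zone->name, zone->zone_start_pfn,
				zone->spanned_pages, zone->present_pages,
				zone->managed_pages);
			hits++;
		}
	}
	return hits;
}
```

In the kernel the same fields would presumably be printed with pr_info()
from inside the new if block, reading managed_pages through
zone_managed_pages().
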
Juergen