Message-ID: <aTABiHsL8NbGWNaL@gourry-fedora-PF4VCD3F>
Date: Wed, 3 Dec 2025 04:23:20 -0500
From: Gregory Price <gourry@...rry.net>
To: "David Hildenbrand (Red Hat)" <david@...nel.org>
Cc: Michal Hocko <mhocko@...e.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Aboorva Devarajan <aboorvad@...ux.ibm.com>, vbabka@...e.cz,
surenb@...gle.com, jackmanb@...gle.com, hannes@...xchg.org,
ziy@...dia.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Oscar Salvador <OSalvador@...e.com>,
Juan Yescas <jyescas@...gle.com>
Subject: Re: [PATCH] mm/page_alloc: make percpu_pagelist_high_fraction reads
lock-free
On Wed, Dec 03, 2025 at 10:08:55AM +0100, David Hildenbrand (Red Hat) wrote:
> On 12/3/25 10:02, Gregory Price wrote:
> >
> > My transient failure (although i'm not sure it was actually transient, i
> > killed it and retried after a few minutes and it succeeded immediately)
> > was on a ZONE_MOVABLE block.
>
> Okay, so that one should not bail out. Longterm pinnings must never end up on
> such memory, and if it happens, we have to identify why and fix it.
>
> We have this known problem of "stream of short-term pinnings" that can
> temporarily turn memory effectively unmovable. Juan will talk about that at
> LPC [1].
Nice, fun, good topic. Looking forward to Japan n_n
>
> We have another set of problematic cases (vmsplice(), fuse) but I would
> assume that these are not the cases you are hitting.
>
We do use fuse, but this system was relatively quiet when i tried this.
We do have some proactive reclaim / demotion going on, but i don't think
it was that (see below).
> >
> > Kind of suggested to me there was some bad condition that resolved once I
> > took a second to release the lock and try again.
>
> Hard to tell I'm afraid. Do you still have the dump_folio() calls we print
> when migration fails?
>
What luck, I do! :D
And i just noticed it's the same page over and over
[ 3404.119270] migrating pfn c06f176 failed ret:1
[ 3404.129152] page: refcount:4 mapcount:0 mapping:0000000061ca20ba index:0xad28e5b pfn:0xc06f176
[ 3404.148284] memcg:ffff88842e855000
[ 3404.155834] aops:btree_aops ino:1
[ 3404.163193] flags: 0x17ffff066c00420c(referenced|uptodate|workingset|private|node=1|zone=3|lastcpupid=0x1ffff)
[ 3404.185408] raw: 17ffff066c00420c ffffc90066a13ca0 ffffc90066a13ca0 ffff88812b8502f8
[ 3404.202603] raw: 000000000ad28e5b ffff888859fd42d0 00000004ffffffff ffff88842e855000
[ 3404.219779] page dumped because: migration failure
[ 3404.230610] migrating pfn c06f176 failed ret:1
[ 3404.240483] page: refcount:4 mapcount:0 mapping:0000000061ca20ba index:0xad28e5b pfn:0xc06f176
[ 3404.259603] memcg:ffff88842e855000
[ 3404.267152] aops:btree_aops ino:1
[ 3404.274511] flags: 0x17ffff066c00420c(referenced|uptodate|workingset|private|node=1|zone=3|lastcpupid=0x1ffff)
[ 3404.296716] raw: 17ffff066c00420c ffffc90066a13ca0 ffffc90066a13ca0 ffff88812b8502f8
[ 3404.313909] raw: 000000000ad28e5b ffff888859fd42d0 00000004ffffffff ffff88842e855000
[ 3404.331102] page dumped because: migration failure
[ 3404.341778] migrating pfn c06f176 failed ret:1
[ 3404.351658] page: refcount:4 mapcount:0 mapping:0000000061ca20ba index:0xad28e5b pfn:0xc06f176
[ 3404.370781] memcg:ffff88842e855000
[ 3404.378331] aops:btree_aops ino:1
[ 3404.385687] flags: 0x17ffff066c00420c(referenced|uptodate|workingset|private|node=1|zone=3|lastcpupid=0x1ffff)
[ 3404.407895] raw: 17ffff066c00420c ffffc90066a13ca0 ffffc90066a13ca0 ffff88812b8502f8
[ 3404.425073] raw: 000000000ad28e5b ffff888859fd42d0 00000004ffffffff ffff88842e855000
[ 3404.442264] page dumped because: migration failure
[ 3404.452928] migrating pfn c06f176 failed ret:1
[ 3404.462809] page: refcount:4 mapcount:0 mapping:0000000061ca20ba index:0xad28e5b pfn:0xc06f176
[ 3404.481948] memcg:ffff88842e855000
[ 3404.489511] aops:btree_aops ino:1
[ 3404.496899] flags: 0x17ffff066c00420c(referenced|uptodate|workingset|private|node=1|zone=3|lastcpupid=0x1ffff)
[ 3404.519128] raw: 17ffff066c00420c ffffc90066a13ca0 ffffc90066a13ca0 ffff88812b8502f8
[ 3404.536332] raw: 000000000ad28e5b ffff888859fd42d0 00000004ffffffff ffff88842e855000
[ 3404.553534] page dumped because: migration failure
[ 3404.564200] migrating pfn c06f176 failed ret:1
[ 3404.574077] page: refcount:4 mapcount:0 mapping:0000000061ca20ba index:0xad28e5b pfn:0xc06f176
[ 3404.593208] memcg:ffff88842e855000
[ 3404.600769] aops:btree_aops ino:1
[ 3404.608138] flags: 0x17ffff066c00420c(referenced|uptodate|workingset|private|node=1|zone=3|lastcpupid=0x1ffff)
[ 3404.630355] raw: 17ffff066c00420c ffffc90066a13ca0 ffffc90066a13ca0 ffff88812b8502f8
[ 3404.647558] raw: 000000000ad28e5b ffff888859fd42d0 00000004ffffffff ffff88842e855000
[ 3404.664761] page dumped because: migration failure
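
To check the "same page over and over" observation against a longer log, a minimal sketch along these lines could tally the failures per pfn. It assumes the dmesg output has been saved as text and that the failure lines keep the "migrating pfn <hex> failed ret:<n>" shape seen above; the name tally_failures and the sample string are illustrative only and not part of any kernel tooling.

```python
import re
from collections import Counter

# Matches lines such as "[ 3404.119270] migrating pfn c06f176 failed ret:1".
# The line format is an assumption taken from the dump above.
FAIL_RE = re.compile(r"migrating pfn ([0-9a-f]+) failed ret:(-?\d+)")


def tally_failures(log_text):
    """Count migration-failure lines per pfn (pfn is parsed as hex)."""
    counts = Counter()
    for match in FAIL_RE.finditer(log_text):
        counts[int(match.group(1), 16)] += 1
    return counts


# Three of the failure lines from the dump above, used as sample input.
sample = """\
[ 3404.119270] migrating pfn c06f176 failed ret:1
[ 3404.230610] migrating pfn c06f176 failed ret:1
[ 3404.341778] migrating pfn c06f176 failed ret:1
"""

for pfn, n in tally_failures(sample).items():
    print(f"pfn {pfn:#x}: {n} failures")
```

A single pfn in the output, as here, would match what the dump shows; several distinct pfns would point to a broader problem than one stuck page.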