Message-ID: <20230915141610.GA104956@cmpxchg.org>
Date: Fri, 15 Sep 2023 10:16:10 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Mike Kravetz <mike.kravetz@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Mel Gorman <mgorman@...hsingularity.net>,
Miaohe Lin <linmiaohe@...wei.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Zi Yan <ziy@...dia.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene

On Thu, Sep 14, 2023 at 04:52:38PM -0700, Mike Kravetz wrote:
> In next-20230913, I started hitting the following BUG. Seems related
> to this series. And, if series is reverted I do not see the BUG.
>
> I can easily reproduce on a small 16G VM. kernel command line contains
> "hugetlb_free_vmemmap=on hugetlb_cma=4G". Then run the script,
> while true; do
> 	echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
> 	echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/demote
> 	echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> done
>
> For the BUG below I believe it was the first (or second) 1G page creation from
> CMA that triggered: cma_alloc of 1G.
>
> Sorry, have not looked deeper into the issue.

Thanks for the report, and sorry about the breakage!

I was scratching my head at this:

	/* MIGRATE_ISOLATE page should not go to pcplists */
	VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);

because there is nothing in page isolation that prevents setting
MIGRATE_ISOLATE on something that's on the pcplist already. So why
didn't this trigger before?

Then it clicked: it used to only check the *pcpmigratetype* determined
by free_unref_page(), which of course mustn't be MIGRATE_ISOLATE.

Pages that get isolated while *already* on the pcplist are fine, and
are handled properly:

	mt = get_pcppage_migratetype(page);
	/* MIGRATE_ISOLATE page should not go to pcplists */
	VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
	/* Pageblock could have been isolated meanwhile */
	if (unlikely(isolated_pageblocks))
		mt = get_pageblock_migratetype(page);

So this was purely a sanity check against the pcpmigratetype cache
operations. With that gone, we can remove it.
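
To make the difference between the two checks concrete, here is a
minimal userspace sketch. It is not kernel code: the types and helpers
(struct pageblock, model_free_to_pcplist(), model_isolation_race() and
so on) are hypothetical stand-ins I made up for illustration, and the
model assumes only what is described above, namely that the old assert
looked at a type cached at free time while the changed assert looked at
the pageblock's current type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel's migratetypes. */
enum model_migratetype { MODEL_MOVABLE, MODEL_ISOLATE };

struct pageblock {
	enum model_migratetype mt;	/* current pageblock migratetype */
};

struct page {
	struct pageblock *block;	/* pageblock this page belongs to */
	enum model_migratetype cached_mt; /* models the old pcppage cache */
	bool on_pcplist;
};

/* Models the free path: isolated pages never enter the pcplist. */
static bool model_free_to_pcplist(struct page *page)
{
	enum model_migratetype mt = page->block->mt;

	if (mt == MODEL_ISOLATE)
		return false;	/* would bypass the pcplist entirely */

	page->cached_mt = mt;	/* type recorded at free time */
	page->on_pcplist = true;
	return true;
}

/* Old assert: only the type cached at free time is inspected. */
static bool old_check_passes(const struct page *page)
{
	return page->cached_mt != MODEL_ISOLATE;
}

/* Changed assert: the pageblock's *current* type is inspected. */
static bool new_check_passes(const struct page *page)
{
	return page->block->mt != MODEL_ISOLATE;
}

struct check_result {
	bool old_check_passes;
	bool new_check_passes;
};

/* Free a page to the pcplist, then isolate its pageblock afterwards. */
struct check_result model_isolation_race(void)
{
	struct pageblock block = { .mt = MODEL_MOVABLE };
	struct page page = { .block = &block };
	struct check_result res;
	bool queued = model_free_to_pcplist(&page);

	assert(queued && page.on_pcplist);

	/* Isolation happens while the page already sits on the pcplist. */
	block.mt = MODEL_ISOLATE;

	res.old_check_passes = old_check_passes(&page);
	res.new_check_passes = new_check_passes(&page);
	return res;
}
```

In this model, model_isolation_race() reports that the old check still
passes while the changed check fails, which matches the legitimate
sequence described above being flagged as a bug.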

---

From b0cb92ed10b40fab0921002effa8b726df245790 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@...xchg.org>
Date: Fri, 15 Sep 2023 09:59:52 -0400
Subject: [PATCH] mm: page_alloc: remove pcppage migratetype caching fix

Mike reports the following crash in -next:

[ 28.643019] page:ffffea0004fb4280 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x13ed0a
[ 28.645455] flags: 0x200000000000000(node=0|zone=2)
[ 28.646835] page_type: 0xffffffff()
[ 28.647886] raw: 0200000000000000 dead000000000100 dead000000000122 0000000000000000
[ 28.651170] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 28.653124] page dumped because: VM_BUG_ON_PAGE(is_migrate_isolate(mt))
[ 28.654769] ------------[ cut here ]------------
[ 28.655972] kernel BUG at mm/page_alloc.c:1231!

This VM_BUG_ON() used to check that the cached pcppage_migratetype set
by free_unref_page() wasn't MIGRATE_ISOLATE.

When I removed the caching, I erroneously changed the assert to check
that no isolated pages are on the pcplist. This is quite different,
because pages can be isolated *after* they have been put on the
pcplist already (which is handled just fine).

IOW, this was purely a sanity check on the migratetype caching. With
that gone, the check should have been removed as well. Do that now.

Reported-by: Mike Kravetz <mike.kravetz@...cle.com>
Signed-off-by: Johannes Weiner <hannes@...xchg.org>
---
 mm/page_alloc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e3f1c777feed..9469e4660b53 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1207,9 +1207,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 			count -= nr_pages;
 			pcp->count -= nr_pages;
 
-			/* MIGRATE_ISOLATE page should not go to pcplists */
-			VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
-
 			__free_one_page(page, pfn, zone, order, mt, FPI_NONE);
 			trace_mm_page_pcpu_drain(page, order, mt);
 		} while (count > 0 && !list_empty(list));
--
2.42.0