Message-ID: <20230914235238.GB129171@monkey>
Date: Thu, 14 Sep 2023 16:52:38 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Mel Gorman <mgorman@...hsingularity.net>,
Miaohe Lin <linmiaohe@...wei.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Zi Yan <ziy@...dia.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene
In next-20230913, I started hitting the following BUG. It seems related
to this series: if the series is reverted, I no longer see the BUG.
I can easily reproduce it on a small 16G VM. The kernel command line contains
"hugetlb_free_vmemmap=on hugetlb_cma=4G". Then run the script:
while true; do
	echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
	echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/demote
	echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
done
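Since the reproducer depends on both boot parameters being present, a quick pre-flight check may save a wasted run. The following is only a sketch: the helper names has_param and check_cmdline are hypothetical, and it assumes the parameters appear on the command line exactly as quoted above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original report.

# has_param CMDLINE PARAM: succeed if PARAM appears as a whole
# space-separated word in CMDLINE.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_cmdline CMDLINE: print "ok" if both parameters from the report
# are present, otherwise print the missing ones and return non-zero.
check_cmdline() {
    cmdline=$1
    missing=""
    for p in hugetlb_free_vmemmap=on hugetlb_cma=4G; do
        has_param "$cmdline" "$p" || missing="$missing $p"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

On a live system this could be invoked as check_cmdline "$(cat /proc/cmdline)" before starting the loop.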
For the BUG below, I believe it was the first (or second) 1G page creation from
CMA that triggered it: a cma_alloc of 1G.
Sorry, I have not looked deeper into the issue.
[ 28.643019] page:ffffea0004fb4280 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x13ed0a
[ 28.645455] flags: 0x200000000000000(node=0|zone=2)
[ 28.646835] page_type: 0xffffffff()
[ 28.647886] raw: 0200000000000000 dead000000000100 dead000000000122 0000000000000000
[ 28.651170] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 28.653124] page dumped because: VM_BUG_ON_PAGE(is_migrate_isolate(mt))
[ 28.654769] ------------[ cut here ]------------
[ 28.655972] kernel BUG at mm/page_alloc.c:1231!
[ 28.657139] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 28.658354] CPU: 2 PID: 885 Comm: bash Not tainted 6.6.0-rc1-next-20230913+ #3
[ 28.660090] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
[ 28.662054] RIP: 0010:free_pcppages_bulk+0x192/0x240
[ 28.663284] Code: 22 48 89 45 08 8b 44 24 0c 41 29 44 24 04 41 29 c6 41 83 f8 05 0f 85 4c ff ff ff 48 c7 c6 20 a5 22 82 48 89 df e8 4e cf fc ff <0f> 0b 65 8b 05 41 8b d3 7e 89 c0 48 0f a3 05 fb 35 39 01 0f 83 40
[ 28.667422] RSP: 0018:ffffc90003b9faf0 EFLAGS: 00010046
[ 28.668643] RAX: 000000000000003b RBX: ffffea0004fb4280 RCX: 0000000000000000
[ 28.670245] RDX: 0000000000000000 RSI: ffffffff8224dace RDI: 00000000ffffffff
[ 28.671920] RBP: ffffea0004fb4288 R08: 0000000000009ffb R09: 00000000ffffdfff
[ 28.673614] R10: 00000000ffffdfff R11: ffffffff824660c0 R12: ffff888477c30540
[ 28.675213] R13: ffff888477c30550 R14: 00000000000012f5 R15: 000000000013ed0a
[ 28.676832] FS: 00007f60039b9740(0000) GS:ffff888477c00000(0000) knlGS:0000000000000000
[ 28.678709] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 28.680046] CR2: 00005615f9bf3048 CR3: 00000003128b6005 CR4: 0000000000370ee0
[ 28.682897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 28.684501] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 28.686098] Call Trace:
[ 28.686792] <TASK>
[ 28.687414] ? die+0x32/0x80
[ 28.688197] ? do_trap+0xd6/0x100
[ 28.689069] ? free_pcppages_bulk+0x192/0x240
[ 28.690135] ? do_error_trap+0x6a/0x90
[ 28.691082] ? free_pcppages_bulk+0x192/0x240
[ 28.692187] ? exc_invalid_op+0x49/0x60
[ 28.693154] ? free_pcppages_bulk+0x192/0x240
[ 28.694225] ? asm_exc_invalid_op+0x16/0x20
[ 28.695291] ? free_pcppages_bulk+0x192/0x240
[ 28.696405] drain_pages_zone+0x3f/0x50
[ 28.697404] __drain_all_pages+0xe2/0x1e0
[ 28.698472] alloc_contig_range+0x143/0x280
[ 28.699581] ? bitmap_find_next_zero_area_off+0x3d/0x90
[ 28.700902] cma_alloc+0x156/0x470
[ 28.701852] ? kernfs_fop_write_iter+0x160/0x1f0
[ 28.703053] alloc_fresh_hugetlb_folio+0x7e/0x270
[ 28.704272] alloc_pool_huge_page+0x7d/0x100
[ 28.705448] set_max_huge_pages+0x162/0x390
[ 28.706530] nr_hugepages_store_common+0x91/0xf0
[ 28.707689] kernfs_fop_write_iter+0x108/0x1f0
[ 28.708819] vfs_write+0x207/0x400
[ 28.709743] ksys_write+0x63/0xe0
[ 28.710640] do_syscall_64+0x37/0x90
[ 28.712649] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 28.713919] RIP: 0033:0x7f6003aade87
[ 28.714879] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 28.719096] RSP: 002b:00007ffdfd9d2e98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 28.720945] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f6003aade87
[ 28.722626] RDX: 0000000000000002 RSI: 00005615f9bac620 RDI: 0000000000000001
[ 28.724288] RBP: 00005615f9bac620 R08: 000000000000000a R09: 00007f6003b450c0
[ 28.725939] R10: 00007f6003b44fc0 R11: 0000000000000246 R12: 0000000000000002
[ 28.727611] R13: 00007f6003b81520 R14: 0000000000000002 R15: 00007f6003b81720
[ 28.729285] </TASK>
[ 28.729944] Modules linked in: rfkill ip6table_filter ip6_tables sunrpc snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_seq 9p snd_seq_device netfs joydev snd_pcm snd_timer 9pnet_virtio snd soundcore virtio_balloon 9pnet virtio_console virtio_net virtio_blk net_failover failover crct10dif_pclmul crc32_pclmul crc32c_intel virtio_pci ghash_clmulni_intel serio_raw virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring fuse
[ 28.739325] ---[ end trace 0000000000000000 ]---
--
Mike Kravetz