Message-ID: <ZXxn/0oixJxxAnpF@casper.infradead.org>
Date: Fri, 15 Dec 2023 14:51:43 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: akpm@...ux-foundation.org, david@...hat.com, ying.huang@...el.com,
ziy@...dia.com, xuyu@...ux.alibaba.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: migrate: fix getting incorrect page mapping during
 page migration

On Fri, Dec 15, 2023 at 08:07:52PM +0800, Baolin Wang wrote:
> When running stress-ng testing, we found below kernel crash after a few hours:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> pc : dentry_name+0xd8/0x224
> lr : pointer+0x22c/0x370
> sp : ffff800025f134c0
> ......
> Call trace:
> dentry_name+0xd8/0x224
> pointer+0x22c/0x370
> vsnprintf+0x1ec/0x730
> vscnprintf+0x2c/0x60
> vprintk_store+0x70/0x234
> vprintk_emit+0xe0/0x24c
> vprintk_default+0x3c/0x44
> vprintk_func+0x84/0x2d0
> printk+0x64/0x88
> __dump_page+0x52c/0x530
> dump_page+0x14/0x20
[...]
> There are several ways to fix this issue:
> (1) Setting the PAGE_MAPPING_ANON flag for target page's ->mapping when saving
> 'anon_vma', but this can confuse PageAnon() for PFN walkers, since the target
> page has not built mappings yet.
> (2) Getting the page lock to call page_mapping() in __dump_page() to avoid crashing
> the system, however, there are still some PFN walkers that call page_mapping()
> without holding the page lock, such as compaction.
> (3) Using target page->private field to save the 'anon_vma' pointer and 2 bits
> page state, just as page->mapping records an anonymous page, which can remove
> the page_mapping() impact for PFN walkers and also seems a simple way.
>
> So I choose option 3 to fix this issue, and this can also fix other potential
> issues for PFN walkers, such as compaction.

I'm not saying no to this fix, but dump_mapping() is supposed to be
resilient against this.  Is the issue that 'dentry' is NULL, or is it
some field within dentry that is NULL?  eg, would this fix your
case?

+++ b/fs/inode.c
@@ -588,7 +588,7 @@ void dump_mapping(const struct address_space *mapping)
 	}
 	dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
-	if (get_kernel_nofault(dentry, dentry_ptr)) {
+	if (get_kernel_nofault(dentry, dentry_ptr) || !dentry) {
 		pr_warn("aops:%ps ino:%lx invalid dentry:%px\n",
 				a_ops, ino, dentry_ptr);
 		return;

Just to be clear, I think we should fix both the dumping and the migration
code.
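
For readers following option (3) in the quoted description, here is a
rough userspace sketch of keeping a pointer and 2 bits of state in one
word.  It is not the patch itself.  Every name in it (anon_vma_stub,
DEMO_FLAG_A, DEMO_FLAG_B, pack_private() and friends) is hypothetical,
and it assumes the pointed-to object is at least 4-byte aligned so the
low two bits of its address are free.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state flags kept in the low 2 bits of the word. */
#define DEMO_FLAG_A	((uintptr_t)0x1)
#define DEMO_FLAG_B	((uintptr_t)0x2)
#define DEMO_STATE_MASK	(DEMO_FLAG_A | DEMO_FLAG_B)

/* Stand-in for the real structure; only its alignment matters here. */
struct anon_vma_stub {
	long dummy;
};

/* Combine an aligned pointer and up to 2 bits of state into one word. */
static uintptr_t pack_private(struct anon_vma_stub *av, uintptr_t state)
{
	assert(((uintptr_t)av & DEMO_STATE_MASK) == 0);	/* low bits free */
	assert((state & ~DEMO_STATE_MASK) == 0);	/* state fits */
	return (uintptr_t)av | state;
}

/* Recover the pointer by masking off the state bits. */
static struct anon_vma_stub *unpack_ptr(uintptr_t priv)
{
	return (struct anon_vma_stub *)(priv & ~DEMO_STATE_MASK);
}

/* Recover the state bits. */
static uintptr_t unpack_state(uintptr_t priv)
{
	return priv & DEMO_STATE_MASK;
}

/* Round-trip check; returns 0 when pointer and state both survive. */
static int demo(void)
{
	static struct anon_vma_stub av;
	uintptr_t priv = pack_private(&av, DEMO_FLAG_A | DEMO_FLAG_B);

	if (unpack_ptr(priv) != &av)
		return 1;
	if (unpack_state(priv) != (DEMO_FLAG_A | DEMO_FLAG_B))
		return 2;
	return 0;
}
```

The point of the sketch is only that the saved word lives in a separate
field, so nothing that inspects ->mapping sees a half-built value.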