Message-ID: <20220509071447.GA123646@hori.linux.bs1.fc.nec.co.jp>
Date:   Mon, 9 May 2022 07:14:49 +0000
From:   HORIGUCHI NAOYA(堀口 直也) 
        <naoya.horiguchi@....com>
To:     Yang Shi <shy828301@...il.com>
CC:     Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
        Linux MM <linux-mm@...ck.org>,
        Matthew Wilcox <willy@...radead.org>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Christoph Hellwig <hch@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        John Hubbard <jhubbard@...dia.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        William Kucharski <william.kucharski@...cle.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: ##freemail## Re: [PATCH v2] mm/hwpoison: use pr_err() instead of
 dump_page() in get_any_page()

On Thu, Apr 28, 2022 at 10:25:33AM -0700, Yang Shi wrote:
> On Tue, Apr 26, 2022 at 10:32 PM Naoya Horiguchi
> <naoya.horiguchi@...ux.dev> wrote:
> >
> > From: Naoya Horiguchi <naoya.horiguchi@....com>
> >
> > The following VM_BUG_ON_FOLIO() is triggered when a memory error event
> > happens on (thp/folio) pages which are about to be freed:
> >
> >   [ 1160.232771] page:00000000b36a8a0f refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
> >   [ 1160.236916] page:00000000b36a8a0f refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
> >   [ 1160.240684] flags: 0x57ffffc0800000(hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
> >   [ 1160.243458] raw: 0057ffffc0800000 dead000000000100 dead000000000122 0000000000000000
> >   [ 1160.246268] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
> >   [ 1160.249197] page dumped because: VM_BUG_ON_FOLIO(!folio_test_large(folio))
> >   [ 1160.251815] ------------[ cut here ]------------
> >   [ 1160.253438] kernel BUG at include/linux/mm.h:788!
> >   [ 1160.256162] invalid opcode: 0000 [#1] PREEMPT SMP PTI
> >   [ 1160.258172] CPU: 2 PID: 115368 Comm: mceinj.sh Tainted: G            E     5.18.0-rc1-v5.18-rc1-220404-2353-005-g83111+ #3
> >   [ 1160.262049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
> >   [ 1160.265103] RIP: 0010:dump_page.cold+0x27e/0x2bd
> >   [ 1160.266757] Code: fe ff ff 48 c7 c6 81 f1 5a 98 e9 4c fe ff ff 48 c7 c6 a1 95 59 98 e9 40 fe ff ff 48 c7 c6 50 bf 5a 98 48 89 ef e8 9d 04 6d ff <0f> 0b 41 f7 c4 ff 0f 00 00 0f 85 9f fd ff ff 49 8b 04 24 a9 00 00
> >   [ 1160.273180] RSP: 0018:ffffaa2c4d59fd18 EFLAGS: 00010292
> >   [ 1160.274969] RAX: 000000000000003e RBX: 0000000000000001 RCX: 0000000000000000
> >   [ 1160.277263] RDX: 0000000000000001 RSI: ffffffff985995a1 RDI: 00000000ffffffff
> >   [ 1160.279571] RBP: ffffdc9c45a80000 R08: 0000000000000000 R09: 00000000ffffdfff
> >   [ 1160.281794] R10: ffffaa2c4d59fb08 R11: ffffffff98940d08 R12: ffffdc9c45a80000
> >   [ 1160.283920] R13: ffffffff985b6f94 R14: 0000000000000000 R15: ffffdc9c45a80000
> >   [ 1160.286641] FS:  00007eff54ce1740(0000) GS:ffff99c67bd00000(0000) knlGS:0000000000000000
> >   [ 1160.289498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >   [ 1160.291106] CR2: 00005628381a5f68 CR3: 0000000104712003 CR4: 0000000000170ee0
> >   [ 1160.293031] Call Trace:
> >   [ 1160.293724]  <TASK>
> >   [ 1160.294334]  get_hwpoison_page+0x47d/0x570
> >   [ 1160.295474]  memory_failure+0x106/0xaa0
> >   [ 1160.296474]  ? security_capable+0x36/0x50
> >   [ 1160.297524]  hard_offline_page_store+0x43/0x80
> >   [ 1160.298684]  kernfs_fop_write_iter+0x11c/0x1b0
> >   [ 1160.299829]  new_sync_write+0xf9/0x160
> >   [ 1160.300810]  vfs_write+0x209/0x290
> >   [ 1160.301835]  ksys_write+0x4f/0xc0
> >   [ 1160.302718]  do_syscall_64+0x3b/0x90
> >   [ 1160.303664]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >   [ 1160.304981] RIP: 0033:0x7eff54b018b7
> >
> > As the RIP address shows, this VM_BUG_ON in folio_entire_mapcount() is
> > reached from dump_page("hwpoison: unhandlable page") in get_any_page().
> > The race works as follows:
> >
> >   CPU 0                                       CPU 1
> >
> >     memory_failure
> >       get_hwpoison_page
> >         get_any_page
> >           dump_page
> >             compound = PageCompound
> >                                                 free_pages_prepare
> >                                                   page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP
> >             folio_entire_mapcount
> >               VM_BUG_ON_FOLIO(!folio_test_large(folio))
> >
> > So replace dump_page() with a safer alternative, pr_err().
> >
> > Fixes: 74e8ee4708a8 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@....com>
> > ---
> > ChangeLog v1 -> v2:
> > - v1: https://lore.kernel.org/linux-mm/20220414235950.840409-1-naoya.horiguchi@linux.dev/T/#u
> > - update caller side instead of changing dump_page().
> > ---
> >  mm/memory-failure.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 35e11d6bea4a..0e1453514a2b 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1270,7 +1270,7 @@ static int get_any_page(struct page *p, unsigned long flags)
> >         }
> >  out:
> >         if (ret == -EIO)
> > -               dump_page(p, "hwpoison: unhandlable page");
> > +               pr_err("Memory failure: %#lx: unhandlable page.\n", page_to_pfn(p));
> 
> I think dump_page() is helpful because it tells users more about the
> unhandlable page. I'm OK with this fix for now, but should we consider
> adding a memory-failure-safe dump_page() in the future?

Yes, that could be helpful beyond this unhandlable-page case, so it sounds
good to me.  But how would we handle the folio case?  I'm also not sure that
the full info from dump_page() is needed in a memory_failure-specific variant.

Thanks,
Naoya Horiguchi
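
To make the race and the "memory failure safe" reporting idea discussed above more concrete, here is a minimal userspace sketch. It is not kernel code and is not taken from the patch or from any proposed implementation: struct fake_page, struct fake_page_snapshot, snapshot_page(), format_page_report() and the FAKE_PG_* flags are all hypothetical names made up for illustration. The idea it shows is to read the racy fields once into a local snapshot and to build the report from that snapshot only, without assertions, so a concurrent free that clears the flags cannot make a later check disagree with an earlier one.

```c
/* Userspace illustration only; every name here is hypothetical. */
#include <stdio.h>
#include <stddef.h>
#include <stdatomic.h>

#define FAKE_PG_HEAD     (1UL << 0)  /* stands in for a "large page" flag */
#define FAKE_PG_HWPOISON (1UL << 1)  /* stands in for the hwpoison flag */

/* A page-like object whose flags another thread may clear at any time. */
struct fake_page {
	_Atomic unsigned long flags;
	_Atomic int refcount;
	unsigned long pfn;
};

/* Values captured once, so later decisions cannot contradict each other. */
struct fake_page_snapshot {
	unsigned long flags;
	int refcount;
	unsigned long pfn;
};

/* Read each racy field exactly once. */
static struct fake_page_snapshot snapshot_page(struct fake_page *p)
{
	struct fake_page_snapshot s;

	s.flags = atomic_load(&p->flags);
	s.refcount = atomic_load(&p->refcount);
	s.pfn = p->pfn;
	return s;
}

/*
 * Build the report from the snapshot only. The live page is never
 * re-read and nothing is asserted, so a concurrent "free" that clears
 * the flags cannot trip a check in this function.
 */
static int format_page_report(const struct fake_page_snapshot *s,
			      char *buf, size_t len)
{
	return snprintf(buf, len,
			"Memory failure: %#lx: unhandlable page (flags=%#lx refcount=%d%s)",
			s->pfn, s->flags, s->refcount,
			(s->flags & FAKE_PG_HEAD) ? " large" : "");
}
```

A caller would take the snapshot first and then format it; even if the flags are cleared in between, the report stays self-consistent. Whether a real memory_failure-specific helper should print this much, or how it should treat folios, is left open in the discussion above.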
