Message-ID: <alpine.LRH.2.02.2009040402560.14993@file01.intranet.prod.int.rdu2.redhat.com>
Date: Fri, 4 Sep 2020 04:08:26 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: Peter Xu <peterx@...hat.com>, Jann Horn <jannh@...gle.com>,
Christoph Hellwig <hch@....de>,
Oleg Nesterov <oleg@...hat.com>,
Kirill Shutemov <kirill@...temov.name>,
Jan Kara <jack@...e.cz>,
Andrea Arcangeli <aarcange@...hat.com>,
Matthew Wilcox <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Dan Williams <dan.j.williams@...el.com>,
Linux-MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-nvdimm <linux-nvdimm@...ts.01.org>
Subject: Re: a crash when running strace from persistent memory
On Thu, 3 Sep 2020, Linus Torvalds wrote:
> On Thu, Sep 3, 2020 at 12:24 PM Mikulas Patocka <mpatocka@...hat.com> wrote:
> >
> > There's a bug when you run strace from dax-based filesystem.
> >
> > -- create real or emulated persistent memory device (/dev/pmem0)
> > mkfs.ext2 /dev/pmem0
> > -- mount it
> > mount -t ext2 -o dax /dev/pmem0 /mnt/test
> > -- copy the system to it (well, you can copy just a few files that are
> > needed for running strace and ls)
> > cp -ax / /mnt/test
> > -- bind the system directories
> > mount --bind /dev /mnt/test/dev
> > mount --bind /proc /mnt/test/proc
> > mount --bind /sys /mnt/test/sys
> > -- run strace on the ls command
> > chroot /mnt/test/ strace /bin/ls
> >
> > You get this warning and ls is killed with SIGSEGV.
> >
> > I bisected the problem and it is caused by the commit
> > 17839856fd588f4ab6b789f482ed3ffd7c403e1f (gup: document and work around
> > "COW can break either way" issue). When I revert the patch (on the kernel
> > 5.9-rc3), the bug goes away.
>
> Funky. I really don't see how it could cause that, but we have the
> UFFD issue too, so I'm guessing I will have to fix it the radical way
> with Peter Xu's series based on my "rip out COW special cases" patch.
>
> Or maybe I'm just using that as an excuse for really wanting to apply
> that series.. Because we can't just revert that GUP commit due to
> security concerns.
>
> > [ 84.191504] WARNING: CPU: 6 PID: 1350 at mm/memory.c:2486 wp_page_copy.cold+0xdb/0xf6
>
> I'm assuming this is the WARN_ON_ONCE(1) on line 2482, and you have
> some extra debug patch that causes that line to be off by 4? Because
> at least for me, line 2486 is actually an empty line in v5.9-rc3.

Yes, that's it. I added a few printk() calls to trace the control flow.

> That said, I really think this is a pre-existing race, and all the
> "COW can break either way" patch does is change the timing (presumably
> due to the actual pattern of actually doing the COW changing).
>
> See commit c3e5ea6ee574 ("mm: avoid data corruption on CoW fault into
> PFN-mapped VMA") for background.
>
> Mikulas, can you check that everything works ok for that case if you
> apply Peter's series? See
>
> https://lore.kernel.org/lkml/20200821234958.7896-1-peterx@redhat.com/

I applied these four patches and strace works well. There is no longer any
warning or crash.

Mikulas

> or if you have 'b4' installed, use
>
> b4 am 20200821234958.7896-1-peterx@...hat.com
>
> to get the series..
>
> Linus
>