Message-ID: <Y1Mh2S7fUGQ/iKFR@iweiny-desk3>
Date: Fri, 21 Oct 2022 15:48:57 -0700
From: Ira Weiny <ira.weiny@...el.com>
To: Andrew Morton <akpm@...ux-foundation.org>
CC: Matthew Wilcox <willy@...radead.org>,
kernel test robot <yujie.liu@...el.com>,
"Fabio M. De Francesco" <fmdefrancesco@...il.com>,
<lkp@...ts.01.org>, <lkp@...el.com>,
<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
Peter Xu <peterx@...hat.com>
Subject: Re: [shmem] 7a7256d5f5: WARNING:possible_recursive_locking_detected
On Fri, Oct 21, 2022 at 01:30:41PM -0700, Andrew Morton wrote:
> On Fri, 21 Oct 2022 14:09:16 +0100 Matthew Wilcox <willy@...radead.org> wrote:
>
> > On Fri, Oct 21, 2022 at 12:10:17PM +0800, kernel test robot wrote:
> > > FYI, we noticed WARNING:possible_recursive_locking_detected due to commit (built with gcc-11):
> > >
> > > commit: 7a7256d5f512b6c17957df7f59cf5e281b3ddba3 ("shmem: convert shmem_mfill_atomic_pte() to use a folio")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > Ummm. Looks to me like this now occurs because of this part of the
> > change:
> >
> >  	if (!zeropage) {	/* COPY */
> > -		page_kaddr = kmap_atomic(page);
> > +		page_kaddr = kmap_local_folio(folio, 0);
> >  		ret = copy_from_user(page_kaddr,
> >  				     (const void __user *)src_addr,
> >  				     PAGE_SIZE);
> > -		kunmap_atomic(page_kaddr);
> > +		kunmap_local(page_kaddr);
> >
> > Should I be using __copy_from_user_inatomic() here?
I would say not. I was curious myself why copy_from_user() was safe here in the
first place (or at least why it did not trip the checkers before). :-/
>
> Caller __mcopy_atomic() is holding mmap_read_lock(dst_mm) and this
> copy_from_user() calls
> might_fault()->might_lock_read(current->mm->mmap_lock).
>
> And I guess might_lock_read() gets upset because we're holding another
> mm's mmap_lock. Which sounds OK to me, unless a concurrent
> mmap_write_lock() could jam things up.
>
> But I cannot see why your patch would suddenly trigger this warning -
> kmap_local_folio() and kmap_atomic() are basically the same thing.
It is related to your patch, but I think what you did made sense on the surface:
copy_from_user() should not require page faults to be disabled. But that side
effect of kmap_atomic() was being relied on here, because it looks like the code
is designed to fall back if the fault is not allowed:[1]
mm/shmem.c
...
	page_kaddr = kmap_local_folio(folio, 0);
	ret = copy_from_user(page_kaddr,
			     (const void __user *)src_addr,
			     PAGE_SIZE);
	kunmap_local(page_kaddr);
	/* fallback to copy_from_user outside mmap_lock */
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	if (unlikely(ret)) {
		*pagep = &folio->page;
		ret = -ENOENT;
		/* don't free the page */
		goto out_unacct_blocks;
	}
...
So this is one of those rare places where the kmap_atomic() side effects were
being depended on... :-(
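To make that side effect concrete, here is a rough sketch (my paraphrase of the
!CONFIG_HIGHMEM paths in include/linux/highmem-internal.h, not the verbatim
source) of what the two mapping calls do:

	/* kmap_atomic(), roughly, in the !CONFIG_HIGHMEM case */
	static inline void *kmap_atomic(struct page *page)
	{
		if (IS_ENABLED(CONFIG_PREEMPT_RT))
			migrate_disable();
		else
			preempt_disable();
		pagefault_disable();	/* <-- the side effect relied on above */
		return page_address(page);
	}

	/* kmap_local_folio(), roughly, in the same configuration */
	static inline void *kmap_local_folio(struct folio *folio, size_t offset)
	{
		/* no pagefault_disable(), no preempt_disable() */
		return page_address(&folio->page) + offset;
	}

So after the conversion nothing disables page faults around the copy,
might_fault() no longer short-circuits, and lockdep starts checking the
mmap_lock again.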
[1] might_fault() does not actually mean the code completes the fault.
mm/memory.c
...
void __might_fault(const char *file, int line)
{
	if (pagefault_disabled())
		return;
	...
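And to close the loop on why the fallback path still works once the fault is
suppressed, this is roughly (my paraphrase; see e.g. do_user_addr_fault() in
arch/x86/mm/fault.c) what the arch fault handler does:

	if (unlikely(faulthandler_disabled() || !mm)) {
		/*
		 * No sleeping fault handling allowed here; take the
		 * exception fixup path instead, which makes
		 * copy_from_user() return the number of bytes left
		 * uncopied -- exactly the unlikely(ret) condition the
		 * fallback above checks for.
		 */
		bad_area_nosemaphore(regs, error_code, address);
		return;
	}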
>
> I see that __mcopy_atomic() is using plain old kmap(), perhaps to work
> around this? But that's 2015 code and I'm not sure we had such
> detailed lock checking in those days.
No, kmap() can't work around this. That code works because the lock is released
just above it:
mm/userfaultfd.c
...
mmap_read_unlock(dst_mm);
BUG_ON(!page);
page_kaddr = kmap(page);
err = copy_from_user(page_kaddr,
(const void __user *) src_addr,
PAGE_SIZE);
kunmap(page);
...
So I think the correct solution is below, because what we actually want is to
prevent the page fault (and take the fallback) while the mmap_lock is held.
Ira
diff --git a/mm/shmem.c b/mm/shmem.c
index 8280a5cb48df..6c8e99bf5983 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2424,9 +2424,11 @@ int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 
 	if (!zeropage) {	/* COPY */
 		page_kaddr = kmap_local_folio(folio, 0);
+		pagefault_disable();
 		ret = copy_from_user(page_kaddr,
 				     (const void __user *)src_addr,
 				     PAGE_SIZE);
+		pagefault_enable();
 		kunmap_local(page_kaddr);
 		/* fallback to copy_from_user outside mmap_lock */
 		if (unlikely(ret)) {
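FWIW, this is also why I would not use __copy_from_user_inatomic() here: if I'm
reading include/linux/uaccess.h correctly (so take this as my understanding, not
gospel), the inatomic variant skips the access_ok() check as well as the
might_fault() annotation, whereas pagefault_disable() around plain
copy_from_user() keeps access_ok() and only short-circuits __might_fault() per
[1].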