[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wh03Qx6zNS_yhhsf5gPah=2=mi7+dKMNCVrKhw6+894ag@mail.gmail.com>
Date: Wed, 11 Sep 2019 10:47:59 +0100
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Peter Xu <peterx@...hat.com>
Cc: Linux-MM <linux-mm@...ck.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
David Hildenbrand <david@...hat.com>,
Hugh Dickins <hughd@...gle.com>,
Maya Gokhale <gokhale2@...l.gov>,
Jerome Glisse <jglisse@...hat.com>,
Pavel Emelyanov <xemul@...tuozzo.com>,
Johannes Weiner <hannes@...xchg.org>,
Martin Cracauer <cracauer@...s.org>,
Marty McFadden <mcfadden8@...l.gov>, Shaohua Li <shli@...com>,
Andrea Arcangeli <aarcange@...hat.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Denis Plotnikov <dplotnikov@...tuozzo.com>,
Mike Rapoport <rppt@...ux.vnet.ibm.com>,
Mel Gorman <mgorman@...e.de>,
"Kirill A . Shutemov" <kirill@...temov.name>,
"Dr . David Alan Gilbert" <dgilbert@...hat.com>
Subject: Re: [PATCH v3 7/7] mm/gup: Allow VM_FAULT_RETRY for multiple times
On Wed, Sep 11, 2019 at 8:11 AM Peter Xu <peterx@...hat.com> wrote:
>
> This is the gup counterpart of the change that allows the
> VM_FAULT_RETRY to happen for more than once. One thing to mention is
> that we must check the fatal signal here before retry because the GUP
> can be interrupted by that, otherwise we can loop forever.
I still get nervous about the signal handling here.
I'm not entirely sure we get it right even before your patch series.
Right now, __get_user_pages() can return -ERESTARTSYS when it's killed:
/*
* If we have a pending SIGKILL, don't keep faulting pages and
* potentially allocating memory.
*/
if (fatal_signal_pending(current)) {
ret = -ERESTARTSYS;
goto out;
}
and I don't think your series changes that. And note how this is true
_regardless_ of any FOLL_xyz flags (and we don't pass the
FAULT_FLAG_xyz flags at all, they get generated deeper down if we
actually end up faulting things in).
So this part of the patch:
+ if (fatal_signal_pending(current))
+ goto out;
+
*locked = 1;
- lock_dropped = true;
down_read(&mm->mmap_sem);
ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED,
- pages, NULL, NULL);
+ pages, NULL, locked);
+ if (!*locked) {
+ /* Continue to retry until we succeeded */
+ BUG_ON(ret != 0);
+ goto retry;
just makes me go "that can't be right". The fatal_signal_pending() is
pointless and would probably better be something like
if (down_read_killable(&mm->mmap_sem) < 0)
goto out;
and then _after_ calling __get_user_pages(), the whole "negative error
handling" should be more obvious.
The BUG_ON(ret != 0) makes me nervous, but it might be fine (I guess
the fatal signal handling has always been done before the lock is
released?).
But exactly *because* __get_user_pages() can already return on fatal
signals, I think it should also set FAULT_FLAG_KILLABLE when faulting
things in. I don't think it does right now, so it doesn't actually
necessarily check fatal signals in a timely manner (not _during_ the
fault, only _before_ it).
See what I'm reacting to?
And maybe I'm wrong. Maybe I misread the code, and your changes. And
at least some of these logic error predate your changes, I just was
hoping that since this whole "react to signals" is very much what your
patch series is working on, you'd look at this.
Ok?
Linus
Powered by blists - more mailing lists