linux-kernel - Re: [PATCH v2 4/6] mm: drop VMA lock before waiting for migration


Message-ID: <CAJuCfpGZvhBUdfNHojXwqZbspuhy0bstjT+-JMfwgmnqTnkoHA@mail.gmail.com>
Date:   Mon, 12 Jun 2023 09:07:38 -0700
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     akpm@...ux-foundation.org, willy@...radead.org, hannes@...xchg.org,
        mhocko@...e.com, josef@...icpanda.com, jack@...e.cz,
        ldufour@...ux.ibm.com, laurent.dufour@...ibm.com,
        michel@...pinasse.org, liam.howlett@...cle.com, jglisse@...gle.com,
        vbabka@...e.cz, minchan@...gle.com, dave@...olabs.net,
        punit.agrawal@...edance.com, lstoakes@...il.com, hdanton@...a.com,
        apopple@...dia.com, ying.huang@...el.com, david@...hat.com,
        yuzhao@...gle.com, dhowells@...hat.com, hughd@...gle.com,
        viro@...iv.linux.org.uk, brauner@...nel.org,
        pasha.tatashin@...een.com, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel-team@...roid.com
Subject: Re: [PATCH v2 4/6] mm: drop VMA lock before waiting for migration

On Mon, Jun 12, 2023 at 6:36 AM Peter Xu <peterx@...hat.com> wrote:
>
> On Fri, Jun 09, 2023 at 06:29:43PM -0700, Suren Baghdasaryan wrote:
> > On Fri, Jun 9, 2023 at 3:30 PM Suren Baghdasaryan <surenb@...gle.com> wrote:
> > >
> > > On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@...hat.com> wrote:
> > > >
> > > > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > > > lock was dropped while in handle_mm_fault().
> > > > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > > > there are no guarantees it was not freed.
> > > >
> > > > Then vma lock behaves differently from mmap read lock, am I right?  Can we
> > > > still make them match on behaviors, or there's reason not to do so?
> > >
> > > I think we could match their behavior by also dropping mmap_lock here
> > > when fault is handled under mmap_lock (!(fault->flags &
> > > FAULT_FLAG_VMA_LOCK)).
> > > I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> > > mmap_lock in do_page_fault(), so indeed, I might be able to use
> > > VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> > > instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> > > of reusing existing flags?
> > Sorry, I meant VM_FAULT_VMA_UNLOCKED, not FAULT_FLAG_VMA_LOCK in the
> > above reply.
> >
> > I took a closer look into using VM_FAULT_COMPLETED instead of
> > VM_FAULT_VMA_UNLOCKED but when we fall back from per-vma lock to
> > mmap_lock we need to retry with an indication that the per-vma lock
> > was dropped. Returning (VM_FAULT_RETRY | VM_FAULT_COMPLETED) to
> > indicate such state seems strange to me ("retry" and "complete" seem
>
> Not relevant to this migration patch, but for the whole idea I was thinking
> whether it should just work if we simply:
>
>         fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> -       vma_end_read(vma);
> +       if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> +               vma_end_read(vma);
>
> ?

Today, when we can't handle a page fault under per-vma locks, we return
VM_FAULT_RETRY, in which case the per-vma lock is dropped and the fault is
retried under mmap_lock. The condition you suggest above would not
drop the per-vma lock in the VM_FAULT_RETRY case and would break the
current fallback mechanism.
However, your suggestion gave me an idea. I could indicate that the per-vma
lock got dropped using the vmf structure (like Matthew suggested before),
and once handle_pte_fault(vmf) returns I could check whether it returned
VM_FAULT_RETRY while the per-vma lock is still held. If that happens, I can
call vma_end_read() before returning from __handle_mm_fault(). That
way, any time handle_mm_fault() returns VM_FAULT_RETRY, the per-vma lock
will already be released, so your condition in do_page_fault() will
work correctly. That would eliminate the need for a new
VM_FAULT_VMA_UNLOCKED flag. WDYT?
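
A rough userspace model of the protocol described above, only to make the
lock hand-off concrete. This is a sketch, not kernel code: every model_*
name, the vma_unlocked field, and the flag values are assumptions made up
for illustration, and the real vmf structure may end up tracking this
differently. The model assumes that VM_FAULT_COMPLETED is only ever
returned by a handler that already dropped the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for vm_fault_t bits; the values here are arbitrary. */
#define VM_FAULT_RETRY     0x1u
#define VM_FAULT_COMPLETED 0x2u

struct model_vma {
	int read_locked;		/* 1 while the per-vma read lock is held */
};

struct model_vmf {
	struct model_vma *vma;
	bool vma_unlocked;		/* hypothetical: set by whoever drops the lock */
	unsigned int pte_result;	/* what the simulated PTE handler returns */
	bool pte_drops_lock;		/* simulate a handler that drops the lock itself */
};

/* Stand-in for vma_end_read(); asserts to catch a double unlock. */
static void model_vma_end_read(struct model_vma *vma)
{
	assert(vma->read_locked == 1);
	vma->read_locked = 0;
}

/* Stand-in for handle_pte_fault(): may drop the lock, e.g. before waiting. */
static unsigned int model_handle_pte_fault(struct model_vmf *vmf)
{
	if (vmf->pte_drops_lock) {
		model_vma_end_read(vmf->vma);
		vmf->vma_unlocked = true;
	}
	return vmf->pte_result;
}

/* Stand-in for __handle_mm_fault(): RETRY never leaves with the lock held. */
static unsigned int model_handle_mm_fault(struct model_vmf *vmf)
{
	unsigned int ret = model_handle_pte_fault(vmf);

	if ((ret & VM_FAULT_RETRY) && !vmf->vma_unlocked) {
		model_vma_end_read(vmf->vma);
		vmf->vma_unlocked = true;
	}
	return ret;
}

/* Stand-in for the arch fault handler, using the condition from the diff. */
static unsigned int model_do_page_fault(struct model_vma *vma,
					unsigned int pte_result,
					bool pte_drops_lock)
{
	struct model_vmf vmf = {
		.vma = vma,
		.vma_unlocked = false,
		.pte_result = pte_result,
		.pte_drops_lock = pte_drops_lock,
	};
	unsigned int fault;

	vma->read_locked = 1;		/* pretend the per-vma lock was taken */
	fault = model_handle_mm_fault(&vmf);
	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
		model_vma_end_read(vma);
	return fault;
}
```

In this model the lock is released exactly once on every path: by the
caller on a plain success, by the handler when it drops the lock itself,
and by model_handle_mm_fault() when RETRY comes back with the lock still
held.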

>
> GUP may need more caution on NOWAIT, but vma lock is only in fault paths so
> IIUC it's fine?
>
> --
> Peter Xu
>