linux-kernel - Re: [PATCH v3 23/35] mm/mmap: prevent pagefault handler from racing with mmu

Message-ID: <CAJuCfpEywUsBxKW5DCHCa_XK45ewhnULia75zoZ9ehW9nsYAMA@mail.gmail.com>
Date:   Thu, 23 Feb 2023 12:29:58 -0800
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        akpm@...ux-foundation.org, michel@...pinasse.org,
        jglisse@...gle.com, mhocko@...e.com, vbabka@...e.cz,
        hannes@...xchg.org, mgorman@...hsingularity.net, dave@...olabs.net,
        willy@...radead.org, peterz@...radead.org, ldufour@...ux.ibm.com,
        paulmck@...nel.org, mingo@...hat.com, will@...nel.org,
        luto@...nel.org, songliubraving@...com, peterx@...hat.com,
        david@...hat.com, dhowells@...hat.com, hughd@...gle.com,
        bigeasy@...utronix.de, kent.overstreet@...ux.dev,
        punit.agrawal@...edance.com, lstoakes@...il.com,
        peterjung1337@...il.com, rientjes@...gle.com, chriscli@...gle.com,
        axelrasmussen@...gle.com, joelaf@...gle.com, minchan@...gle.com,
        rppt@...nel.org, jannh@...gle.com, shakeelb@...gle.com,
        tatashin@...gle.com, edumazet@...gle.com, gthelen@...gle.com,
        gurua@...gle.com, arjunroy@...gle.com, soheil@...gle.com,
        leewalsh@...gle.com, posk@...gle.com,
        michalechner92@...glemail.com, linux-mm@...ck.org,
        linux-arm-kernel@...ts.infradead.org,
        linuxppc-dev@...ts.ozlabs.org, x86@...nel.org,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v3 23/35] mm/mmap: prevent pagefault handler from racing
 with mmu_notifier registration

On Thu, Feb 23, 2023 at 12:06 PM Liam R. Howlett
<Liam.Howlett@...cle.com> wrote:
>
> * Suren Baghdasaryan <surenb@...gle.com> [230216 00:18]:
> > Page fault handlers might need to fire MMU notifications while a new
> > notifier is being registered. Modify mm_take_all_locks to write-lock all
> > VMAs and prevent this race with page fault handlers that would hold VMA
> > locks. VMAs are locked before i_mmap_rwsem and anon_vma to keep the same
> > locking order as in page fault handlers.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> > ---
> >  mm/mmap.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 00f8c5798936..801608726be8 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -3501,6 +3501,7 @@ static void vm_lock_mapping(struct mm_struct *mm, struct address_space *mapping)
> >   * of mm/rmap.c:
> >   *   - all hugetlbfs_i_mmap_rwsem_key locks (aka mapping->i_mmap_rwsem for
> >   *     hugetlb mapping);
> > + *   - all vmas marked locked
> >   *   - all i_mmap_rwsem locks;
> >   *   - all anon_vma->rwseml
> >   *
> > @@ -3523,6 +3524,13 @@ int mm_take_all_locks(struct mm_struct *mm)
> >
> >       mutex_lock(&mm_all_locks_mutex);
> >
> > +     mas_for_each(&mas, vma, ULONG_MAX) {
> > +             if (signal_pending(current))
> > +                     goto out_unlock;
> > +             vma_start_write(vma);
> > +     }
> > +
> > +     mas_set(&mas, 0);
> >       mas_for_each(&mas, vma, ULONG_MAX) {
> >               if (signal_pending(current))
> >                       goto out_unlock;
>
> Do we need a vma_end_write_all(mm) in the out_unlock unrolling?

We can't really do that because some VMAs might have been locked
before mm_take_all_locks() got called, and vma_end_write_all() would
unlock those as well. So we have to wait until the mmap write lock is
dropped and vma_end_write_all() is called from there. Getting a signal
while executing mm_take_all_locks() is probably not too common, so
leaving the VMAs locked a bit longer shouldn't pose a performance issue.
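
To illustrate the point above, here is a rough userspace sketch, assuming
the sequence-count scheme this series uses for VMA write locks: a VMA counts
as write-locked while its vm_lock_seq equals the mm's mm_lock_seq, and
vma_end_write_all() bumps the mm counter. The struct and *_model names are
hypothetical stand-ins, not the kernel code, and all real locking and memory
ordering is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for mm_struct and vm_area_struct. */
struct mm_model  { int mm_lock_seq; };
struct vma_model { int vm_lock_seq; };

/* Mark one VMA write-locked by copying the mm's current sequence number. */
static void vma_start_write_model(const struct mm_model *mm,
                                  struct vma_model *vma)
{
        vma->vm_lock_seq = mm->mm_lock_seq;
}

/* A VMA is write-locked while its sequence matches the mm's. */
static bool vma_is_write_locked_model(const struct mm_model *mm,
                                      const struct vma_model *vma)
{
        return vma->vm_lock_seq == mm->mm_lock_seq;
}

/* Bumping the mm sequence unlocks every VMA at once, whoever locked it. */
static void vma_end_write_all_model(struct mm_model *mm)
{
        mm->mm_lock_seq++;
}

/*
 * Returns true if a VMA locked by the caller *before* the modelled
 * mm_take_all_locks() loses its lock when the error path unlocks early.
 */
static bool early_unlock_drops_callers_lock(void)
{
        struct mm_model mm = { .mm_lock_seq = 1 };
        struct vma_model a = { 0 }, b = { 0 };

        vma_start_write_model(&mm, &a); /* locked earlier by the caller */
        vma_start_write_model(&mm, &b); /* locked by the new loop */
        assert(vma_is_write_locked_model(&mm, &a));
        assert(vma_is_write_locked_model(&mm, &b));

        vma_end_write_all_model(&mm);   /* hypothetical call in out_unlock */

        return !vma_is_write_locked_model(&mm, &a);
}
```

In this model early_unlock_drops_callers_lock() returns true: there is no
per-VMA owner to consult, so the unroll cannot tell its own locks apart from
the ones the caller already held.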

>
> Also, does this need to honour the strict locking order that we have to
> add an entire new loop?  This function is...suboptimal today, but if we
> could get away with not looping through every VMA for a 4th time, that
> would be nice.

That's what I used to do until Jann pointed out the locking order
requirement needed to avoid deadlocks here:
https://lore.kernel.org/all/CAG48ez3EAai=1ghnCMF6xcgUebQRm-u2xhwcpYsfP9=r=oVXig@mail.gmail.com/

>
> > @@ -3612,6 +3620,7 @@ void mm_drop_all_locks(struct mm_struct *mm)
> >               if (vma->vm_file && vma->vm_file->f_mapping)
> >                       vm_unlock_mapping(vma->vm_file->f_mapping);
> >       }
> > +     vma_end_write_all(mm);
> >
> >       mutex_unlock(&mm_all_locks_mutex);
> >  }
> > --
> > 2.39.1
> >
>