linux-kernel - Re: [PATCH v2 1/5] mm: introduce vma_start_read_locked{

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpFNVpmG-jaERY5Y5jSFWAVWk8NMJ=Gs0nSn+cmdADTUmg@mail.gmail.com>
Date: Wed, 13 Nov 2024 07:30:17 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: akpm@...ux-foundation.org, willy@...radead.org, liam.howlett@...cle.com, 
	mhocko@...e.com, vbabka@...e.cz, hannes@...xchg.org, mjguzik@...il.com, 
	oliver.sang@...el.com, mgorman@...hsingularity.net, david@...hat.com, 
	peterx@...hat.com, oleg@...hat.com, dave@...olabs.net, paulmck@...nel.org, 
	brauner@...nel.org, dhowells@...hat.com, hdanton@...a.com, hughd@...gle.com, 
	minchan@...gle.com, jannh@...gle.com, shakeel.butt@...ux.dev, 
	souravpanda@...gle.com, pasha.tatashin@...een.com, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v2 1/5] mm: introduce vma_start_read_locked{_nested} helpers

On Wed, Nov 13, 2024 at 6:10 AM Lorenzo Stoakes
<lorenzo.stoakes@...cle.com> wrote:
>
> On Tue, Nov 12, 2024 at 11:46:31AM -0800, Suren Baghdasaryan wrote:
> > Introduce helper functions which can be used to read-lock a VMA when
> > holding mmap_lock for read. Replace direct accesses to vma->vm_lock
> > with these new helpers.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> > ---
> >  include/linux/mm.h | 20 ++++++++++++++++++++
> >  mm/userfaultfd.c   | 14 ++++++--------
> >  2 files changed, 26 insertions(+), 8 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index fecd47239fa9..01ce619f3d17 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -722,6 +722,26 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
> >       return true;
> >  }
> >
> > +/*
> > + * Use only while holding mmap_read_lock which guarantees that nobody can lock
> > + * the vma for write (vma_start_write()) from under us.
> > + */
> > +static inline void vma_start_read_locked_nested(struct vm_area_struct *vma, int subclass)
> > +{
> > +     mmap_assert_locked(vma->vm_mm);
> > +     down_read_nested(&vma->vm_lock->lock, subclass);
> > +}
> > +
> > +/*
> > + * Use only while holding mmap_read_lock which guarantees that nobody can lock
> > + * the vma for write (vma_start_write()) from under us.
> > + */
> > +static inline void vma_start_read_locked(struct vm_area_struct *vma)
> > +{
> > +     mmap_assert_locked(vma->vm_mm);
> > +     down_read(&vma->vm_lock->lock);
> > +}
>
> Hm, but why would we want to VMA read lock under mmap read lock again? mmap
> read lock will exclude VMA writers so it seems sort of redundant to do
> this, is it because callers expect to then release the mmap read lock
> afterwards or something?

Yes, that's the pattern used there. They kinda downgrade from mmap to
vma lock to make it more local.

>
> It feels like a quite specialist case that maybe is worth adding an
> additional comment to because to me it seems entirely redundant - the whole
> point of the VMA locks is to be able to avoid having to take an mmap lock
> on read.
>
> Something like 'while the intent of VMA read locks is to avoid having to
> take mmap locks at all, there exist certain circumstances in which we would
> need to hold the mmap read to begin with, and these helpers provide that
> functionality'.

Ok, I'll try documenting these functions better.

>
> Also, I guess naming is hard, but _locked here strikes me as confusing as
> I'd read this as if the locked refer to the VMA lock rather than the mmap
> lock.
>
> It also speaks to the need for some additional comment so nobody walks away
> thinking they _must_ take a VMA read lock _as well as_ an mmap read lock
> before reading from a VMA.
>
> Again, naming, hard :>)

I'm open to suggestions.

>
> > +
> >  static inline void vma_end_read(struct vm_area_struct *vma)
> >  {
> >       rcu_read_lock(); /* keeps vma alive till the end of up_read */
> > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > index 60a0be33766f..55019c11b5a8 100644
> > --- a/mm/userfaultfd.c
> > +++ b/mm/userfaultfd.c
> > @@ -86,13 +86,11 @@ static struct vm_area_struct *uffd_lock_vma(struct mm_struct *mm,
> >       vma = find_vma_and_prepare_anon(mm, address);
> >       if (!IS_ERR(vma)) {
> >               /*
> > +              * While holding mmap_lock we can't fail
> >                * We cannot use vma_start_read() as it may fail due to
> > -              * false locked (see comment in vma_start_read()). We
> > -              * can avoid that by directly locking vm_lock under
> > -              * mmap_lock, which guarantees that nobody can lock the
> > -              * vma for write (vma_start_write()) under us.
> > +              * false locked (see comment in vma_start_read()).
>
> Nit but 'as it may fail due to false locked' reads strangely.

replace with "overflow"?

>
> >                */
> > -             down_read(&vma->vm_lock->lock);
> > +             vma_start_read_locked(vma);
>
> Do we even need this while gross 'this is an exception to the rule' comment now
> we have new helpers?
>
> Isn't it somewhat self-documenting now and uffd is no longer a special
> snowflake?

Ack.

>
>
> >       }
> >
> >       mmap_read_unlock(mm);
> > @@ -1480,10 +1478,10 @@ static int uffd_move_lock(struct mm_struct *mm,
> >                * See comment in uffd_lock_vma() as to why not using
> >                * vma_start_read() here.
> >                */
> > -             down_read(&(*dst_vmap)->vm_lock->lock);
> > +             vma_start_read_locked(*dst_vmap);
> >               if (*dst_vmap != *src_vmap)
> > -                     down_read_nested(&(*src_vmap)->vm_lock->lock,
> > -                                      SINGLE_DEPTH_NESTING);
> > +                     vma_start_read_locked_nested(*src_vmap,
> > +                                             SINGLE_DEPTH_NESTING);
> >       }
> >       mmap_read_unlock(mm);
> >       return err;
> > --
> > 2.47.0.277.g8800431eea-goog
> >