lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpHRRsUMNHp2H3UiB4EZbe9CXTVcAC+oOR1dscENjp1Jbw@mail.gmail.com>
Date:   Tue, 17 Jan 2023 17:06:57 -0800
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     akpm@...ux-foundation.org, michel@...pinasse.org,
        jglisse@...gle.com, vbabka@...e.cz, hannes@...xchg.org,
        mgorman@...hsingularity.net, dave@...olabs.net,
        willy@...radead.org, liam.howlett@...cle.com, peterz@...radead.org,
        ldufour@...ux.ibm.com, laurent.dufour@...ibm.com,
        paulmck@...nel.org, luto@...nel.org, songliubraving@...com,
        peterx@...hat.com, david@...hat.com, dhowells@...hat.com,
        hughd@...gle.com, bigeasy@...utronix.de, kent.overstreet@...ux.dev,
        punit.agrawal@...edance.com, lstoakes@...il.com,
        peterjung1337@...il.com, rientjes@...gle.com,
        axelrasmussen@...gle.com, joelaf@...gle.com, minchan@...gle.com,
        jannh@...gle.com, shakeelb@...gle.com, tatashin@...gle.com,
        edumazet@...gle.com, gthelen@...gle.com, gurua@...gle.com,
        arjunroy@...gle.com, soheil@...gle.com, hughlynch@...gle.com,
        leewalsh@...gle.com, posk@...gle.com, linux-mm@...ck.org,
        linux-arm-kernel@...ts.infradead.org,
        linuxppc-dev@...ts.ozlabs.org, x86@...nel.org,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH 28/41] mm: introduce lock_vma_under_rcu to be used from
 arch-specific code

On Tue, Jan 17, 2023 at 7:47 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 09-01-23 12:53:23, Suren Baghdasaryan wrote:
> > Introduce lock_vma_under_rcu function to lookup and lock a VMA during
> > page fault handling. When VMA is not found, can't be locked or changes
> > after being locked, the function returns NULL. The lookup is performed
> > under RCU protection to prevent the found VMA from being destroyed before
> > the VMA lock is acquired. VMA lock statistics are updated according to
> > the results.
> > For now only anonymous VMAs can be searched this way. In other cases the
> > function returns NULL.
>
> Could you describe why only anonymous vmas are handled at this stage and
> what (roughly) has to be done to support other vmas? lock_vma_under_rcu
> doesn't seem to have any anonymous vma specific requirements AFAICS.

TBH I haven't spent too much time looking into file-backed page faults
yet but a couple of tasks I can think of are:
- Ensure that all vma->vm_ops->fault() handlers do not rely on
mmap_lock being read-locked;
- vma->vm_file freeing like VMA freeing will need to be done after RCU
grace period since page fault handlers use it. This will require some
caution because simply adding it into __vm_area_free() called via
call_rcu()  will cause corresponding fops->release() to be called
asynchronously. I had to solve this issue with out-of-tree SPF
implementation when asynchronously called snd_pcm_release() was
problematic.

I'm sure I'm missing more potential issues and maybe Matthew and
Michel can pinpoint more things to resolve here?

>
> Also isn't lock_vma_under_rcu effectively find_read_lock_vma? Not that
> the naming is really the most important part but the rcu locking is
> internal to the function so why should we spread this implementation
> detail to the world...

I wanted the name to indicate that the lookup is done with no locks
held. But I'm open to suggestions.

>
> > Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> > ---
> >  include/linux/mm.h |  3 +++
> >  mm/memory.c        | 51 ++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 54 insertions(+)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index c464fc8a514c..d0fddf6a1de9 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -687,6 +687,9 @@ static inline void vma_assert_no_reader(struct vm_area_struct *vma)
> >                     vma);
> >  }
> >
> > +struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> > +                                       unsigned long address);
> > +
> >  #else /* CONFIG_PER_VMA_LOCK */
> >
> >  static inline void vma_init_lock(struct vm_area_struct *vma) {}
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 9ece18548db1..a658e26d965d 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -5242,6 +5242,57 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> >  }
> >  EXPORT_SYMBOL_GPL(handle_mm_fault);
> >
> > +#ifdef CONFIG_PER_VMA_LOCK
> > +/*
> > + * Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed to be
> > + * stable and not isolated. If the VMA is not found or is being modified the
> > + * function returns NULL.
> > + */
> > +struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> > +                                       unsigned long address)
> > +{
> > +     MA_STATE(mas, &mm->mm_mt, address, address);
> > +     struct vm_area_struct *vma, *validate;
> > +
> > +     rcu_read_lock();
> > +     vma = mas_walk(&mas);
> > +retry:
> > +     if (!vma)
> > +             goto inval;
> > +
> > +     /* Only anonymous vmas are supported for now */
> > +     if (!vma_is_anonymous(vma))
> > +             goto inval;
> > +
> > +     if (!vma_read_trylock(vma))
> > +             goto inval;
> > +
> > +     /* Check since vm_start/vm_end might change before we lock the VMA */
> > +     if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
> > +             vma_read_unlock(vma);
> > +             goto inval;
> > +     }
> > +
> > +     /* Check if the VMA got isolated after we found it */
> > +     mas.index = address;
> > +     validate = mas_walk(&mas);
> > +     if (validate != vma) {
> > +             vma_read_unlock(vma);
> > +             count_vm_vma_lock_event(VMA_LOCK_MISS);
> > +             /* The area was replaced with another one. */
> > +             vma = validate;
> > +             goto retry;
> > +     }
> > +
> > +     rcu_read_unlock();
> > +     return vma;
> > +inval:
> > +     rcu_read_unlock();
> > +     count_vm_vma_lock_event(VMA_LOCK_ABORT);
> > +     return NULL;
> > +}
> > +#endif /* CONFIG_PER_VMA_LOCK */
> > +
> >  #ifndef __PAGETABLE_P4D_FOLDED
> >  /*
> >   * Allocate p4d page table.
> > --
> > 2.39.0
>
> --
> Michal Hocko
> SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ