[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpHRRsUMNHp2H3UiB4EZbe9CXTVcAC+oOR1dscENjp1Jbw@mail.gmail.com>
Date: Tue, 17 Jan 2023 17:06:57 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: Michal Hocko <mhocko@...e.com>
Cc: akpm@...ux-foundation.org, michel@...pinasse.org,
jglisse@...gle.com, vbabka@...e.cz, hannes@...xchg.org,
mgorman@...hsingularity.net, dave@...olabs.net,
willy@...radead.org, liam.howlett@...cle.com, peterz@...radead.org,
ldufour@...ux.ibm.com, laurent.dufour@...ibm.com,
paulmck@...nel.org, luto@...nel.org, songliubraving@...com,
peterx@...hat.com, david@...hat.com, dhowells@...hat.com,
hughd@...gle.com, bigeasy@...utronix.de, kent.overstreet@...ux.dev,
punit.agrawal@...edance.com, lstoakes@...il.com,
peterjung1337@...il.com, rientjes@...gle.com,
axelrasmussen@...gle.com, joelaf@...gle.com, minchan@...gle.com,
jannh@...gle.com, shakeelb@...gle.com, tatashin@...gle.com,
edumazet@...gle.com, gthelen@...gle.com, gurua@...gle.com,
arjunroy@...gle.com, soheil@...gle.com, hughlynch@...gle.com,
leewalsh@...gle.com, posk@...gle.com, linux-mm@...ck.org,
linux-arm-kernel@...ts.infradead.org,
linuxppc-dev@...ts.ozlabs.org, x86@...nel.org,
linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH 28/41] mm: introduce lock_vma_under_rcu to be used from
arch-specific code
On Tue, Jan 17, 2023 at 7:47 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 09-01-23 12:53:23, Suren Baghdasaryan wrote:
> > Introduce lock_vma_under_rcu function to lookup and lock a VMA during
> > page fault handling. When VMA is not found, can't be locked or changes
> > after being locked, the function returns NULL. The lookup is performed
> > under RCU protection to prevent the found VMA from being destroyed before
> > the VMA lock is acquired. VMA lock statistics are updated according to
> > the results.
> > For now only anonymous VMAs can be searched this way. In other cases the
> > function returns NULL.
>
> Could you describe why only anonymous vmas are handled at this stage and
> what (roughly) has to be done to support other vmas? lock_vma_under_rcu
> doesn't seem to have any anonymous vma specific requirements AFAICS.
TBH I haven't spent too much time looking into file-backed page faults
yet but a couple of tasks I can think of are:
- Ensure that all vma->vm_ops->fault() handlers do not rely on
mmap_lock being read-locked;
- vma->vm_file freeing like VMA freeing will need to be done after RCU
grace period since page fault handlers use it. This will require some
caution because simply adding it into __vm_area_free() called via
call_rcu() will cause corresponding fops->release() to be called
asynchronously. I had to solve this issue with out-of-tree SPF
implementation when asynchronously called snd_pcm_release() was
problematic.
I'm sure I'm missing more potential issues and maybe Matthew and
Michel can pinpoint more things to resolve here?
>
> Also isn't lock_vma_under_rcu effectively find_read_lock_vma? Not that
> the naming is really the most important part but the rcu locking is
> internal to the function so why should we spread this implementation
> detail to the world...
I wanted the name to indicate that the lookup is done with no locks
held. But I'm open to suggestions.
>
> > Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> > ---
> > include/linux/mm.h | 3 +++
> > mm/memory.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 54 insertions(+)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index c464fc8a514c..d0fddf6a1de9 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -687,6 +687,9 @@ static inline void vma_assert_no_reader(struct vm_area_struct *vma)
> > vma);
> > }
> >
> > +struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> > + unsigned long address);
> > +
> > #else /* CONFIG_PER_VMA_LOCK */
> >
> > static inline void vma_init_lock(struct vm_area_struct *vma) {}
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 9ece18548db1..a658e26d965d 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -5242,6 +5242,57 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> > }
> > EXPORT_SYMBOL_GPL(handle_mm_fault);
> >
> > +#ifdef CONFIG_PER_VMA_LOCK
> > +/*
> > + * Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed to be
> > + * stable and not isolated. If the VMA is not found or is being modified the
> > + * function returns NULL.
> > + */
> > +struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> > + unsigned long address)
> > +{
> > + MA_STATE(mas, &mm->mm_mt, address, address);
> > + struct vm_area_struct *vma, *validate;
> > +
> > + rcu_read_lock();
> > + vma = mas_walk(&mas);
> > +retry:
> > + if (!vma)
> > + goto inval;
> > +
> > + /* Only anonymous vmas are supported for now */
> > + if (!vma_is_anonymous(vma))
> > + goto inval;
> > +
> > + if (!vma_read_trylock(vma))
> > + goto inval;
> > +
> > + /* Check since vm_start/vm_end might change before we lock the VMA */
> > + if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
> > + vma_read_unlock(vma);
> > + goto inval;
> > + }
> > +
> > + /* Check if the VMA got isolated after we found it */
> > + mas.index = address;
> > + validate = mas_walk(&mas);
> > + if (validate != vma) {
> > + vma_read_unlock(vma);
> > + count_vm_vma_lock_event(VMA_LOCK_MISS);
> > + /* The area was replaced with another one. */
> > + vma = validate;
> > + goto retry;
> > + }
> > +
> > + rcu_read_unlock();
> > + return vma;
> > +inval:
> > + rcu_read_unlock();
> > + count_vm_vma_lock_event(VMA_LOCK_ABORT);
> > + return NULL;
> > +}
> > +#endif /* CONFIG_PER_VMA_LOCK */
> > +
> > #ifndef __PAGETABLE_P4D_FOLDED
> > /*
> > * Allocate p4d page table.
> > --
> > 2.39.0
>
> --
> Michal Hocko
> SUSE Labs
Powered by blists - more mailing lists