Message-ID: <f602dcb3-34f7-4f04-a5e5-ec055c5e7fd4@lucifer.local>
Date: Tue, 14 Jan 2025 11:48:41 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: akpm@...ux-foundation.org, peterz@...radead.org, willy@...radead.org,
liam.howlett@...cle.com, david.laight.linux@...il.com, mhocko@...e.com,
vbabka@...e.cz, hannes@...xchg.org, mjguzik@...il.com,
oliver.sang@...el.com, mgorman@...hsingularity.net, david@...hat.com,
peterx@...hat.com, oleg@...hat.com, dave@...olabs.net,
paulmck@...nel.org, brauner@...nel.org, dhowells@...hat.com,
hdanton@...a.com, hughd@...gle.com, lokeshgidra@...gle.com,
minchan@...gle.com, jannh@...gle.com, shakeel.butt@...ux.dev,
souravpanda@...gle.com, pasha.tatashin@...een.com,
klarasmodin@...il.com, richard.weiyang@...il.com, corbet@....net,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v9 07/17] mm: allow
vma_start_read_locked/vma_start_read_locked_nested to fail
On Mon, Jan 13, 2025 at 09:53:01AM -0800, Suren Baghdasaryan wrote:
> On Mon, Jan 13, 2025 at 7:25 AM Lorenzo Stoakes
> <lorenzo.stoakes@...cle.com> wrote:
> >
> > On Fri, Jan 10, 2025 at 08:25:54PM -0800, Suren Baghdasaryan wrote:
> > > With upcoming replacement of vm_lock with vm_refcnt, we need to handle a
> > > possibility of vma_start_read_locked/vma_start_read_locked_nested failing
> > > due to refcount overflow. Prepare for such possibility by changing these
> > > APIs and adjusting their users.
> > >
> > > Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> > > Acked-by: Vlastimil Babka <vbabka@...e.cz>
> > > Cc: Lokesh Gidra <lokeshgidra@...gle.com>
> > > ---
> > > include/linux/mm.h | 6 ++++--
> > > mm/userfaultfd.c | 18 +++++++++++++-----
> > > 2 files changed, 17 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > index 2f805f1a0176..cbb4e3dbbaed 100644
> > > --- a/include/linux/mm.h
> > > +++ b/include/linux/mm.h
> > > @@ -747,10 +747,11 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
> > > * not be used in such cases because it might fail due to mm_lock_seq overflow.
> > > * This functionality is used to obtain vma read lock and drop the mmap read lock.
> > > */
> > > -static inline void vma_start_read_locked_nested(struct vm_area_struct *vma, int subclass)
> > > +static inline bool vma_start_read_locked_nested(struct vm_area_struct *vma, int subclass)
> > > {
> > > mmap_assert_locked(vma->vm_mm);
> > > down_read_nested(&vma->vm_lock.lock, subclass);
> > > + return true;
> > > }
> > >
> > > /*
> > > @@ -759,10 +760,11 @@ static inline void vma_start_read_locked_nested(struct vm_area_struct *vma, int
> > > * not be used in such cases because it might fail due to mm_lock_seq overflow.
> > > * This functionality is used to obtain vma read lock and drop the mmap read lock.
> > > */
> > > -static inline void vma_start_read_locked(struct vm_area_struct *vma)
> > > +static inline bool vma_start_read_locked(struct vm_area_struct *vma)
> > > {
> > > mmap_assert_locked(vma->vm_mm);
> > > down_read(&vma->vm_lock.lock);
> > > + return true;
> > > }
> > >
> > > static inline void vma_end_read(struct vm_area_struct *vma)
> > > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > > index 4527c385935b..411a663932c4 100644
> > > --- a/mm/userfaultfd.c
> > > +++ b/mm/userfaultfd.c
> > > @@ -85,7 +85,8 @@ static struct vm_area_struct *uffd_lock_vma(struct mm_struct *mm,
> > > mmap_read_lock(mm);
> > > vma = find_vma_and_prepare_anon(mm, address);
> > > if (!IS_ERR(vma))
> > > - vma_start_read_locked(vma);
> > > + if (!vma_start_read_locked(vma))
> > > + vma = ERR_PTR(-EAGAIN);
> >
> > Nit but this kind of reads a bit weirdly now:
> >
> > 	if (!IS_ERR(vma))
> > 		if (!vma_start_read_locked(vma))
> > 			vma = ERR_PTR(-EAGAIN);
> >
> > Wouldn't this be nicer as:
> >
> > 	if (!IS_ERR(vma) && !vma_start_read_locked(vma))
> > 		vma = ERR_PTR(-EAGAIN);
> >
> > On the other hand, this embeds an action in an expression, but then it sort of
> > still looks weird.
> >
> > 	if (!IS_ERR(vma)) {
> > 		bool ok = vma_start_read_locked(vma);
> >
> > 		if (!ok)
> > 			vma = ERR_PTR(-EAGAIN);
> > 	}
> >
> > This makes me wonder, now yes, we are truly bikeshedding, sorry, but maybe we
> > could just have vma_start_read_locked return a VMA pointer that could be an
> > error?
> >
> > Then this becomes:
> >
> > 	if (!IS_ERR(vma))
> > 		vma = vma_start_read_locked(vma);
>
> No, I think it would be wrong for vma_start_read_locked() to always
> return EAGAIN when it can't lock the vma. The error code here is
> context-dependent, so while EAGAIN is the right thing here, it might
> not work for other future users.

Ack, makes sense.

But it'd be nice to clean this up so it isn't this arrow-shaped-code
thing. I mean obviously this is subjective and sorry to bikeshed this late
in a series... but :)

Are you OK with:

	if (!IS_ERR(vma)) {
		bool ok = vma_start_read_locked(vma);

		if (!ok)
			vma = ERR_PTR(-EAGAIN);
	}

?

I think this reads better.

Sorry to be a pain! :)

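For illustration only, here is a minimal userspace sketch of that shape,
showing the point made above that the caller (not the lock helper) chooses
the error code. Everything prefixed mock_ is hypothetical, the max_readers
limit is just a stand-in for a refcount that can overflow, and the
ERR_PTR/IS_ERR/PTR_ERR helpers are simplified copies of the kernel ones so
that this builds outside the kernel tree.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mock VMA: readers are counted up to an artificial limit. */
struct mock_vma {
	bool mmap_locked;	/* stand-in for the mmap read lock being held */
	int readers;
	int max_readers;	/* assumption: models refcount overflow */
};

/* Mock read-lock attempt that can fail, returning bool like the patch. */
static bool mock_vma_start_read_locked(struct mock_vma *vma)
{
	assert(vma->mmap_locked);	/* mirrors mmap_assert_locked() */
	if (vma->readers >= vma->max_readers)
		return false;
	vma->readers++;
	return true;
}

/* The caller maps a failed lock to the error that fits its own context. */
static struct mock_vma *mock_uffd_lock_vma(struct mock_vma *vma)
{
	if (!IS_ERR(vma)) {
		bool ok = mock_vma_start_read_locked(vma);

		if (!ok)
			vma = ERR_PTR(-EAGAIN);
	}
	return vma;
}
```

An error pointer passed in (say from a failed lookup) falls straight
through unchanged, and only a failed lock attempt is turned into -EAGAIN,
so a different caller would be free to pick a different code.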
>
> >
> > >
> > > mmap_read_unlock(mm);
> > > return vma;
> > > @@ -1483,10 +1484,17 @@ static int uffd_move_lock(struct mm_struct *mm,
> > > mmap_read_lock(mm);
> > > err = find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, src_vmap);
> > > if (!err) {
> > > - vma_start_read_locked(*dst_vmap);
> > > - if (*dst_vmap != *src_vmap)
> > > - vma_start_read_locked_nested(*src_vmap,
> > > - SINGLE_DEPTH_NESTING);
> > > + if (vma_start_read_locked(*dst_vmap)) {
> > > + if (*dst_vmap != *src_vmap) {
> > > + if (!vma_start_read_locked_nested(*src_vmap,
> > > + SINGLE_DEPTH_NESTING)) {
> > > + vma_end_read(*dst_vmap);
> >
> > Hmm, why do we end read if the lock failed here but not above?
>
> We have successfully done vma_start_read_locked(dst_vmap) (we locked
> dest vma) but we failed to do vma_start_read_locked_nested(src_vmap)
> (we could not lock src vma). So we should undo the dest vma locking.
> Does that clarify the logic?
Ahh right, makes sense. Maybe a quick cheeky comment to that effect here too?
>
> >
> > > + err = -EAGAIN;
> > > + }
> > > + }
> > > + } else {
> > > + err = -EAGAIN;
> > > + }
> > > }
> >
> > This whole block is really ugly now, this really needs refactoring.
> >
> > How about (on assumption the vma_end_read() is correct):
> >
> >
> > 	err = find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, src_vmap);
> > 	if (err)
> > 		goto out;
> >
> > 	if (!vma_start_read_locked(*dst_vmap)) {
> > 		err = -EAGAIN;
> > 		goto out;
> > 	}
> >
> > 	/* Nothing further to do. */
> > 	if (*dst_vmap == *src_vmap)
> > 		goto out;
> >
> > 	if (!vma_start_read_locked_nested(*src_vmap,
> > 					  SINGLE_DEPTH_NESTING)) {
> > 		vma_end_read(*dst_vmap);
> > 		err = -EAGAIN;
> > 	}
> >
> > out:
> > 	mmap_read_unlock(mm);
> > 	return err;
> > }
>
> Ok, that looks good to me. Will change this way.
> Thanks!
>
Thanks!
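
As a rough userspace sketch of the flattened flow agreed on above, including
the kind of comment asked for on the vma_end_read() call: the names prefixed
mock_ are hypothetical, max_readers is only an assumed stand-in for a lock
attempt that can fail, and the lookup step and lockdep nesting are left out,
so this is not the kernel code itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mock VMA: readers are counted up to an artificial limit. */
struct mock_vma {
	int readers;
	int max_readers;	/* assumption: models refcount overflow */
};

/* Mock read-lock attempt that can fail, returning bool like the patch. */
static bool mock_vma_start_read_locked(struct mock_vma *vma)
{
	if (vma->readers >= vma->max_readers)
		return false;
	vma->readers++;
	return true;
}

/* Mock counterpart of vma_end_read(): drop one reader. */
static void mock_vma_end_read(struct mock_vma *vma)
{
	assert(vma->readers > 0);
	vma->readers--;
}

/* Lock dst, then src if it is a different VMA; hold both or neither. */
static int mock_uffd_move_lock(struct mock_vma *dst, struct mock_vma *src)
{
	int err = 0;

	if (!mock_vma_start_read_locked(dst)) {
		err = -EAGAIN;
		goto out;
	}

	/* Same VMA on both sides: the single read lock covers it. */
	if (dst == src)
		goto out;

	if (!mock_vma_start_read_locked(src)) {
		/* dst was locked above, so undo it before reporting failure. */
		mock_vma_end_read(dst);
		err = -EAGAIN;
	}

out:
	/* The real function would drop the mmap read lock here. */
	return err;
}
```

The early exits keep the nesting to one level, and the only cleanup needed
is on the one path where the first lock succeeded and the second did not.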
> >
> > > mmap_read_unlock(mm);
> > > return err;
> > > --
> > > 2.47.1.613.gc27f4b7a9f-goog
> > >