[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpHQ501HBBQjm2Qvc7dgr93uH-=bgukXHJzpkJRk4_x7SA@mail.gmail.com>
Date: Thu, 24 Jul 2025 14:52:47 +0000
From: Suren Baghdasaryan <surenb@...gle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Jann Horn <jannh@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Pedro Falcato <pfalcato@...e.de>, Linux-MM <linux-mm@...ck.org>,
kernel list <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] hard-to-hit mm_struct UAF due to insufficiently careful
vma_refcount_put() wrt SLAB_TYPESAFE_BY_RCU
On Thu, Jul 24, 2025 at 2:45 PM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 7/24/25 16:29, Suren Baghdasaryan wrote:
> > On Thu, Jul 24, 2025 at 3:53 AM Lorenzo Stoakes
> > <lorenzo.stoakes@...cle.com> wrote:
> >>
> >> On Thu, Jul 24, 2025 at 10:38:06AM +0200, Vlastimil Babka wrote:
> >> > On 7/24/25 04:30, Suren Baghdasaryan wrote:
> >> > > So, I think vma_refcount_put() can mmgrab(vma->mm) before calling
> >> > > __refcount_dec_and_test(), to stabilize that mm and then mmdrop()
> >> > > after it calls rcuwait_wake_up(). What do you think about this
> >> > > approach, folks?
> >> >
> >> > Yeah except it would be wasteful to do for all vma_refcount_put(). Should be
> >> > enough to have this version (as Jann suggested) for inval_end_read: part of
> >> > lock_vma_under_rcu.
> >
> > Yes, definitely.
> >
> >> > I think we need it also for the vma_refcount_put() done
> >> > in vma_start_read() when we fail the seqcount check? I think in that case
> >> > the same thing can be happening too, just with different race windows?
> >
> > Yes.
> >
> >> >
> >> > Also as Jann suggested, maybe it's not great (or even safe) to perform
> >> > __mmdrop() under rcu? And maybe some vma_start_read() users are even more
> >> > restricted? Maybe then we'd need to make __mmdrop_delayed() not RT-only, and
> >> > use that.
> >>
> >> Agreed that doing this under RCU seems unwise.
> >>
> >> I know PTL relies on this as well as zap PTE page table reclaim, likely these
> >> wouldn't interact with an mm going away (you'd hope!) but it seems unwise to
> >> play around with assumptions here.
> >
> > __mmdrop_delayed() is a viable option but I would hate adding
> > mm->delayed_drop for !CONFIG_PREEMPT_RT just for this one case.
> > Alternatively, we don't need to be in the rcu read section when we
> > call vma_end_read() inside lock_vma_under_rcu(). We refcounted the
> > vma, so it's locked and stable, no need for RCU at that point. We can
> > move rcu_read_unlock() before vma_end_read(). However that's not the
>
> Seems correct.
>
> > case with the vma_refcount_put() inside vma_start_read(). We could
> > change vma_start_read() semantics so that it drops rcu_read_lock
> > before it returns.
>
> Looks like there's no other users of vma_start_read() than
> lock_vma_under_rcu() itself that we need to care about, so seems fine.
>
> > That way we could do vma_refcount_put() after
> > rcu_read_unlock() in both places. The retry case in
> > lock_vma_under_rcu() would have to reacquire rcu_read_lock() but that
> > retry is not the usual path, so should not affect overall locking
> > performance. What do you think about this alternative?
>
> Sounds feasible.
>
> I guess at that point it would be cleaner to combine vma_start_read() with
> mas_walk() into a new function that does both and starts with
> rcu_read_lock() itself so we don't have the ugly scheme of entering under
> rcu lock and returning without it?
Yes, I was thinking along the same lines and maybe we can also move
the vma->mm != mm checks there too, as Jann suggested. I'll see which
way looks better and if I get to a satisfactory state, will post a
patch.
Thanks!
>
Powered by blists - more mailing lists