Message-ID: <20220310222206.dttvvlgfqysrcl2s@revolver>
Date: Thu, 10 Mar 2022 22:22:12 +0000
From: Liam Howlett <liam.howlett@...cle.com>
To: Suren Baghdasaryan <surenb@...gle.com>
CC: Matthew Wilcox <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"mhocko@...nel.org" <mhocko@...nel.org>,
"mhocko@...e.com" <mhocko@...e.com>,
"shy828301@...il.com" <shy828301@...il.com>,
"rientjes@...gle.com" <rientjes@...gle.com>,
"hannes@...xchg.org" <hannes@...xchg.org>,
"guro@...com" <guro@...com>, "riel@...riel.com" <riel@...riel.com>,
"minchan@...nel.org" <minchan@...nel.org>,
"kirill@...temov.name" <kirill@...temov.name>,
"aarcange@...hat.com" <aarcange@...hat.com>,
"brauner@...nel.org" <brauner@...nel.org>,
"christian@...uner.io" <christian@...uner.io>,
"hch@...radead.org" <hch@...radead.org>,
"oleg@...hat.com" <oleg@...hat.com>,
"david@...hat.com" <david@...hat.com>,
"jannh@...gle.com" <jannh@...gle.com>,
"shakeelb@...gle.com" <shakeelb@...gle.com>,
"luto@...nel.org" <luto@...nel.org>,
"christian.brauner@...ntu.com" <christian.brauner@...ntu.com>,
"fweimer@...hat.com" <fweimer@...hat.com>,
"jengelh@...i.de" <jengelh@...i.de>,
"timmurray@...gle.com" <timmurray@...gle.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kernel-team@...roid.com" <kernel-team@...roid.com>,
"syzbot+2ccf63a4bd07cf39cab0@...kaller.appspotmail.com"
<syzbot+2ccf63a4bd07cf39cab0@...kaller.appspotmail.com>
Subject: Re: [PATCH 1/1] mm: fix use-after-free bug when mm->mmap is reused
after being freed
* Suren Baghdasaryan <surenb@...gle.com> [220310 11:28]:
> On Thu, Mar 10, 2022 at 7:55 AM Liam Howlett <liam.howlett@...cle.com> wrote:
> >
> > * Suren Baghdasaryan <surenb@...gle.com> [220225 00:51]:
> > > On Thu, Feb 24, 2022 at 8:23 PM Matthew Wilcox <willy@...radead.org> wrote:
> > > >
> > > > On Thu, Feb 24, 2022 at 08:18:59PM -0800, Andrew Morton wrote:
> > > > > On Tue, 15 Feb 2022 12:19:22 -0800 Suren Baghdasaryan <surenb@...gle.com> wrote:
> > > > >
> > > > > > After exit_mmap frees all vmas in the mm, mm->mmap needs to be reset,
> > > > > > otherwise it points to a vma that was freed and when reused leads to
> > > > > > a use-after-free bug.
> > > > > >
> > > > > > ...
> > > > > >
> > > > > > --- a/mm/mmap.c
> > > > > > +++ b/mm/mmap.c
> > > > > > @@ -3186,6 +3186,7 @@ void exit_mmap(struct mm_struct *mm)
> > > > > > >  		vma = remove_vma(vma);
> > > > > > >  		cond_resched();
> > > > > > >  	}
> > > > > > > +	mm->mmap = NULL;
> > > > > > >  	mmap_write_unlock(mm);
> > > > > > >  	vm_unacct_memory(nr_accounted);
> > > > > > >  }
> > > > >
> > > > > After the Maple tree patches, mm_struct.mmap doesn't exist. So I'll
> > > > > revert this fix as part of merging the maple-tree parts of linux-next.
> > > > > I'll be sending this fix to Linus this week.
> > > > >
> > > > > All of which means that the thusly-resolved Maple tree patches might
> > > > > reintroduce this use-after-free bug.
> > > >
> > > > I don't think so? The problem is that VMAs are (currently) part of
> > > > two data structures -- the rbtree and the linked list. remove_vma()
> > > > only removes VMAs from the rbtree; it doesn't set mm->mmap to NULL.
> > > >
> > > > With maple tree, the linked list goes away. remove_vma() removes VMAs
> > > > from the maple tree. So anyone looking to iterate over all VMAs has to
> > > > go and look in the maple tree for them ... and there's nothing there.
> > >
> > > Yes, I think you are right. With maple trees we don't need this fix.
> >
> >
> > Yes, this is correct. The maple tree removes the entire linked list...
> > but since the mm is unstable in exit_mmap(), I had added the
> > destruction of the maple tree there. Maybe this is the wrong place to
> > be destroying the tree tracking the VMAs (although this patch partially
> > destroys the VMA tracking linked list), but it brought my attention to
> > the race that this patch solves and the process_mrelease() function.
> > Couldn't this be avoided by using mmget_not_zero() instead of mmgrab()
> > in process_mrelease()?
>
> That's what we were doing before [1]. That unfortunately has a problem
> of process_mrelease possibly calling the last mmput and being blocked
> on IO completion in exit_aio.

Oh, I see. Thanks.

> The race between exit_mmap and
> process_mrelease is solved by using mmap_lock.

I think an important part of the race fix isn't just holding the lock
but also setting the head of the linked list (mm->mmap) to NULL, as the
patch above does. That means the code in __oom_reap_task_mm() via
process_mrelease() will continue to execute but iterate over zero VMAs.

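
As a rough userspace illustration of that point (not kernel code; the
toy_* names and structures are made up for this sketch and only mimic
the shape of mm->mmap and the VMA list), resetting the head after the
teardown is what turns a later walk into a harmless empty loop:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct toy_vma {
	struct toy_vma *vm_next;
};

struct toy_mm {
	struct toy_vma *mmap;	/* head of the VMA list */
	bool locked;		/* models mmap_lock being held */
};

/* Build a list of n VMAs. Returns false on allocation failure. */
static bool toy_mm_populate(struct toy_mm *mm, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_vma *vma = malloc(sizeof(*vma));

		if (!vma)
			return false;
		vma->vm_next = mm->mmap;
		mm->mmap = vma;
	}
	return true;
}

/* Models the tail of exit_mmap(): free every VMA, then reset the head. */
static void toy_exit_mmap(struct toy_mm *mm)
{
	struct toy_vma *vma;

	mm->locked = true;
	vma = mm->mmap;
	while (vma) {
		struct toy_vma *next = vma->vm_next;

		free(vma);
		vma = next;
	}
	mm->mmap = NULL;	/* the one-line fix: no dangling head remains */
	mm->locked = false;
}

/* Models the reaper's walk; returns how many VMAs it visited. */
static size_t toy_reap(struct toy_mm *mm)
{
	size_t visited = 0;

	assert(!mm->locked);	/* single-threaded model: lock must be free */
	mm->locked = true;
	for (struct toy_vma *vma = mm->mmap; vma; vma = vma->vm_next)
		visited++;
	mm->locked = false;
	return visited;
}
```

Calling toy_reap() after toy_exit_mmap() visits nothing. Without the
reset, the same walk would start from a freed node, which is the
use-after-free the patch addresses.
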
> I think by destroying the maple tree in exit_mmap before the
> mmap_write_unlock call, you keep things working and functionality
> intact. Is there any reason this can't be done?

Yes, unfortunately. If MMF_OOM_SKIP is not set, then process_mrelease()
will call __oom_reap_task_mm(), which will hit a NULL pointer dereference
or a use-after-free in the vma iterator as it tries to iterate the maple
tree. I think the best plan is to set MMF_OOM_SKIP unconditionally
when the mmap_write_lock() is acquired. Doing so will ensure nothing
will try to gain memory by reaping a task that no longer has memory to
yield - or at least won't shortly. If we do use MMF_OOM_SKIP in such a
way, then I think it is safe to quickly drop the lock?
Also, should process_mrelease() be setting MMF_OOM_VICTIM on this mm?
It would enable the fast path on a race with exit_mmap() - though that
may not be desirable?
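
To make the MMF_OOM_SKIP ordering concrete, here is a small userspace
model. It is only a sketch under assumptions: the toy_* types, the flag
bit value, and the function names are hypothetical and do not match the
kernel's definitions. It shows the flag being set before the tree is
torn down, so a later reap attempt bails out before touching the tree:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit for this model; not the kernel's value. */
#define TOY_MMF_OOM_SKIP (1UL << 0)

/* Stand-in for the tree that tracks VMAs. */
struct toy_tree {
	size_t nr_vmas;
	bool destroyed;
};

struct toy_mm2 {
	unsigned long flags;
	struct toy_tree tree;
};

enum toy_reap_result { TOY_REAPED, TOY_SKIPPED };

/* Models exit_mmap() with the flag set unconditionally under the lock. */
static void toy_exit_mmap_skip(struct toy_mm2 *mm)
{
	/* mmap_write_lock(mm) would be taken here. */
	mm->flags |= TOY_MMF_OOM_SKIP;	/* set first, before any teardown */
	mm->tree.nr_vmas = 0;
	mm->tree.destroyed = true;
	/* mmap_write_unlock(mm) would follow here. */
}

/* Models the reap path: check the flag before iterating the tree. */
static enum toy_reap_result toy_mrelease(struct toy_mm2 *mm, size_t *visited)
{
	assert(visited != NULL);
	*visited = 0;
	if (mm->flags & TOY_MMF_OOM_SKIP)
		return TOY_SKIPPED;	/* never looks at the destroyed tree */
	*visited = mm->tree.nr_vmas;
	return TOY_REAPED;
}
```

In this model a reap before teardown sees every VMA, and a reap after
teardown returns TOY_SKIPPED without reading the tree at all.
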
>
> [1] ba535c1caf3ee78a ("mm/oom_kill: allow process_mrelease to run
> under mmap_lock protection")
>
> > That would ensure we aren't stepping on an
> > exit_mmap() and potentially the locking change in exit_mmap() wouldn't
> > be needed either? Logically, I view this as process_mrelease() having
> > an issue with the fact that the mmaps are no longer stable in teardown
> > regardless of the data structure that is used.
> >
> > Thanks,
> > Liam
> >
> > --
> > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@...roid.com.
> >