[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpEcvuYu5OhXMVz5g4OK+-a_jnF9MNtu17VP5V23r2oWtw@mail.gmail.com>
Date: Thu, 21 Oct 2021 22:23:09 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Michal Hocko <mhocko@...nel.org>, Michal Hocko <mhocko@...e.com>,
David Rientjes <rientjes@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Johannes Weiner <hannes@...xchg.org>,
Roman Gushchin <guro@...com>, Rik van Riel <riel@...riel.com>,
Minchan Kim <minchan@...nel.org>,
Christian Brauner <christian@...uner.io>,
Christoph Hellwig <hch@...radead.org>,
Oleg Nesterov <oleg@...hat.com>,
David Hildenbrand <david@...hat.com>,
Jann Horn <jannh@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
Andy Lutomirski <luto@...nel.org>,
Christian Brauner <christian.brauner@...ntu.com>,
Florian Weimer <fweimer@...hat.com>,
Jan Engelhardt <jengelh@...i.de>,
Linux API <linux-api@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
kernel-team <kernel-team@...roid.com>
Subject: Re: [PATCH 1/1] mm: prevent a race between process_mrelease and exit_mmap
On Thu, Oct 21, 2021 at 7:25 PM Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Thu, 21 Oct 2021 18:46:58 -0700 Suren Baghdasaryan <surenb@...gle.com> wrote:
>
> > Race between process_mrelease and exit_mmap, where free_pgtables is
> > called while __oom_reap_task_mm is in progress, leads to kernel crash
> > during pte_offset_map_lock call. oom-reaper avoids this race by setting
> > MMF_OOM_VICTIM flag and causing exit_mmap to take and release
> > mmap_write_lock, blocking it until oom-reaper releases mmap_read_lock.
> > Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to
> > fix this race, however that would be considered a hack. Fix this race
> > by elevating mm->mm_users and preventing exit_mmap from executing until
> > process_mrelease is finished. Patch slightly refactors the code to adapt
> > for a possible mmget_not_zero failure.
> > This fix has considerable negative impact on process_mrelease performance
> > and will likely need later optimization.
>
> Has the impact been quantified?
A ball-park figure for a large process (6GB) it takes 4x times longer
for process_mrelease to exit.
>
> And where's the added cost happening? The changes all look quite
> lightweight?
I think it's caused by the fact that exit_mmap and all other cleanup
routines happening on the last mmput are postponed until
process_mrelease finishes __oom_reap_task_mm and drops mm->mm_users. I
suspect all that cleanup is happening at the end of process_mrelease
now and that might be contributing to the regression. I didn't have
time yet to fully understand all the reasons for that regression but
wanted to fix the crash first. Will proceed with more investigation
and hopefully with a quick fix for the lost performance.
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@...roid.com.
>
Powered by blists - more mailing lists