lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpEt9SSrELZzfmcqJ7JL_nEzWGz-YE9GRUZTjU5unqQjQg@mail.gmail.com>
Date:   Tue, 10 May 2022 09:31:50 -0700
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        David Rientjes <rientjes@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <guro@...com>, Minchan Kim <minchan@...nel.org>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Christian Brauner <brauner@...nel.org>,
        Christoph Hellwig <hch@...radead.org>,
        Oleg Nesterov <oleg@...hat.com>,
        David Hildenbrand <david@...hat.com>,
        Jann Horn <jannh@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Peter Xu <peterx@...hat.com>,
        John Hubbard <jhubbard@...dia.com>, shuah@...nel.org,
        LKML <linux-kernel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>, linux-kselftest@...r.kernel.org,
        kernel-team <kernel-team@...roid.com>
Subject: Re: [PATCH 2/3] mm: drop oom code from exit_mmap

On Tue, May 10, 2022 at 6:06 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 09-05-22 20:00:13, Suren Baghdasaryan wrote:
> > With the oom-killer being able to operate on locked pages, exit_mmap
> > does not need to ensure that oom_reap_task_mm is done before it can
> > proceed. Instead it can rely on mmap_lock write lock to prevent
> > oom-killer from operating on the vma tree while it's freeing page
> > tables. exit_mmap can hold mmap_lock read lock when unmapping vmas
> > and then take mmap_lock write lock before freeing page tables.
>
> The changelog is rather light on nasty details which might be good but
> for the sake of our future us let's be more verbose so that we do not
> have to reinvent the prior history each time we are looking into this
> code. I would go with something like this instead:
> "
> The primary reason to invoke the oom reaper from the exit_mmap path used
> to be a prevention of an excessive oom killing if the oom victim exit
> races with the oom reaper (see 212925802454 ("mm: oom: let oom_reap_task
> and exit_mmap run concurrently") for more details. The invocation has
> moved around since then because of the interaction with the munlock
> logic but the underlying reason has remained the same (see 27ae357fa82b
> ("mm, oom: fix concurrent munlock and oom reaper unmap, v3").
>
> Munlock code is no longer a problem since a213e5cf71cb ("mm/munlock:
> delete munlock_vma_pages_all(), allow oomreap") and there shouldn't be
> any blocking operation before the memory is unmapped by exit_mmap so
> the oom reaper invocation can be dropped. The unmapping part can be done
> with the non-exclusive mmap_sem and the exclusive one is only required
> when page tables are freed.
>
> Remove the oom_reaper from exit_mmap which will make the code easier to
> read. This is really unlikely to make any observable difference although
> some microbenchmarks could benefit from one less branch that needs to be
> evaluated even though it almost never is true.
> "

Looks great! Thanks for collecting all the history. Will update the description.

>
> One minor comment below. Other than that \o/ this is finally going away.
> I strongly suspect that the history of this code is a nice example about how
> over optimizing code can cause more harm than good.
>
> Acked-by: Michal Hocko <mhocko@...e.com>

Thanks.

>
> Thanks!
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> > ---
> >  include/linux/oom.h |  2 --
> >  mm/mmap.c           | 25 ++++++-------------------
> >  mm/oom_kill.c       |  2 +-
> >  3 files changed, 7 insertions(+), 22 deletions(-)
> >
> [...]
> > @@ -3138,6 +3121,10 @@ void exit_mmap(struct mm_struct *mm)
> >       /* update_hiwater_rss(mm) here? but nobody should be looking */
> >       /* Use -1 here to ensure all VMAs in the mm are unmapped */
> >       unmap_vmas(&tlb, vma, 0, -1);
> > +     mmap_read_unlock(mm);
> > +     /* Set MMF_OOM_SKIP to disregard this mm from further consideration.*/
> > +     set_bit(MMF_OOM_SKIP, &mm->flags);
>
> I think that it would be slightly more readable to add an empty line
> above and below of this. Also the comment would be more helpful if it
> explaind what the further consideration actually means. I would go with
>
>         /*
>          * Set MMF_OOM_SKIP to hide this task from the oom killer/reaper
>          * because the memory has been already freed. Do not bother
>          * checking mm_is_oom_victim because setting a bit
>          * unconditionally is just cheaper.
>          */
>

Ack.

> > +     mmap_write_lock(mm);
> >       free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
> >       tlb_finish_mmu(&tlb);
>
> --
> Michal Hocko
> SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ