Message-ID: <20190222161942.GA12288@cmpxchg.org>
Date:   Fri, 22 Feb 2019 11:19:42 -0500
From:   Johannes Weiner <hannes@...xchg.org>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Junil Lee <junil0814.lee@....com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
        willy@...radead.org, pasha.tatashin@...cle.com,
        kirill.shutemov@...ux.intel.com, jrdr.linux@...il.com,
        dan.j.williams@...el.com, alexander.h.duyck@...ux.intel.com,
        andreyknvl@...gle.com, arunks@...eaurora.org,
        keith.busch@...el.com, guro@...com, rientjes@...gle.com,
        penguin-kernel@...ove.sakura.ne.jp, shakeelb@...gle.com,
        yuzhoujian@...ichuxing.com
Subject: Re: [PATCH] mm, oom: OOM killer use rss size without shmem

On Fri, Feb 22, 2019 at 08:10:01AM +0100, Michal Hocko wrote:
> On Fri 22-02-19 13:37:33, Junil Lee wrote:
> > The oom killer uses the get_mm_rss() function to estimate how much memory
> > will be reclaimed when it selects a victim task.
> > 
> > However, the rss size returned by get_mm_rss() was changed by the
> > "mm, shmem: add internal shmem resident memory accounting" commit.
> > That commit makes get_mm_rss() return a size that includes SHMEM pages.
> 
> This was actually the case even before eca56ff906bdd because SHMEM was
> just accounted to MM_FILEPAGES so this commit hasn't changed much
> really.
> 
> Besides that, we cannot simply rule out SHMEM pages. They are
> backing MAP_ANON|MAP_SHARED which might be unmapped and freed during the
> oom victim exit. Moreover this is essentially the same as file backed
> pages or even MAP_PRIVATE|MAP_ANON pages. Both can be pinned by other
> processes, e.g. private pages via CoW mappings and file pages by the
> filesystem, or simply mlocked by another process. So this really gross
> evaluation will never be perfect. We would basically have to do exact
> calculation of the freeable memory of each process and that is just not
> feasible.
> 
> That being said, I do not think the patch is an improvement in that
> direction. It just replaces one fuzzy evaluation with another that
> potentially even misses a lot of memory.

You make good points.

I think it's also worth noting that while the OOM killer is ultimately
about freeing memory, the victim algorithm is not about finding the
*optimal* amount of memory to free, but to kill the thing that is most
likely to have put the system into trouble. We're not going for
killing the smallest tasks until we're barely back over the line and
operational again, but instead we're finding the biggest offender to
stop the most likely source of unsustainable allocations. That's why
our metric is called "badness score", and not "freeable" or similar.

So even if a good chunk of the biggest task is tmpfs pages that
aren't necessarily freed upon kill, from a heuristics POV it's still
the best candidate to kill.
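
To make the distinction concrete, here is a minimal userspace sketch in
Python. It is not kernel code. The Task fields, the function names and
the page counts are all made up for illustration, and badness() is only
loosely modeled on the idea of rss plus swap entries plus page-table
pages (the oom_score_adj adjustment is deliberately left out). It
contrasts "kill the biggest offender" with a hypothetical "kill the
smallest task that frees enough" policy, and shows what happens to the
first policy when shmem is dropped from rss.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical per-task counters, in pages (illustrative only)."""
    name: str
    anon_pages: int
    file_pages: int
    shmem_pages: int
    swap_entries: int = 0
    pgtable_pages: int = 0


def rss(task, include_shmem=True):
    """Resident set size; include_shmem=False mimics the proposed patch."""
    total = task.anon_pages + task.file_pages
    if include_shmem:
        total += task.shmem_pages
    return total


def badness(task, include_shmem=True):
    """Simplified badness: overall footprint, not an estimate of freeable memory."""
    return rss(task, include_shmem) + task.swap_entries + task.pgtable_pages


def pick_biggest_offender(tasks, include_shmem=True):
    """Heuristic described above: kill the task with the largest footprint."""
    return max(tasks, key=lambda t: badness(t, include_shmem))


def pick_smallest_sufficient(tasks, pages_needed):
    """Hypothetical alternative: smallest task whose anon pages cover the shortfall."""
    candidates = [t for t in tasks if t.anon_pages >= pages_needed]
    return min(candidates, key=lambda t: t.anon_pages) if candidates else None


# Made-up workload: one task has filled tmpfs, the others are ordinary.
tasks = [
    Task("tmpfs-hog", anon_pages=2_000, file_pages=500, shmem_pages=90_000),
    Task("daemon", anon_pages=8_000, file_pages=1_000, shmem_pages=0),
    Task("shell", anon_pages=300, file_pages=200, shmem_pages=0),
]

if __name__ == "__main__":
    print(pick_biggest_offender(tasks).name)                       # tmpfs-hog
    print(pick_biggest_offender(tasks, include_shmem=False).name)  # daemon
    print(pick_smallest_sufficient(tasks, pages_needed=250).name)  # shell
```

With these numbers, counting shmem points at the task that is actually
driving memory pressure. Ignoring shmem shifts the blame to an
unrelated daemon, and the "smallest sufficient" policy kills a
bystander while the source of the allocations keeps running.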
