Date:   Thu, 18 Aug 2016 20:01:04 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Sonny Rao <sonnyrao@...omium.org>
Cc:     Jann Horn <jann@...jh.net>,
        Robert Foss <robert.foss@...labora.com>, corbet@....net,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Konstantin Khlebnikov <koct9i@...il.com>,
        Hugh Dickins <hughd@...gle.com>,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
        Minchan Kim <minchan@...nel.org>,
        John Stultz <john.stultz@...aro.org>,
        ross.zwisler@...ux.intel.com, jmarchan@...hat.com,
        Johannes Weiner <hannes@...xchg.org>,
        Kees Cook <keescook@...omium.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Cyrill Gorcunov <gorcunov@...nvz.org>,
        Robin Humble <plaguedbypenguins@...il.com>,
        David Rientjes <rientjes@...gle.com>,
        eric.engestrom@...tec.com, Janis Danisevskis <jdanis@...gle.com>,
        calvinowens@...com, Alexey Dobriyan <adobriyan@...il.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        ldufour@...ux.vnet.ibm.com, linux-doc@...r.kernel.org,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Ben Zhang <benzh@...omium.org>,
        Bryan Freed <bfreed@...omium.org>,
        Filipe Brandenburger <filbranden@...omium.org>,
        Mateusz Guzik <mguzik@...hat.com>
Subject: Re: [PATCH v2 0/3] Implement /proc/<pid>/totmaps

On Thu 18-08-16 10:47:57, Sonny Rao wrote:
> On Thu, Aug 18, 2016 at 12:44 AM, Michal Hocko <mhocko@...nel.org> wrote:
> > On Wed 17-08-16 11:57:56, Sonny Rao wrote:
[...]
> >> 2) User space OOM handling -- we'd rather do a more graceful shutdown
> >> than let the kernel's OOM killer activate and need to gather this
> >> information and we'd like to be able to get this information to make
> >> the decision much faster than 400ms
> >
> > Global OOM handling in userspace is really dubious if you ask me. I
> > understand you want something better than SIGKILL and in fact this is
> > already possible with memory cgroup controller (btw. memcg will give
> > you a cheap access to rss, amount of shared, swapped out memory as
> > well). Anyway if you are getting close to the OOM your system will most
> > probably be really busy and chances are that also reading your new file
> > will take much more time. I am also not quite sure how is pss useful for
> > oom decisions.
> 
> I mentioned it before, but based on experience RSS just isn't good
> enough -- there's too much sharing going on in our use case to make
> the correct decision based on RSS.  If RSS were good enough, simply
> put, this patch wouldn't exist.

But that doesn't answer my question, I am afraid. So how exactly do you
use pss for oom decisions?

> So even with memcg I think we'd have the same problem?

memcg will give you instant anon and shared counters for all processes
in the memcg.
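
For illustration, a minimal sketch of reading those counters from
userspace. The helper names and the cgroup directory argument are
hypothetical, and the available field names (e.g. rss, cache, swap)
depend on the cgroup version and kernel config, so treat them as
assumptions rather than a fixed interface:

```python
def parse_memory_stat(text):
    """Parse the 'key value' lines of a memcg memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name <integer>" lines.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def read_memory_stat(cgroup_dir):
    """Read memory.stat from a memcg directory (path supplied by the caller)."""
    with open(f"{cgroup_dir}/memory.stat") as f:
        return parse_memory_stat(f.read())
```

This is a single small file read per memcg, with no per-mapping walk.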

> > Don't take me wrong, /proc/<pid>/totmaps might be suitable for your
> > specific usecase but so far I haven't heard any sound argument for it to
> > be generally usable. It is true that smaps is unnecessarily costly but
> > at least I can see some room for improvements. A simple patch I've
> > posted cut the formatting overhead by 7%. Maybe we can do more.
> 
> It seems like a general problem that if you want these values the
> existing kernel interface can be very expensive, so it would be
> generally usable by any application which wants a per process PSS,
> private data, dirty data or swap value.

Yes, this is really unfortunate, and if at all possible we should
address it. Precise values require the expensive rmap walk; we could
introduce some caching to help with that. But so far the biggest
overhead seems to be simply formatting the output, and that should be
addressed before any new proc file is added.
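
For reference, a rough sketch of what userspace has to do today to get
per-process totals: read all of /proc/<pid>/smaps and sum the fields of
interest over every mapping. The function names below are hypothetical,
and the field list is just an example selection:

```python
import re

# Matches smaps value lines such as "Pss:                   4 kB".
_FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")


def sum_smaps(text, fields=("Rss", "Pss", "Private_Clean", "Private_Dirty", "Swap")):
    """Sum the selected smaps fields (in kB) across all mappings."""
    totals = dict.fromkeys(fields, 0)
    for line in text.splitlines():
        m = _FIELD.match(line)
        if m and m.group(1) in totals:
            totals[m.group(1)] += int(m.group(2))
    return totals


def smaps_totals(pid):
    """Read /proc/<pid>/smaps and return the summed fields."""
    with open(f"/proc/{pid}/smaps") as f:
        return sum_smaps(f.read())
```

The kernel formats every field of every mapping only for the reader to
parse it back and discard most of it.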

> I mentioned two use cases, but I guess I don't understand the comment
> about why it's not usable by other use cases.

I might be wrong here, but the use of pss is quite limited and I do not
remember anybody asking for large optimizations in that area. I still do
not understand your use cases properly, so I am quite skeptical about
the general usefulness of a new file.

-- 
Michal Hocko
SUSE Labs
