lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 1 Mar 2021 13:24:39 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Yang Shi <shy828301@...il.com>
Cc:     Johannes Weiner <hannes@...xchg.org>, Roman Gushchin <guro@...com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Jonathan Corbet <corbet@....net>,
        Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] doc: memcontrol: add description for oom_kill

On Fri 26-02-21 11:19:51, Yang Shi wrote:
> On Fri, Feb 26, 2021 at 8:42 AM Yang Shi <shy828301@...il.com> wrote:
> >
> > On Thu, Feb 25, 2021 at 11:30 PM Michal Hocko <mhocko@...e.com> wrote:
> > >
> > > On Thu 25-02-21 18:12:54, Yang Shi wrote:
> > > > When debugging an oom issue, I found the oom_kill counter of memcg is
> > > > confusing.  At the first glance without checking document, I thought it
> > > > just counts for memcg oom, but it turns out it counts both global and
> > > > memcg oom.
> > >
> > > Yes, this is the case indeed. The point of the counter was to count oom
> > > victims from the memcg rather than matching that to the source of the
> > > oom. Rememeber that this could have been a memcg oom up in the
> > > hierarchy as well. Counting victims on the oom origin could be equally
> >
> > Yes, it is updated hierarchically on v2, but not on v1. I'm supposed
> > this is because v1 may work in non-hierarchcal mode? If this is the
> > only reason we may be able to remove this to get aligned with v2 since
> > non-hierarchal mode is no longer supported.
> 
> BTW, having the counter recorded hierarchically may help out one of
> our usecases. We want to monitor the oom_kill for some services, but
> systemd would wipe out the cgroup if the service is oom killed then
> restart the service from scratch (it means create a brand new cgroup
> with the same name). So this systemd behavior makes the counter
> useless if it is not recorded hierarchically.

Just to make sure I understand correctly. You have a setup where memcg
for a service has a hard limit configured and it is destroyed when oom
happens inside that memcg. A new instance is created at the same place
of the hierarchy with a new memcg. Your problem is that the oom killed
memcg will not be recorded in its parent oom event and the information
will get lost with the torn down memcg. Correct?

If yes then how do you tell which of the child cgroup was killed from
the parent counter? Or is there only a single child?

Anyway, cgroup v2 will offer the hierarchical behavior. Do you have any
strong reasons that you cannot use v2?
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ