linux-kernel - Re: [patch 1/2] mm, memcg: avoid oom notification when current needs access to memory reserves


Message-ID: <20131118165110.GE32623@dhcp22.suse.cz>
Date:	Mon, 18 Nov 2013 17:51:10 +0100
From:	Michal Hocko <mhocko@...e.cz>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	David Rientjes <rientjes@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	cgroups@...r.kernel.org
Subject: Re: [patch 1/2] mm, memcg: avoid oom notification when current needs
 access to memory reserves

On Mon 18-11-13 10:41:15, Johannes Weiner wrote:
> On Thu, Nov 14, 2013 at 03:26:51PM -0800, David Rientjes wrote:
> > When current has a pending SIGKILL or is already in the exit path, it
> > only needs access to memory reserves to fully exit.  In that sense, the
> > memcg is not actually oom for current, it simply needs to bypass memory
> > charges to exit and free its memory, which is itself a guarantee that
> > memory will be freed.
> > 
> > We only want to notify userspace for actionable oom conditions where
> > something needs to be done (and all oom handling can already be deferred
> > to userspace through this method by disabling the memcg oom killer with
> > memory.oom_control), not simply when a memcg has reached its limit, which
> > would actually have to happen before memcg reclaim actually frees memory
> > for charges.
> 
> Even though the situation may not require a kill, the user still wants
> to know that the memory hard limit was breached and the isolation
> broken in order to prevent a kill.  We just came really close and the

You can already see that you are getting into trouble from the fail
counter. Its usefulness without more reclaim statistics is a bit
questionable, but at least you get a rough impression that something is
wrong.
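
To illustrate watching the fail counter from userspace, here is a rough
sketch. It assumes the cgroup v1 memory controller, where each group
directory exposes a memory.failcnt file counting how often the limit was
hit. The helper names (read_failcnt, limit_hits_since) are made up for
this example and are not part of any existing tool.

```python
import os


def read_failcnt(cgroup_dir):
    """Return the value of memory.failcnt in the given memcg directory.

    Assumes cgroup v1 layout: the file holds a single decimal integer.
    """
    with open(os.path.join(cgroup_dir, "memory.failcnt")) as f:
        return int(f.read().strip())


def limit_hits_since(cgroup_dir, previous):
    """Return (current_count, delta) relative to an earlier sample.

    A positive delta means the group hit its limit again since the
    previous sample; it does not say whether reclaim then succeeded.
    """
    current = read_failcnt(cgroup_dir)
    return current, current - previous
```

A monitor could sample a group directory periodically (for example a
hypothetical /sys/fs/cgroup/memory/mygroup) and treat a growing delta as a
hint that the workload is running close to its limit. As said above, this
is only a rough signal without further reclaim statistics.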

> fact that current is exiting is coincidental.  Not everybody is having
> OOM situations on a frequent basis and they might want to know when
> they are redlining the system and that the same workload might blow up
> the next time it's run.

I am just concerned that signaling temporary OOM conditions which do not
require any OOM killer action (in user or kernel space) might be confusing.
Userspace would have a harder time telling whether any action is required
or not.

> The emergency reserves are there to prevent the system from
> deadlocking.  We only dip into them to avert a more imminent disaster
> but we are no longer in good shape at this point.  But by not even
> announcing this situation to userspace anymore you are making this the
> new baseline and declaring that everything is fine when the system is
> already clutching at straws.
> 
> I maintain that we should signal OOM when our healthy and
> always-available options are exhausted.

-- 
Michal Hocko
SUSE Labs