lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 11 Apr 2016 15:43:21 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	linux-mm@...ck.org, rientjes@...gle.com, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, oleg@...hat.com
Subject: Re: [PATCH 2/3] oom, oom_reaper: Try to reap tasks which skip
 regular OOM killer path

On Mon 11-04-16 22:26:09, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Sat 09-04-16 13:39:30, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > On Fri 08-04-16 20:19:28, Tetsuo Handa wrote:
> > > > > I looked at next-20160408 but I again came to think that we should remove
> > > > > these shortcuts (something like a patch shown bottom).
> > > >
> > > > feel free to send the patch with the full description. But I would
> > > > really encourage you to check the history to learn why those have been
> > > > added and describe why those concerns are not valid/important anymore.
> > > 
> > > I believe that past discussions and decisions about current code are too
> > > optimistic because they did not take 'The "too small to fail" memory-
> > > allocation rule' problem into account.
> > 
> > In most cases they were driven by _real_ usecases though. And that
> > is what matters. Theoretically possible issues which happen under
> > crazy workloads which are DoSing the machine already are not something
> > to optimize for. Sure we should try to cope with them as gracefully
> > as possible, no questions about that, but we should try hard not to
> > reintroduce previous issues during _sensible_ workloads.
> 
> I'm not requesting you to optimize for crazy workloads. None of my
> customers intentionally put crazy workloads, but they are getting silent
> hangups and I'm suspecting that something went wrong with memory management.

There are many other possible reasons for thses symptoms. Have you
actually seen any _evidence_ they the hang they are seeing is due to
oom deadlock, though. A single crash dump or consistent sysrq output
which would point that direction.

> But there is no evidence because memory management subsystem remains silent.
> You call my testcases DoS, but there is no evidence that my customers
> are not hitting the same problem my testcases found.

This is really impossible to comment on.

> I'm suggesting you to at least emit diagnostic messages when something went
> wrong. That is what kmallocwd is for. And if you do not want to emit
> diagnostic messages, I'm fine with timeout based approach.

I am all for more diagnostic but what you were proposing was so heavy
weight it doesn't really seem worth it.

Anyway yet again this is getting largely off-topic...
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ