linux-kernel - Re: System freezes after OOM


Message-ID: <alpine.LRH.2.02.1607151730430.21114@file01.intranet.prod.int.rdu2.redhat.com>
Date:	Fri, 15 Jul 2016 17:39:55 -0400 (EDT)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	David Rientjes <rientjes@...gle.com>
cc:	Michal Hocko <mhocko@...nel.org>,
	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
	Ondrej Kozina <okozina@...hat.com>,
	Jerome Marchand <jmarchan@...hat.com>,
	Stanislav Kozina <skozina@...hat.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, dm-devel@...hat.com
Subject: Re: System freezes after OOM



On Fri, 15 Jul 2016, David Rientjes wrote:

> On Fri, 15 Jul 2016, Mikulas Patocka wrote:
> 
> > > There is no guarantee that _anything_ can return memory to the mempool,
> > 
> > You misunderstand mempools if you make such claims.
> > 
> > There is in fact a guarantee that objects will be returned to mempool. In 
> > the past I reviewed device mapper thoroughly to make sure that it can make 
> > forward progress even if there is no available memory.
> > 
> > I don't know what I should tell you if you keep on repeating the same 
> > false claim over and over again. Should I explain mempool operations to 
> > you in detail? Or will you find it on your own?
> > 
> 
> If you are talking about patches you're proposing for 4.8 or any guarantee 
> of memory freeing that the oom killer/reaper will provide in 4.8, that's 
> fine.  However, the state of the 4.7 kernel is the same as it was when I 
> fixed this issue that timed out hundreds of our machines and is 
> contradicted by that evidence.  Our machines time out after two hours with 
> the oom victim looping forever in mempool_alloc(), so if there was a 

And what about the oom reaper? It should have freed all of the victim's 
pages even if the victim is looping in mempool_alloc(). Why didn't the oom 
reaper free up memory?

> guarantee that elements would be returned in a completely livelocked 
> kernel in 4.7 or earlier kernels, that would not have been the case.  I 

And what kind of targets do you use in device mapper in the configuration 
that livelocked? Do you use some custom Google-developed drivers?

Please describe the whole stack of block I/O devices when this livelock 
happened.

Most device mapper drivers can really make forward progress when they are 
out of memory, so I'm interested in what kind of configuration you have.

> frankly don't care about your patch reviewing of dm mempool usage when 
> dm_request() livelocked our kernel.

If it livelocked, it is a bug in some underlying block driver, not a bug 
in mempool_alloc.
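
To make the guarantee concrete, here is a minimal userspace sketch of the 
mempool idea. It is a toy model, not the code in mm/mempool.c: the names 
toy_pool, toy_alloc and toy_free, the oom flag and the reserve size are all 
made up for illustration. The pool preallocates a small reserve; an 
allocation falls back to the reserve when the underlying allocator fails, 
and a free refills the reserve before anything goes back to the allocator. 
Where the real mempool_alloc() would sleep and retry, the toy version 
simply returns NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MIN_NR 2               /* reserve size; hypothetical value */

struct toy_pool {
    void  *reserve[TOY_MIN_NR];    /* preallocated elements */
    int    curr_nr;                /* elements currently in the reserve */
    size_t elem_size;              /* size of each element */
    bool   oom;                    /* simulate a failing allocator */
};

/* Fill the reserve up front, while memory is still available. */
static int toy_pool_init(struct toy_pool *p, size_t elem_size)
{
    p->curr_nr = 0;
    p->elem_size = elem_size;
    p->oom = false;
    while (p->curr_nr < TOY_MIN_NR) {
        void *e = malloc(elem_size);
        if (!e) {
            while (p->curr_nr > 0)
                free(p->reserve[--p->curr_nr]);
            return -1;
        }
        p->reserve[p->curr_nr++] = e;
    }
    return 0;
}

/*
 * Try the normal allocator first, then the reserve.
 * NULL here stands for "the caller would sleep until toy_free() runs".
 */
static void *toy_alloc(struct toy_pool *p)
{
    if (!p->oom) {
        void *e = malloc(p->elem_size);
        if (e)
            return e;
    }
    if (p->curr_nr > 0)
        return p->reserve[--p->curr_nr];
    return NULL;
}

/* Refill the reserve first; only surplus goes back to the allocator. */
static void toy_free(struct toy_pool *p, void *e)
{
    assert(e != NULL);
    if (p->curr_nr < TOY_MIN_NR)
        p->reserve[p->curr_nr++] = e;
    else
        free(e);
}
```

In this model a waiter can only starve if whoever holds the reserved 
elements never calls toy_free(). That is the sense in which forward 
progress depends on the drivers that hold the elements and not on the 
allocation routine itself.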

> Feel free to formally propose patches either for 4.7 or 4.8.

Mikulas