Message-ID: <alpine.LRH.2.02.1607131019300.31769@file01.intranet.prod.int.rdu2.redhat.com>
Date: Wed, 13 Jul 2016 10:21:07 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Michal Hocko <mhocko@...nel.org>
cc: Jerome Marchand <jmarchan@...hat.com>,
Ondrej Kozina <okozina@...hat.com>,
Stanislav Kozina <skozina@...hat.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: System freezes after OOM

On Wed, 13 Jul 2016, Michal Hocko wrote:
> On Wed 13-07-16 10:35:01, Jerome Marchand wrote:
> > On 07/13/2016 01:44 AM, Mikulas Patocka wrote:
> > > The problem of swapping to dm-crypt is this.
> > >
> > > The free memory goes low, kswapd decides that some page should be swapped
> > > out. However, when you swap to an encrypted device, writeback of each page
> > > requires another page to hold the encrypted data. dm-crypt uses mempools
> > > for all its structures and pages, so that it can make forward progress
> > > even if there is no memory free. However, the mempool code first allocates
> > > from the general memory allocator and resorts to the mempool only if the
> > > memory is below the limit.
> > >
> > > So every attempt to swap out some page allocates another page.
> > >
> > > As long as swapping is in progress, the free memory is below the limit
> > > (because the swapping activity itself consumes any memory over the limit).
> > > And that triggered the OOM killer prematurely.
> >
> > There is a quite recent sysctl vm knob that I believe can help in this
> > case: watermark_scale_factor. If you increase this value, kswapd will
> > start paging out earlier, when there might still be enough free memory.
> >
> > Ondrej, have you tried to increase /proc/sys/vm/watermark_scale_factor?
>
> I suspect this would just change the timing or the real problem gets
> hidden.

I agree - tweaking some limits would just change the probability of the
bug without addressing the root cause.

We shouldn't tweak anything; we should just stick to Ondrej's scenario in
which he reproduced the bug.

Mikulas
> --
> Michal Hocko
> SUSE Labs
>
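
The mempool behaviour described in the quoted explanation above (try the
general allocator first, fall back to the reserve only once free memory has
reached the limit) can be illustrated with a toy model. This is a minimal
sketch, not kernel code: ToyAllocator, ToyMempool and simulate_swap_out are
hypothetical names, and the page counts are made up for illustration. Each
simulated swap-out takes one extra page for the encrypted copy.

```python
class ToyAllocator:
    """Hypothetical stand-in for the general page allocator."""

    def __init__(self, free_pages, low_limit):
        self.free_pages = free_pages
        self.low_limit = low_limit

    def try_alloc(self):
        # Hand out a page only while free memory is above the limit.
        if self.free_pages > self.low_limit:
            self.free_pages -= 1
            return True
        return False


class ToyMempool:
    """Hypothetical pool: general allocator first, reserve second."""

    def __init__(self, allocator, reserve):
        self.allocator = allocator
        self.reserve = reserve

    def alloc(self):
        # Step 1: try the general allocator, as the thread describes.
        if self.allocator.try_alloc():
            return "general"
        # Step 2: fall back to the preallocated reserve.
        if self.reserve > 0:
            self.reserve -= 1
            return "reserve"
        # Assumption: a real mempool would normally wait for an element
        # to be returned; the toy model just reports failure.
        return None


def simulate_swap_out(free_pages, low_limit, reserve, writes):
    """Start `writes` swap-outs; each needs one page for encrypted data."""
    allocator = ToyAllocator(free_pages, low_limit)
    pool = ToyMempool(allocator, reserve)
    sources = [pool.alloc() for _ in range(writes)]
    return sources, allocator.free_pages, pool.reserve


if __name__ == "__main__":
    sources, free_left, reserve_left = simulate_swap_out(
        free_pages=12, low_limit=10, reserve=4, writes=5
    )
    print(sources, free_left, reserve_left)
```

In this model the first swap-outs drain general memory down to the limit
before the reserve is touched, which mirrors the point made in the thread
that the swapping activity itself consumes any memory above the limit.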