linux-kernel - Re: mm allocation failure and hang when running xfstests generic/269 on xfs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20170302143441.GL1404@dhcp22.suse.cz>
Date:   Thu, 2 Mar 2017 15:34:41 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Brian Foster <bfoster@...hat.com>
Cc:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        Xiong Zhou <xzhou@...hat.com>,
        Christoph Hellwig <hch@...radead.org>,
        linux-xfs@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: mm allocation failure and hang when running xfstests generic/269
 on xfs

On Thu 02-03-17 09:23:15, Brian Foster wrote:
> On Thu, Mar 02, 2017 at 02:50:01PM +0100, Michal Hocko wrote:
> > On Thu 02-03-17 08:41:58, Brian Foster wrote:
> > > On Thu, Mar 02, 2017 at 02:27:55PM +0100, Michal Hocko wrote:
> > [...]
> > > > I see your argument about being in sync with other kmem helpers but
> > > > those are bit different because regular page/slab allocators allow never
> > > > fail semantic (even though this is mostly ignored by those helpers which
> > > > implement their own retries but that is a different topic).
> > > > 
> > > 
> > > ... but what I'm trying to understand here is whether this failure
> > > scenario is specific to vmalloc() or whether the other kmem_*()
> > > functions are susceptible to the same problem. For example, suppose we
> > > replaced this kmem_zalloc_greedy() call with a kmem_zalloc(PAGE_SIZE,
> > > KM_SLEEP) call. Could we hit the same problem if the process is killed?
> > 
> > Well, kmem_zalloc uses kmalloc which can also fail when we are out of
> > memory but in that case we can expect the OOM killer releasing some
> > memory which would allow us to make a forward progress on the next
> > retry. So essentially retrying around kmalloc is much more safe in this
> > regard. Failing vmalloc might be permanent because there is no vmalloc
> > space to allocate from or much more likely due to already mentioned
> > patch. So vmalloc is different, really.
> 
> Right.. that's why I'm asking. So it's technically possible but highly
> unlikely due to the different failure characteristics. That seems
> reasonable to me, then. 
> 
> To be clear, do we understand what causes the vzalloc() failure to be
> effectively permanent in this specific reproducer? I know you mention
> above that we could be out of vmalloc space, but that doesn't clarify
> whether there are other potential failure paths or then what this has to
> do with the fact that the process was killed. Does the pending signal
> cause the subsequent failures or are you saying that there is some other
> root cause of the failure, this process would effectively be spinning
> here anyways, and we're just noticing it because it's trying to exit?

In this particular case it is fatal_signal_pending that causes the
permanent failure. This check has been added to prevent from complete
memory reserves depletion on OOM when a killed task has a free ticket to
reserves and vmalloc requests can be really large. In this case there
was no OOM killer going on but fsstress has SIGKILL pending for other
reason. Most probably as a result of the group_exit when all threads
are killed (see zap_process). I could have turn fatal_signal_pending
into tsk_is_oom_victim which would be less likely to hit but in
principle fatal_signal_pending should be better because we do want to
bail out when the process is existing as soon as possible.

What I really wanted to say is that there are other possible permanent
failure paths in vmalloc AFAICS. They are much less probable but they
still exist.

Does that make more sense now?
-- 
Michal Hocko
SUSE Labs