[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190717053521.GC16284@dhcp22.suse.cz>
Date: Wed, 17 Jul 2019 07:35:21 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Qian Cai <cai@....pw>
Cc: Yang Shi <yang.shi@...ux.alibaba.com>, catalin.marinas@....com,
dvyukov@...gle.com, rientjes@...gle.com, willy@...radead.org,
akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
On Tue 16-07-19 16:28:21, Qian Cai wrote:
> On Tue, 2019-07-16 at 22:07 +0200, Michal Hocko wrote:
> > On Tue 16-07-19 15:21:17, Qian Cai wrote:
> > [...]
> > > Thanks to this commit, there are allocation with __GFP_DIRECT_RECLAIM that
> > > succeeded would keep trying with __GFP_NOFAIL for kmemleak tracking object
> > > allocations.
> >
> > Well, not really. Because low order allocations with
> > __GFP_DIRECT_RECLAIM basically never fail (they keep retrying) even
> > without GFP_NOFAIL because that flag is actually to guarantee no
> > failure. And for high order allocations the nofail mode is actively
> > harmful. It completely changes the behavior of a system. A light costly
> > order workload could put the system on knees and completely change the
> > behavior. I am not really convinced this is a good behavior of a
> > debugging feature TBH.
>
> While I agree your general observation about GFP_NOFAIL, I am afraid the
> discussion here is about "struct kmemleak_object" slab cache from a single call
> site create_object().
OK, this makes it less harmfull because the order aspect doesn't really
apply here. But still stretches the NOFAIL semantic a lot. The kmemleak
essentially asks for NORETRY | NOFAIL which means no oom but retry for
ever semantic for sleeping allocations. This can still lead to
unexpected side effects. Just consider a call site that holds locks and
now cannot make any forward progress without anybody else hitting the
oom killer for example. As noted in other email, I would simply drop
NORETRY flag as well and live with the fact that the oom killer can be
invoked. It still wouldn't solve the NOWAIT contexts but those need a
proper solution anyway.
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists