linux-kernel - Re: [PATCH v19 3/7] xbitmap: add more operations

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5A35FF89.8040500@intel.com>
Date:   Sun, 17 Dec 2017 13:24:25 +0800
From:   Wei Wang <wei.w.wang@...el.com>
To:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        willy@...radead.org
CC:     virtio-dev@...ts.oasis-open.org, linux-kernel@...r.kernel.org,
        qemu-devel@...gnu.org, virtualization@...ts.linux-foundation.org,
        kvm@...r.kernel.org, linux-mm@...ck.org, mst@...hat.com,
        mhocko@...nel.org, akpm@...ux-foundation.org,
        mawilcox@...rosoft.com, david@...hat.com, cornelia.huck@...ibm.com,
        mgorman@...hsingularity.net, aarcange@...hat.com,
        amit.shah@...hat.com, pbonzini@...hat.com,
        liliang.opensource@...il.com, yang.zhang.wz@...il.com,
        quan.xu@...yun.com, nilal@...hat.com, riel@...hat.com
Subject: Re: [PATCH v19 3/7] xbitmap: add more operations

On 12/16/2017 07:28 PM, Tetsuo Handa wrote:
> Wei Wang wrote:
>> On 12/16/2017 02:42 AM, Matthew Wilcox wrote:
>>> On Tue, Dec 12, 2017 at 07:55:55PM +0800, Wei Wang wrote:
>>>> +int xb_preload_and_set_bit(struct xb *xb, unsigned long bit, gfp_t gfp);
>>> I'm struggling to understand when one would use this.  The xb_ API
>>> requires you to handle your own locking.  But specifying GFP flags
>>> here implies you can sleep.  So ... um ... there's no locking?
>> In the regular use cases, people would do xb_preload() before taking the
>> lock, and the xb_set/clear within the lock.
>>
>> In the virtio-balloon usage, we have a large number of bits to set with
>> the balloon_lock being held (we're not unlocking for each bit), so we
>> used the above wrapper to do preload and set within the balloon_lock,
>> and passed in GFP_NOWAIT to avoid sleeping. Probably we can change to
>> put this wrapper implementation to virtio-balloon, since it would not be
>> useful for the regular cases.
> GFP_NOWAIT is chosen in order not to try to OOM-kill something, isn't it?

Yes, I think that's right the issue we are discussing here (also 
discussed in the deadlock patch before): Suppose we use a sleep-able 
flag GFP_KERNEL, which gets the caller (fill_balloon or leak_balloon) 
into sleep with balloon_lock being held, and the memory reclaiming from 
GFP_KERNEL would fall into the OOM code path which first invokes the 
oom_notify-->leak_balloon to release some balloon memory, which needs to 
take the balloon_lock that is being held by the task who is sleeping.

So, using GFP_NOWAIT avoids sleeping to get memory through directly 
memory reclaiming, which could fall into that OOM code path that needs 
to take the balloon_lock.


> But passing GFP_NOWAIT means that we can handle allocation failure. There is
> no need to use preload approach when we can handle allocation failure.

I think the reason we need xb_preload is because radix tree insertion 
needs the memory being preallocated already (it couldn't suffer from 
memory failure during the process of inserting, probably because 
handling the failure there isn't easy, Matthew may know the backstory of 
this)

So, I think we can handle the memory failure with xb_preload, which 
stops going into the radix tree APIs, but shouldn't call radix tree APIs 
without the related memory preallocated.

Best,
Wei