Message-Id: <201711031959.CCC21876.tQFLHOOFVMJSFO@I-love.SAKURA.ne.jp>
Date: Fri, 3 Nov 2017 19:59:38 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: wei.w.wang@...el.com, virtio-dev@...ts.oasis-open.org,
linux-kernel@...r.kernel.org, qemu-devel@...gnu.org,
virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org,
linux-mm@...ck.org, mst@...hat.com, mhocko@...nel.org,
akpm@...ux-foundation.org, mawilcox@...rosoft.com
Cc: david@...hat.com, cornelia.huck@...ibm.com,
mgorman@...hsingularity.net, aarcange@...hat.com,
amit.shah@...hat.com, pbonzini@...hat.com, willy@...radead.org,
liliang.opensource@...il.com, yang.zhang.wz@...il.com,
quan.xu@...yun.com
Subject: Re: [PATCH v17 3/6] mm/balloon_compaction.c: split balloon page allocation and enqueue
Wei Wang wrote:
> Here's a detailed analysis of the deadlock by Tetsuo Handa:
>
> In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
> serialize against fill_balloon(). But in fill_balloon(),
> alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
> called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
> implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, even though __GFP_NORETRY
> is specified, this allocation attempt might indirectly depend on somebody
> else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
> __GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
> virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
> out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
> mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
> will cause OOM lockup. Thus, do not wait for vb->balloon_lock mutex if
> leak_balloon() is called from out_of_memory().
Please drop the sentence "Thus, do not wait for vb->balloon_lock mutex if
leak_balloon() is called from out_of_memory()." That is not what this patch
does.
>
> Thread1                                   Thread2
>   fill_balloon()
>     takes a balloon_lock
>     balloon_page_enqueue()
>       alloc_page(GFP_HIGHUSER_MOVABLE)
>         direct reclaim (__GFP_FS context)   takes a fs lock
>           waits for that fs lock              alloc_page(GFP_NOFS)
>                                                 __alloc_pages_may_oom()
>                                                   takes the oom_lock
>                                                   out_of_memory()
>                                                     blocking_notifier_call_chain()
>                                                       leak_balloon()
>                                                         tries to take that
>                                                         balloon_lock and deadlocks
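
For illustration only, here is a rough userspace sketch of the idea named in
the subject line (splitting page allocation from enqueue), so that the lock
is never held across an allocation that might block. This is not the patch
itself. Every name below (balloon_sketch, page_sketch, sketch_*) is
hypothetical, pthread_mutex_t stands in for vb->balloon_lock, and calloc()
stands in for alloc_page().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a balloon page. */
struct page_sketch {
	struct page_sketch *next;
};

/* Hypothetical stand-in for the balloon device state. */
struct balloon_sketch {
	pthread_mutex_t lock;		/* plays the role of vb->balloon_lock */
	struct page_sketch *pages;	/* pages currently held by the balloon */
	size_t num_pages;
};

static void sketch_init(struct balloon_sketch *b)
{
	pthread_mutex_init(&b->lock, NULL);
	b->pages = NULL;
	b->num_pages = 0;
}

/* Stand-in for alloc_page(): it may block, so it is called unlocked. */
static struct page_sketch *sketch_alloc_page(void)
{
	return calloc(1, sizeof(struct page_sketch));
}

/* Enqueue only: a short critical section with no allocation inside. */
static void sketch_enqueue(struct balloon_sketch *b, struct page_sketch *p)
{
	assert(b != NULL && p != NULL);
	pthread_mutex_lock(&b->lock);
	p->next = b->pages;
	b->pages = p;
	b->num_pages++;
	pthread_mutex_unlock(&b->lock);
}

/* Inflate by up to n pages: allocate the whole batch first, then enqueue. */
static size_t sketch_fill(struct balloon_sketch *b, size_t n)
{
	struct page_sketch *batch = NULL;
	struct page_sketch *p;
	size_t got = 0;

	/* Phase 1: allocate without holding the lock. */
	while (got < n) {
		p = sketch_alloc_page();
		if (!p)
			break;
		p->next = batch;
		batch = p;
		got++;
	}

	/* Phase 2: take the lock only to move pages into the balloon. */
	while (batch) {
		p = batch;
		batch = p->next;
		sketch_enqueue(b, p);
	}
	return got;
}

/* Deflate by up to n pages; returns how many pages were released. */
static size_t sketch_leak(struct balloon_sketch *b, size_t n)
{
	struct page_sketch *p;
	size_t freed = 0;

	pthread_mutex_lock(&b->lock);
	while (freed < n && b->pages) {
		p = b->pages;
		b->pages = p->next;
		b->num_pages--;
		free(p);
		freed++;
	}
	pthread_mutex_unlock(&b->lock);
	return freed;
}
```

In this sketch, a caller of sketch_leak() (the analogue of the OOM notifier
path) can wait on the lock only for the duration of a list update, never for
an allocation made by sketch_fill().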