linux-kernel - Re: [PATCH] mm, memory_hotplug: do not back off draining pcp free pages from kworker context

Message-ID: <20170831053342.fo7x4hnhicxikme4@dhcp22.suse.cz>
Date:   Thu, 31 Aug 2017 07:33:42 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        Tejun Heo <tj@...nel.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...e.de>, linux-mm@...ck.org,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm, memory_hotplug: do not back off draining pcp free
 pages from kworker context

On Tue 29-08-17 13:28:23, Michal Hocko wrote:
> On Tue 29-08-17 20:20:39, Tetsuo Handa wrote:
> > On 2017/08/29 7:33, Andrew Morton wrote:
> > > On Mon, 28 Aug 2017 11:33:41 +0200 Michal Hocko <mhocko@...nel.org> wrote:
> > > 
> > >> drain_all_pages backs off when called from a kworker context since
> > >> 0ccce3b924212 ("mm, page_alloc: drain per-cpu pages from workqueue
> > >> context") because the original IPI based pcp draining has been replaced
> > >> by a WQ based one and the check wanted to prevent from recursion and
> > >> inter workers dependencies. This has made some sense at the time
> > >> because the system WQ has been used and one worker holding the lock
> > >> could be blocked while waiting for new workers to emerge which can be a
> > >> problem under OOM conditions.
> > >>
> > >> Since then ce612879ddc7 ("mm: move pcp and lru-pcp draining into single
> > >> wq") has moved draining to a dedicated (mm_percpu_wq) WQ with a rescuer
> > >> so we shouldn't depend on any other WQ activity to make a forward
> > >> progress so calling drain_all_pages from a worker context is safe as
> > >> long as this doesn't happen from mm_percpu_wq itself which is not the
> > >> case because all workers are required to _not_ depend on any MM locks.
> > >>
> > >> Why is this a problem in the first place? ACPI driven memory hot-remove
> > >> (acpi_device_hotplug) is executed from the worker context. We end
> > >> up calling __offline_pages to free all the pages and that requires
> > >> both lru_add_drain_all_cpuslocked and drain_all_pages to do their job
> > >> otherwise we can have dangling pages on pcp lists and fail the offline
> > >> operation (__test_page_isolated_in_pageblock would see a page with 0
> > >> ref. count but without PageBuddy set).
> > >>
> > >> Fix the issue by removing the worker check in drain_all_pages.
> > >> lru_add_drain_all_cpuslocked doesn't have this restriction so it works
> > >> as expected.
> > >>
> > >> Fixes: 0ccce3b924212 ("mm, page_alloc: drain per-cpu pages from workqueue context")
> > >> Signed-off-by: Michal Hocko <mhocko@...e.com>
> > > 
> > > No cc:stable?
> > > 
> > 
> > Michal, are you sure that this patch does not cause deadlock?
> > 
> > As shown in "[PATCH] mm: Use WQ_HIGHPRI for mm_percpu_wq." thread, currently work
> > items on mm_percpu_wq seem to be blocked by other work items not on mm_percpu_wq.
> 
> But we have a rescuer so we should make forward progress eventually.
> Or am I missing something? Tejun, could you have a look please?

ping... I would really appreciate it if you could double-check my thinking,
Tejun. This is a tricky area and I would like to prevent further subtle
issues here.
-- 
Michal Hocko
SUSE Labs
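
For readers following the thread, the behavior change described in the quoted
changelog can be modeled in userspace C roughly as below. This is only a
sketch under stated assumptions, not the kernel code: struct model_task,
MODEL_PF_WQ_WORKER and both function names are hypothetical stand-ins, and the
real drain_all_pages does far more than decide whether to run. It shows only
that the old code skipped the drain for any workqueue worker, which is why a
hot-remove running from a worker could leave pages on the pcp lists.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the task flag that marks a workqueue worker. */
#define MODEL_PF_WQ_WORKER 0x20u

/* Hypothetical, minimal model of the calling task. */
struct model_task {
	unsigned int flags;
};

/*
 * Model of the behavior before the patch: the drain backs off (does not
 * run) whenever the caller is a workqueue worker.
 */
static bool drain_runs_before_patch(const struct model_task *t)
{
	return !(t->flags & MODEL_PF_WQ_WORKER);
}

/*
 * Model of the behavior after the patch: the worker check is gone, so the
 * drain runs regardless of the calling context.
 */
static bool drain_runs_after_patch(const struct model_task *t)
{
	(void)t; /* the caller's context no longer matters */
	return true;
}
```

Whether removing the check is actually safe is the open question in this
thread; the model says nothing about forward progress or the rescuer.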