Date: Tue, 7 Feb 2017 12:54:48 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Michal Hocko <mhocko@...nel.org>, Mel Gorman <mgorman@...hsingularity.net>
Cc: Dmitry Vyukov <dvyukov@...gle.com>, Tejun Heo <tj@...nel.org>,
	Christoph Lameter <cl@...ux.com>, "linux-mm@...ck.org" <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>, Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
	syzkaller <syzkaller@...glegroups.com>, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: mm: deadlock between get_online_cpus/pcpu_alloc

On 02/07/2017 12:43 PM, Michal Hocko wrote:
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3b93879990fd..7af165d308c4 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2342,7 +2342,14 @@ void drain_local_pages(struct zone *zone)
>> 
>>  static void drain_local_pages_wq(struct work_struct *work)
>>  {
>> +	/*
>> +	 * Ordinarily a drain operation is bound to a CPU but may be unbound
>> +	 * after a CPU hotplug operation so it's necessary to disable
>> +	 * preemption for the drain to stabilise the CPU ID.
>> +	 */
>> +	preempt_disable();
>>  	drain_local_pages(NULL);
>> +	preempt_enable_no_resched();
>>  }
>> 
>>  /*
> [...]
>> @@ -6711,7 +6714,16 @@ static int page_alloc_cpu_dead(unsigned int cpu)
>>  {
>> 
>>  	lru_add_drain_cpu(cpu);
>> +
>> +	/*
>> +	 * A per-cpu drain via a workqueue from drain_all_pages can be
>> +	 * rescheduled onto an unrelated CPU. That allows the hotplug
>> +	 * operation and the drain to potentially race on the same
>> +	 * CPU. Serialise hotplug versus drain using pcpu_drain_mutex
>> +	 */
>> +	mutex_lock(&pcpu_drain_mutex);
>>  	drain_pages(cpu);
>> +	mutex_unlock(&pcpu_drain_mutex);
> 
> You cannot put sleepable lock inside the preempt disbaled section...

We can make it a spinlock right? Could we do flush_work() with a
spinlock? Sounds bad too.
Maybe we could just use the fact that the whole drain happens with
disabled IRQs and obtain the current CPU under that protection?
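A minimal sketch of that idea, modelled on the drain_pages_zone() body of
that era (the helper name drain_local_zone_pages is hypothetical, and this
is an untested sketch, not a proposed patch): because a task cannot migrate
to another CPU while interrupts are disabled, a CPU ID read after
local_irq_save() stays valid for the whole IRQ-off section, so no outer
preempt_disable() in the workqueue callback would be needed.

```c
/*
 * Hypothetical variant: determine the current CPU only after IRQs
 * are off, instead of requiring the caller to disable preemption.
 */
static void drain_local_zone_pages(struct zone *zone)
{
	unsigned long flags;
	struct per_cpu_pageset *pset;
	struct per_cpu_pages *pcp;

	local_irq_save(flags);
	/* smp_processor_id() is stable here: migration needs IRQs on */
	pset = per_cpu_ptr(zone->pageset, smp_processor_id());
	pcp = &pset->pcp;
	if (pcp->count)
		free_pcppages_bulk(zone, pcp->count, pcp);
	local_irq_restore(flags);
}
```

With something like this, drain_local_pages_wq() could call the drain
directly, since the CPU binding is established inside the IRQ-disabled
region rather than by the caller.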