Message-Id: <20200721140623.4e8ecc6ef5d5ff42115d68fc@linux-foundation.org>
Date: Tue, 21 Jul 2020 14:06:23 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, luto@...capital.net, axboe@...nel.dk,
	keescook@...omium.org, torvalds@...ux-foundation.org,
	jannh@...gle.com, will@...nel.org, hch@....de, npiggin@...il.com,
	mathieu.desnoyers@...icios.com
Subject: Re: [PATCH v3] mm: Fix kthread_use_mm() vs TLB invalidate

On Tue, 21 Jul 2020 17:41:06 +0200 Peter Zijlstra <peterz@...radead.org> wrote:
>
> For SMP systems using IPI based TLB invalidation, looking at
> current->active_mm is entirely reasonable. This then presents the
> following race condition:
>
>
>   CPU0                        CPU1
>
>   flush_tlb_mm(mm)            use_mm(mm)
>     <send-IPI>
>                                 tsk->active_mm = mm;
>                               <IPI>
>                                 if (tsk->active_mm == mm)
>                                   // flush TLBs
>                               </IPI>
>                                 switch_mm(old_mm,mm,tsk);
>
>
> Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
> because the IPI lands before we actually switched.
>
> Avoid this by disabling IRQs across changing ->active_mm and
> switch_mm().
>
> [ There are all sorts of reasons this might be harmless for various
> architecture specific reasons, but best not leave the door open at
> all. ]

Can we give the -stable maintainers (and others) more explanation of
why they might choose to merge this?
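
For readers who want to see the interleaving from the quoted diagram end to end, here is a rough user-space model of it. Everything in it is hypothetical and made up for illustration (struct mm, struct cpu, ipi_flush(), deliver_ipi(), irq_enable(), run_use_mm()); none of it is kernel code, and it assumes an architecture where a CPU can still hold stale TLB entries for @mm when it switches to it. It is a sketch of the ordering problem only, not a description of what any particular architecture does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the kernel's structures. */
struct mm {
	bool tlb_stale;		/* this CPU holds stale entries for this mm */
};

struct cpu {
	struct mm *active_mm;	/* what tsk->active_mm claims */
	struct mm *hw_mm;	/* address space actually loaded */
	bool irqs_off;
	struct mm *pending_ipi;	/* flush request held while IRQs are off */
};

/* IPI handler: trusts active_mm, but flushes whatever is loaded. */
static void ipi_flush(struct cpu *c, struct mm *target)
{
	if (c->active_mm == target)
		c->hw_mm->tlb_stale = false;
}

/* An IPI arriving with IRQs off stays pending until they come back on. */
static void deliver_ipi(struct cpu *c, struct mm *target)
{
	if (c->irqs_off)
		c->pending_ipi = target;
	else
		ipi_flush(c, target);
}

static void irq_enable(struct cpu *c)
{
	struct mm *target = c->pending_ipi;

	c->irqs_off = false;
	c->pending_ipi = NULL;
	if (target)
		ipi_flush(c, target);
}

/* Returns true if CPU1 ends up running on mm with stale TLB entries. */
static bool run_use_mm(bool disable_irqs)
{
	struct mm old_mm = { .tlb_stale = false };
	struct mm mm = { .tlb_stale = true };	/* CPU0 changed mm's page tables */
	struct cpu cpu1 = { &old_mm, &old_mm, false, NULL };

	if (disable_irqs)
		cpu1.irqs_off = true;		/* local_irq_disable() */
	cpu1.active_mm = &mm;			/* tsk->active_mm = mm */
	deliver_ipi(&cpu1, &mm);		/* CPU0's IPI lands here */
	cpu1.hw_mm = &mm;			/* switch_mm(old_mm, mm, tsk) */
	if (disable_irqs)
		irq_enable(&cpu1);		/* local_irq_enable() */

	return cpu1.hw_mm->tlb_stale;
}
```

In this model run_use_mm(false) returns true: the flush is applied to old_mm and mm's stale entries survive the switch. run_use_mm(true) returns false, because the IPI is only handled once ->active_mm and the loaded address space agree again.
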
> ...
>
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -1241,13 +1241,15 @@ void kthread_use_mm(struct mm_struct *mm)
>  	WARN_ON_ONCE(tsk->mm);
>  
>  	task_lock(tsk);
> +	local_irq_disable();

A bare local_irq_disable() is one of those "what the heck is this
protecting" things.  It's the new lock_kernel().

So a little comment will help readers to understand why we did it.
Something like this?

--- a/kernel/kthread.c~mm-fix-kthread_use_mm-vs-tlb-invalidate-fix
+++ a/kernel/kthread.c
@@ -1239,6 +1239,7 @@ void kthread_use_mm(struct mm_struct *mm
 	WARN_ON_ONCE(tsk->mm);
 
 	task_lock(tsk);
+	/* Hold off tlb flush IPIs while switching mm's */
 	local_irq_disable();
 	active_mm = tsk->active_mm;
 	if (active_mm != mm) {
_

>  	active_mm = tsk->active_mm;
>  	if (active_mm != mm) {
>  		mmgrab(mm);
>  		tsk->active_mm = mm;
>  	}
>  	tsk->mm = mm;
> -	switch_mm(active_mm, mm, tsk);
> +	switch_mm_irqs_off(active_mm, mm, tsk);
> +	local_irq_enable();
>  	task_unlock(tsk);
>  #ifdef finish_arch_post_lock_switch
>  	finish_arch_post_lock_switch();
>
> ...
>