lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Tue, 25 Aug 2020 10:06:40 +0800 From: Boqun Feng <boqun.feng@...il.com> To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com> Cc: Peter Zijlstra <peterz@...radead.org>, linux-kernel <linux-kernel@...r.kernel.org>, Will Deacon <will@...nel.org>, paulmck <paulmck@...nel.org>, Andy Lutomirski <luto@...capital.net>, Andrew Morton <akpm@...ux-foundation.org>, Alan Stern <stern@...land.harvard.edu>, Nicholas Piggin <npiggin@...il.com> Subject: Re: [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2) On Mon, Aug 24, 2020 at 11:27:49AM -0400, Mathieu Desnoyers wrote: > ----- On Aug 16, 2020, at 11:29 AM, Boqun Feng boqun.feng@...il.com wrote: > > > On Fri, Aug 14, 2020 at 12:43:57PM -0400, Mathieu Desnoyers wrote: > >> Add comments and memory barrier to kthread_use_mm and kthread_unuse_mm > >> to allow the effect of membarrier(2) to apply to kthreads accessing > >> user-space memory as well. > >> > >> Given that no prior kthread use this guarantee and that it only affects > >> kthreads, adding this guarantee does not affect user-space ABI. > >> > >> Refine the check in membarrier_global_expedited to exclude runqueues > >> running the idle thread rather than all kthreads from the IPI cpumask. > >> > >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...icios.com> > >> Cc: Peter Zijlstra (Intel) <peterz@...radead.org> > >> Cc: Will Deacon <will@...nel.org> > >> Cc: Paul E. McKenney <paulmck@...nel.org> > >> Cc: Nicholas Piggin <npiggin@...il.com> > >> Cc: Andy Lutomirski <luto@...capital.net> > >> Cc: Andrew Morton <akpm@...ux-foundation.org> > >> --- > >> Changes since v1: > >> - Add WARN_ON_ONCE(current->mm) in play_idle_precise (PeterZ), > >> - Use smp_mb__after_spinlock rather than smp_mb after task_lock. > >> --- > >> kernel/kthread.c | 19 +++++++++++++++++++ > >> kernel/sched/idle.c | 1 + > >> kernel/sched/membarrier.c | 8 ++------ > >> 3 files changed, 22 insertions(+), 6 deletions(-) > >> > >> diff --git a/kernel/kthread.c b/kernel/kthread.c > >> index 3edaa380dc7b..77aaaa7bc8d9 100644 > >> --- a/kernel/kthread.c > >> +++ b/kernel/kthread.c > >> @@ -1255,8 +1255,19 @@ void kthread_use_mm(struct mm_struct *mm) > >> finish_arch_post_lock_switch(); > >> #endif > >> > >> + /* > >> + * When a kthread starts operating on an address space, the loop > >> + * in membarrier_{private,global}_expedited() may not observe > >> + * that tsk->mm, and not issue an IPI. Membarrier requires a > >> + * memory barrier after storing to tsk->mm, before accessing > >> + * user-space memory. A full memory barrier for membarrier > >> + * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by > >> + * mmdrop(), or explicitly with smp_mb(). > >> + */ > >> if (active_mm != mm) > >> mmdrop(active_mm); > >> + else > >> + smp_mb(); > > > > Similar question here: could smp_mb() guarantee the correctness of > > GLOBAL_EXPEDITED? Don't you need membarrier_switch_mm() here and in > > kthread_unuse_mm(), too? > > > > Am I miss something here? > > I think you have a good point there. Which brings me to wonder why we > don't have membarrier_switch_mm() when entering/leaving lazy TLB mode. > This means an IPI can be sent to a kthread even if it does not use an > mm, just because the membarrier state in the runqueue is that of the > active_mm. > > Thoughts ? > Right, I think we should also handle the percpu membarrier_state. The basic rule is whenever we change current->mm or current (i.e. rq->curr) itself, we need to update the percpu membarrier_state accordingly. Regards, Boqun > Thanks, > > Mathieu > > > > > Regards, > > Boqun > > > >> > >> to_kthread(tsk)->oldfs = force_uaccess_begin(); > >> } > >> @@ -1276,6 +1287,14 @@ void kthread_unuse_mm(struct mm_struct *mm) > >> force_uaccess_end(to_kthread(tsk)->oldfs); > >> > >> task_lock(tsk); > >> + /* > >> + * When a kthread stops operating on an address space, the loop > >> + * in membarrier_{private,global}_expedited() may not observe > >> + * that tsk->mm, and not issue an IPI. Membarrier requires a > >> + * memory barrier after accessing user-space memory, before > >> + * clearing tsk->mm. > >> + */ > >> + smp_mb__after_spinlock(); > >> sync_mm_rss(mm); > >> local_irq_disable(); > >> tsk->mm = NULL; > >> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c > >> index 6bf34986f45c..3443ee8335d0 100644 > >> --- a/kernel/sched/idle.c > >> +++ b/kernel/sched/idle.c > >> @@ -341,6 +341,7 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns) > >> WARN_ON_ONCE(!(current->flags & PF_KTHREAD)); > >> WARN_ON_ONCE(!(current->flags & PF_NO_SETAFFINITY)); > >> WARN_ON_ONCE(!duration_ns); > >> + WARN_ON_ONCE(current->mm); > >> > >> rcu_sleep_check(); > >> preempt_disable(); > >> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c > >> index 168479a7d61b..8a294483074d 100644 > >> --- a/kernel/sched/membarrier.c > >> +++ b/kernel/sched/membarrier.c > >> @@ -100,13 +100,9 @@ static int membarrier_global_expedited(void) > >> MEMBARRIER_STATE_GLOBAL_EXPEDITED)) > >> continue; > >> > >> - /* > >> - * Skip the CPU if it runs a kernel thread. The scheduler > >> - * leaves the prior task mm in place as an optimization when > >> - * scheduling a kthread. > >> - */ > >> + /* Skip the CPU if it runs the idle thread. */ > >> p = rcu_dereference(cpu_rq(cpu)->curr); > >> - if (p->flags & PF_KTHREAD) > >> + if (is_idle_task(p)) > >> continue; > >> > >> __cpumask_set_cpu(cpu, tmpmask); > >> -- > >> 2.11.0 > > -- > Mathieu Desnoyers > EfficiOS Inc. > http://www.efficios.com
Powered by blists - more mailing lists