lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <1599040088.z7acx6fvvf.astroid@bobo.none>
Date:   Wed, 02 Sep 2020 19:48:48 +1000
From:   Nicholas Piggin <npiggin@...il.com>
To:     linux-mm@...ck.org, Michael Ellerman <mpe@...erman.id.au>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
        Jens Axboe <axboe@...nel.dk>,
        "David S. Miller" <davem@...emloft.net>,
        linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 4/4] powerpc/64s/radix: Fix mm_cpumask trimming race vs
 kthread_use_mm

Excerpts from Michael Ellerman's message of September 1, 2020 10:00 pm:
> Nicholas Piggin <npiggin@...il.com> writes:
>> Commit 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of
>> single-threaded mm_cpumask") added a mechanism to trim the mm_cpumask of
>> a process under certain conditions. One of the assumptions is that
>> mm_users would not be incremented via a reference outside the process
>> context with mmget_not_zero() then go on to kthread_use_mm() via that
>> reference.
>>
>> That invariant was broken by io_uring code (see previous sparc64 fix),
>> but I'll point Fixes: to the original powerpc commit because we are
>> changing that assumption going forward, so this will make backports
>> match up.
>>
>> Fix this by no longer relying on that assumption, but by having each CPU
>> check the mm is not being used, and clearing their own bit from the mask
>> if it's okay. This fix relies on commit 38cf307c1f20 ("mm: fix
>> kthread_use_mm() vs TLB invalidate") to disable irqs over the mm switch,
>> and ARCH_WANT_IRQS_OFF_ACTIVATE_MM to be enabled.
> 
> You could use:
> 
> Depends-on: 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB invalidate")

Good idea I wil.

>> Fixes: 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask")
>> Signed-off-by: Nicholas Piggin <npiggin@...il.com>
>> ---
>>  arch/powerpc/include/asm/tlb.h       | 13 -------------
>>  arch/powerpc/mm/book3s64/radix_tlb.c | 23 ++++++++++++++++-------
>>  2 files changed, 16 insertions(+), 20 deletions(-)
> 
> One minor nit below if you're respinning anyway.
> 
> You know this stuff better than me, but I still reviewed it and it seems
> good to me.
> 
> Reviewed-by: Michael Ellerman <mpe@...erman.id.au>

Thanks.

> 
>> diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
>> index fbc6f3002f23..d97f061fecac 100644
>> --- a/arch/powerpc/include/asm/tlb.h
>> +++ b/arch/powerpc/include/asm/tlb.h
>> @@ -66,19 +66,6 @@ static inline int mm_is_thread_local(struct mm_struct *mm)
>>  		return false;
>>  	return cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm));
>>  }
>> -static inline void mm_reset_thread_local(struct mm_struct *mm)
>> -{
>> -	WARN_ON(atomic_read(&mm->context.copros) > 0);
>> -	/*
>> -	 * It's possible for mm_access to take a reference on mm_users to
>> -	 * access the remote mm from another thread, but it's not allowed
>> -	 * to set mm_cpumask, so mm_users may be > 1 here.
>> -	 */
>> -	WARN_ON(current->mm != mm);
>> -	atomic_set(&mm->context.active_cpus, 1);
>> -	cpumask_clear(mm_cpumask(mm));
>> -	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
>> -}
>>  #else /* CONFIG_PPC_BOOK3S_64 */
>>  static inline int mm_is_thread_local(struct mm_struct *mm)
>>  {
>> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
>> index 0d233763441f..a421a0e3f930 100644
>> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
>> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
>> @@ -645,19 +645,29 @@ static void do_exit_flush_lazy_tlb(void *arg)
>>  	struct mm_struct *mm = arg;
>>  	unsigned long pid = mm->context.id;
>>  
>> +	/*
>> +	 * A kthread could have done a mmget_not_zero() after the flushing CPU
>> +	 * checked mm_users == 1, and be in the process of kthread_use_mm when
>                                 ^
>                                 in mm_is_singlethreaded()
> 
> Adding that reference would help join the dots for a new reader I think.

Yes you're right I can change that.

Thanks,
Nick

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ