Date:   Tue, 4 Apr 2023 15:06:09 +0800
From:   "yebin (H)" <yebin10@...wei.com>
To:     Yury Norov <yury.norov@...il.com>, Ye Bin <yebin@...weicloud.com>
CC:     <dennis@...nel.org>, <tj@...nel.org>, <cl@...ux.com>,
        <linux-mm@...ck.org>, <andriy.shevchenko@...ux.intel.com>,
        <linux@...musvillemoes.dk>, <linux-kernel@...r.kernel.org>,
        <dchinner@...hat.com>
Subject: Re: [PATCH 2/2] lib/percpu_counter: fix dying cpu compare race



On 2023/4/4 10:50, Yury Norov wrote:
> On Tue, Apr 04, 2023 at 09:42:06AM +0800, Ye Bin wrote:
>> From: Ye Bin <yebin10@...wei.com>
>>
>> In commit 8b57b11cca88 ("pcpcntrs: fix dying cpu summation race") a race
>> condition between a cpu dying and percpu_counter_sum() iterating online CPUs
>> was identified.
>> Actually, the same race exists between a CPU dying and
>> __percpu_counter_compare(), which uses num_online_cpus() for its quick check.
>> num_online_cpus() is decreased before percpu_counter_cpu_dead() is called,
>> so the quick check may return an incorrect result.
>> To fix this, also include the dying CPUs in the count used by the quick check
>> in __percpu_counter_compare().
> Not sure I completely understood the race you are describing. All CPU
> accounting is protected with percpu_counters_lock. Is it a real race
> that you've faced, or hypothetical? If it's real, can you share stack
> traces?
>   
>> Signed-off-by: Ye Bin <yebin10@...wei.com>
>> ---
>>   lib/percpu_counter.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
>> index 5004463c4f9f..399840cb0012 100644
>> --- a/lib/percpu_counter.c
>> +++ b/lib/percpu_counter.c
>> @@ -227,6 +227,15 @@ static int percpu_counter_cpu_dead(unsigned int cpu)
>>   	return 0;
>>   }
>>   
>> +static __always_inline unsigned int num_count_cpus(void)
> This doesn't look like a good name. Maybe num_offline_cpus?
num_count_cpus() counts both online and dying CPUs, so
num_offline_cpus() doesn't seem appropriate either.
>
>> +{
>> +#ifdef CONFIG_HOTPLUG_CPU
Perhaps we also need memory barriers when setting and reading
__num_dying_cpu.

>> +	return (num_online_cpus() + num_dying_cpus());

>                 ^                                    ^
>           'return' is not a function. Parentheses are not needed
>
> Generally speaking, a sequence of atomic operations is not an atomic
> operation, so the above doesn't look correct. I don't think that it
> would be possible to implement raceless accounting based on 2 separate
> counters.
>
> Most probably, you'd have to use the same approach as in 8b57b11cca88:
>
>          lock();
>          for_each_cpu_or(cpu, cpu_online_mask, cpu_dying_mask)
>                  cnt++;
>          unlock();
>
> And if so, I'd suggest implementing cpumask_weight_or() for that.
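
To check my understanding of the suggestion, here is a rough user-space
sketch of what a weight-of-OR helper would compute. It is only a model:
the 64-bit mask type and the model_* names are made up for illustration,
it is not kernel code, and it ignores the locking shown above.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t model_cpumask_t;

/*
 * Count the CPUs set in either mask. The two masks are OR-ed first,
 * so a CPU that is set in both is counted only once.
 */
static unsigned int model_cpumask_weight_or(model_cpumask_t a,
					    model_cpumask_t b)
{
	model_cpumask_t m = a | b;
	unsigned int n = 0;

	/* Clear the lowest set bit on each iteration. */
	while (m) {
		m &= m - 1;
		n++;
	}
	return n;
}
```

For example, with CPUs 0-2 in the first mask and CPU 3 in the second, the
result is 4, and a CPU that appears in both masks does not inflate it.
Reading both masks in one pass under the lock also avoids adding two
counters that were sampled at different times.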
>
>> +#else
>> +	return num_online_cpus();
>> +#endif
>> +}
>> +
>>   /*
>>    * Compare counter against given value.
>>    * Return 1 if greater, 0 if equal and -1 if less
>> @@ -237,7 +246,7 @@ int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch)
>>   
>>   	count = percpu_counter_read(fbc);
>>   	/* Check to see if rough count will be sufficient for comparison */
>> -	if (abs(count - rhs) > (batch * num_online_cpus())) {
>> +	if (abs(count - rhs) > (batch * num_count_cpus())) {
>>   		if (count > rhs)
>>   			return 1;
>>   		else
>> -- 
>> 2.31.1
> .
>
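
To make the concern concrete, below is a small user-space model of the
quick check only. The model_* names and the MODEL_NEED_EXACT_SUM value
are invented for this illustration, and the exact-sum slow path is not
modeled at all, so please read it as a sketch rather than the real code.

```c
#include <stdint.h>

/* Hypothetical marker: the rough count cannot decide the comparison. */
#define MODEL_NEED_EXACT_SUM 2

/*
 * Model of the quick check in __percpu_counter_compare(): if the
 * approximate count differs from rhs by more than batch * ncpus,
 * the sign of the difference is returned right away.
 */
static int model_rough_compare(int64_t count, int64_t rhs, int32_t batch,
			       unsigned int ncpus)
{
	int64_t diff = count > rhs ? count - rhs : rhs - count;

	if (diff > (int64_t)batch * ncpus)
		return count > rhs ? 1 : -1;
	return MODEL_NEED_EXACT_SUM;
}
```

With count = 1000, rhs = 1100 and batch = 32, four CPUs give a threshold
of 128, so the model asks for the exact sum. If one of the four is dying
and is no longer counted, the threshold drops to 96 and the model returns
-1 immediately, even though the dying CPU's local delta may not have been
folded into the global count yet.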
