Message-ID: <7e2daa4d-6f96-4189-9d86-644614f895e1@redhat.com>
Date: Wed, 25 Dec 2024 11:04:43 -0500
From: Waiman Long <llong@...hat.com>
To: Bibo Mao <maobibo@...ngson.cn>, Ingo Molnar <mingo@...hat.com>,
 Will Deacon <will@...nel.org>
Cc: Boqun Feng <boqun.feng@...il.com>, linux-kernel@...r.kernel.org,
 Akira Yokosawa <akiyks@...il.com>
Subject: Re: [PATCH] locking/pvqspinlock: Use try_cmpxchg() in pv_unhash

On 12/23/24 2:47 AM, Bibo Mao wrote:
> We ported pv spinlock to an old Linux kernel on the LoongArch platform, and
Which old kernel are you using? A race condition like that should be hard
to reproduce. Did you hit this bug only once?
> there is an error under some stress tests. The error report, in short,
> looks like this:
>   kernel BUG at kernel/locking/qspinlock_paravirt.h:261!
>   Oops - BUG[#1]:
>   CPU: 1 PID: 6613 Comm: pidof Not tainted 4.19.190+ #43
>   Hardware name: Loongson KVM, BIOS 0.0.0 02/06/2015
>      ra: 9000000000509cfc do_task_stat+0x29c/0xaf0
>     ERA: 9000000000291308 __pv_queued_spin_unlock_slowpath+0xf8/0x100
>    CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
>    PRMD: 00000000 (PPLV0 -PIE -PWE)
>           ...
>   Call Trace:
>   [<9000000000291308>] __pv_queued_spin_unlock_slowpath+0xf8/0x100
>   [<9000000000509cf8>] do_task_stat+0x298/0xaf0
>   [<9000000000502570>] proc_single_show+0x60/0xe0
>
> The problem is that memory accesses can be reordered on the LoongArch
> platform, so there is contention between pv_unhash() and pv_hash():
>
> CPU0 pv_unhash:                                CPU1 pv_hash:
>
>    for_each_hash_entry(he, offset, hash) {        for_each_hash_entry(he, offset, hash) {
>      if (READ_ONCE(he->lock) == lock) {             struct qspinlock *old = NULL;
>        node = READ_ONCE(he->node);
>        WRITE_ONCE(he->lock, NULL);
>
> On the LoongArch platform, which executes memory accesses out of order, the
> two operations may be reordered like this:
>>       WRITE_ONCE(he->lock, NULL);
>                                                      if (try_cmpxchg(&he->lock, &old, lock)) {
>                                                        WRITE_ONCE(he->node, node);
>                                                        return &he->lock;
>
> CPU1's pv_hash() is executing and observes that he->lock is NULL, so it
> writes he->node with the node of the new lock.
>>       node = READ_ONCE(he->node);
> READ_ONCE(he->node) on CPU0 will then return the node of the new lock rather
> than its own node.
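
For what it's worth, the reordering described above can be written down as an
LKMM litmus test and run through herd7 to see whether the memory model even
permits that outcome. This is only a rough sketch; the test name and the
he_lock/he_node variables are made up for the illustration (1 = old lock,
0 = NULL, 2 = new lock/node):

C pv-unhash-vs-pv-hash

(*
 * P0 models the tail of pv_unhash(): read he->node, then clear he->lock.
 * P1 models pv_hash() reusing the same entry for a new lock.
 *)

{
	he_lock=1;
}

P0(int *he_lock, int *he_node)
{
	int r0;

	r0 = READ_ONCE(*he_node);
	WRITE_ONCE(*he_lock, 0);
}

P1(int *he_lock, int *he_node)
{
	int r1;

	r1 = cmpxchg(he_lock, 0, 2);
	if (r1 == 0)
		WRITE_ONCE(*he_node, 2);
}

exists (0:r0=2)

Whether that "exists" outcome is permitted is exactly the question of whether
pv_unhash() needs ordering between the he->node load and the he->lock store.
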
The pv_hash_entry structure is supposed to be cacheline aligned. The use of
READ_ONCE()/WRITE_ONCE() ensures that the compiler won't rearrange the
ordering. If the CPU can rearrange read/write ordering within the same
cacheline like that, it must be some advanced optimization technique that I
don't quite understand. Can you check whether the buffer returned by
alloc_large_system_hash() is really properly aligned?
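
Something like the following would tell us. This is only an untested debug
sketch; pv_check_hash_alignment() is a made-up name, meant to be called right
after __pv_init_lock_hash() fills in pv_lock_hash:

/*
 * Debug sketch: warn if the hash array handed back by
 * alloc_large_system_hash() does not start on a cache-line boundary,
 * since the reasoning above assumes that each pv_hash_entry stays
 * within a single cache line.
 */
static void __init pv_check_hash_alignment(void)
{
	if (!IS_ALIGNED((unsigned long)pv_lock_hash, SMP_CACHE_BYTES))
		pr_warn("pv_lock_hash %px is not %d-byte aligned\n",
			pv_lock_hash, SMP_CACHE_BYTES);
}
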
>
> Here READ_ONCE()/WRITE_ONCE() is replaced with try_cmpxchg().
>
> Signed-off-by: Bibo Mao <maobibo@...ngson.cn>
> ---
>   kernel/locking/qspinlock_paravirt.h | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
> index dc1cb90e3644..cc4eabce092d 100644
> --- a/kernel/locking/qspinlock_paravirt.h
> +++ b/kernel/locking/qspinlock_paravirt.h
> @@ -240,9 +240,10 @@ static struct pv_node *pv_unhash(struct qspinlock *lock)
>   	struct pv_node *node;
>   
>   	for_each_hash_entry(he, offset, hash) {
> -		if (READ_ONCE(he->lock) == lock) {
> +		struct qspinlock *old = lock;
> +
> +		if (try_cmpxchg(&he->lock, &old, NULL)) {
>   			node = READ_ONCE(he->node);
> -			WRITE_ONCE(he->lock, NULL);
>   			return node;
>   		}
>   	}

Anyway, as Akira suggested, this change isn't quite right.
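
If the reordering really is what you are hitting, a lighter-weight alternative
(just an untested sketch on my side) would be to keep the lookup as-is and
only turn the store that frees the slot into a release store, so that the
he->node load cannot slip past it:

	for_each_hash_entry(he, offset, hash) {
		if (READ_ONCE(he->lock) == lock) {
			node = READ_ONCE(he->node);
			/*
			 * Release ordering: the he->node load above cannot
			 * be reordered after the store that frees the hash
			 * slot, so a concurrent pv_hash() reusing this entry
			 * cannot make us return its node instead of our own.
			 */
			smp_store_release(&he->lock, NULL);
			return node;
		}
	}

That also keeps pv_unhash() free of an atomic read-modify-write.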

Cheers,
Longman

>
> base-commit: 48f506ad0b683d3e7e794efa60c5785c4fdc86fa

