Message-ID: <CA+=Sn1n6e9pLDRVWT1EzDEn2XWwoK-Ep__-R6VVPzhbMvCEdNg@mail.gmail.com>
Date:	Thu, 10 Dec 2015 20:51:34 -0800
From:	Andrew Pinski <andrew.pinski@...iumnetworks.com>
To:	Will Deacon <will.deacon@....com>,
	Davidlohr Bueso <dbueso@...e.de>,
	"Peter Zijlstra (Intel)" <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>,
	Andrew <Andrew.Pinski@...iumnetworks.com>
Subject: Re: FW: Commit 81a43adae3b9 (locking/mutex: Use acquire/release
 semantics) causing failures on arm64 (ThunderX)

On Thu, Dec 10, 2015 at 7:29 PM, Andrew Pinski <pinskia@...il.com> wrote:
> On Thu, Dec 10, 2015 at 11:44 AM, David Daney wrote:
>>
>> Hi,
>>
>> We are getting soft lockup OOPs on Cavium CN88XX (A.K.A. ThunderX), which is an arm64 implementation.
>
> I get a slightly different oops, and after reverting
> c55a6ffa6285e29f874ed403979472631ec70bff I was able to boot.
> What I saw in osq_lock.c is that osq_wait_next is called in both the
> lock and the unlock paths, so it might need both barriers.
> The other question is: does atomic_cmpxchg_release have release
> semantics when the compare fails?  Right now it does not.
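
As a rough sketch in plain C11 atomics (not the kernel's arm64
implementation; the helper name below is made up for illustration), a
release compare-and-swap only orders the path where the compare
succeeds; on failure only a load is performed, and C11 does not even
accept a release failure ordering:

#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration of a "release" cmpxchg, for discussion only. */
static bool cmpxchg_release_example(_Atomic int *v, int old, int new)
{
	/*
	 * Success: a read-modify-write ordered as a release.
	 * Failure: just a relaxed load of *v, with no release ordering,
	 * which matches "Right now it does not" above.
	 */
	return atomic_compare_exchange_strong_explicit(
			v, &old, new,
			memory_order_release,   /* if the compare succeeds */
			memory_order_relaxed);  /* if the compare fails */
}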

So, looking further, I think I understand what is going wrong and why
c55a6ffa6285e29f874ed403979472631ec70bff is incorrect.

The compare-and-swap inside osq_lock needs both release and acquire
semantics (memory barriers), because the stores to the node need to be
visible to the other cores before lock->tail is updated.  Otherwise, if
node->next is not yet up to date, we can end up in osq_wait_next,
spinning in an infinite loop waiting on ourselves.
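
To make the ordering requirement concrete, here is a minimal sketch in
plain C11 atomics (not the kernel code; the types and names below only
loosely mirror kernel/locking/osq_lock.c): the node has to be fully
initialised before it is published as the new tail, so the exchange
needs at least release semantics, otherwise another CPU walking the
queue can read a stale node->next and wait on us forever.

#include <stdatomic.h>
#include <stddef.h>

struct spin_node {
	struct spin_node *next;	/* written with plain stores during init */
	int locked;
};

struct spin_queue {
	_Atomic(struct spin_node *) tail;
};

static void queue_enqueue(struct spin_queue *lock, struct spin_node *node)
{
	/* Initialise our node before publishing it via lock->tail. */
	node->next = NULL;
	node->locked = 0;

	/*
	 * The exchange must be at least a release so that the stores above
	 * are visible to any CPU that observes "node" as the new tail.  A
	 * relaxed (or acquire-only) exchange would let another CPU read a
	 * stale node->next and spin forever waiting on us.
	 */
	struct spin_node *prev = atomic_exchange_explicit(
			&lock->tail, node, memory_order_acq_rel);
	(void)prev;	/* real code would link prev->next = node and spin */
}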

I think we should fully revert c55a6ffa6285e29f874ed403979472631ec70bff
for the reasons mentioned above.

Thanks,
Andrew Pinski


>
> Thanks,
> Andrew
>
>
>>
>> A typical failure shows multiple threads stuck in mutex operations like
>> this:
>>
>> .
>> .
>> .
>> [   68.909873] Task dump for CPU 18:
>> [   68.909876] systemd-udevd   R  running task        0   537    534 0x00000002
>> [   68.909877] Call trace:
>> [   68.909880] [<fffffe0000088858>] dump_backtrace+0x0/0x17c
>> [   68.909883] [<fffffe00000889f8>] show_stack+0x24/0x2c
>> [   68.909885] [<fffffe00000c4210>] sched_show_task+0xb0/0x104
>> [   68.909888] [<fffffe00000c682c>] dump_cpu_task+0x48/0x54
>> [   68.909890] [<fffffe00000ee5e0>] rcu_dump_cpu_stacks+0x9c/0xec
>> [   68.909893] [<fffffe00000f2c9c>] rcu_check_callbacks+0x524/0xa18
>> [   68.909896] [<fffffe00000f83a0>] update_process_times+0x44/0x74
>> [   68.909899] [<fffffe00001078d4>] tick_sched_timer+0x78/0x1ac
>> [   68.909901] [<fffffe00000f8b74>] __hrtimer_run_queues+0x148/0x2d4
>> [   68.909903] [<fffffe00000f9464>] hrtimer_interrupt+0xb0/0x1f4
>> [   68.909906] [<fffffe000056e6e8>] arch_timer_handler_phys+0x3c/0x48
>> [   68.909909] [<fffffe00000e7fd4>] handle_percpu_devid_irq+0xb0/0x1b0
>> [   68.909912] [<fffffe00000e33c4>] generic_handle_irq+0x34/0x4c
>> [   68.909914] [<fffffe00000e3738>] __handle_domain_irq+0x90/0xfc
>> [   68.909916] [<fffffe0000081d80>] gic_handle_irq+0x90/0x18c
>> [   68.909918] Exception stack(0xfffffe03f14e3920 to 0xfffffe03f14e3a40)
>> [   68.909921] 3920: fffffe03fd5c5800 fffffe0000c55800 fffffe03f14e3a80 fffffe00000dabd8
>> [   68.909924] 3940: 00000000a0000145 0000000000000015 fffffe03e9602400 fffffe00002fddb0
>> [   68.909927] 3960: 0000000000000000 0000000000000000 fffffe03fd5c5810 fffffe03f14e0000
>> [   68.909929] 3980: 0000000000000001 ffffffffff000000 fffffe03db307e38 0000000000000000
>> [   68.909932] 39a0: 0000000000737973 00000000ffffffff 0000000000000000 000000003b364d50
>> [   68.909935] 39c0: 0000000000000018 ffffffffa99641af 0016fd71b6000000 003b9aca00000000
>> [   68.909937] 39e0: fffffe00001f1508 000003ff9b9fd028 000003ffed7a0a10 fffffe03fd5c5800
>> [   68.909940] 3a00: fffffe0000c55800 fffffe0000cea1c8 fffffe03fd5a5800 fffffe0000ca2eb0
>> [   68.909943] 3a20: 0000000000000015 fffffe03e9602400 fffffe0000cea1c8 fffffe0000712000
>> [   68.909945] [<fffffe0000084ce8>] el1_irq+0x68/0xd8
>> [   68.909948] [<fffffe00000da03c>] mutex_optimistic_spin+0x9c/0x1d0
>> [   68.909951] [<fffffe00006fe4b8>] __mutex_lock_slowpath+0x44/0x158
>> [   68.909953] [<fffffe00006fe620>] mutex_lock+0x54/0x58
>> [   68.909956] [<fffffe0000265efc>] kernfs_iop_permission+0x38/0x70
>> [   68.909959] [<fffffe00001fbf50>] __inode_permission+0x88/0xd8
>> [   68.909961] [<fffffe00001fbfd0>] inode_permission+0x30/0x6c
>> [   68.909964] [<fffffe00001fe26c>] link_path_walk+0x68/0x4d4
>> [   68.909966] [<fffffe00001ffa14>] path_openat+0xb4/0x2bc
>> [   68.909968] [<fffffe000020123c>] do_filp_open+0x74/0xd0
>> [   68.909971] [<fffffe00001f13e4>] do_sys_open+0x14c/0x228
>> [   68.909973] [<fffffe00001f1544>] SyS_openat+0x3c/0x48
>> [   68.909976] [<fffffe00000851f0>] el0_svc_naked+0x24/0x28
>> .
>> .
>> .
>>
>> Reverting 81a43adae3b9 (locking/mutex: Use acquire/release semantics) makes the problem go away.
>>
>> At this point it is unknown if this patch is incorrect, or if the underlying ARM64 atomic_*_{acquire,release} primitives are defective, or if the problem lies elsewhere.
>>
>> I am not requesting any specific action with this e-mail, but wanted to draw attention to the issue.  Undoubtedly we will be able to provide more detailed information about the issue in the coming days.
>>
>> Thanks,
>> David Daney
>>
