lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <02d6c3a0-401f-46be-bd18-6cf83e9f7c42@windriver.com>
Date:   Fri, 16 Oct 2020 11:15:27 +0800
From:   "Xu, Yanfei" <yanfei.xu@...driver.com>
To:     Hillf Danton <hdanton@...a.com>
Cc:     marcel@...tmann.org, johan.hedberg@...il.com, davem@...emloft.net,
        kuba@...nel.org, linux-bluetooth@...r.kernel.org,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Bluetooth: Use lock_sock() when acquiring lock in
 sco_conn_del



On 10/14/20 8:31 PM, Hillf Danton wrote:
> 
> On Wed, 14 Oct 2020 15:17:31 +0800
>> From: Yanfei Xu <yanfei.xu@...driver.com>
>>
>> Locking slock-AF_BLUETOOTH-BTPROTO_SCO may happen in process context or
>> BH context. If in process context, we should use lock_sock(). As blow
>> warning, sco_conn_del() is called in process context, so let's use
>> lock_sock() instead of bh_lock_sock().
>>
> Sounds opposite because blocking BH in BH context provides no extra
> protection while it makes sense in the less critical context particularly
> wrt sock lock.
> 
>> ================================
>> WARNING: inconsistent lock state
>> 5.9.0-rc4-syzkaller #0 Not tainted
>> --------------------------------
>> inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
>> syz-executor675/31233 [HC0[0]:SC0[0]:HE1:SE1] takes:
>> ffff8880a75c50a0 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.?.}-{2:2}, at:
>> spin_lock include/linux/spinlock.h:354 [inline]
>> ffff8880a75c50a0 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.?.}-{2:2}, at:
>> sco_conn_del+0x128/0x270 net/bluetooth/sco.c:176
>> {IN-SOFTIRQ-W} state was registered at:
>>    lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006
>>    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>>    _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>>    spin_lock include/linux/spinlock.h:354 [inline]
>>    sco_sock_timeout+0x24/0x140 net/bluetooth/sco.c:83
>>    call_timer_fn+0x1ac/0x760 kernel/time/timer.c:1413
>>    expire_timers kernel/time/timer.c:1458 [inline]
>>    __run_timers.part.0+0x67c/0xaa0 kernel/time/timer.c:1755
>>    __run_timers kernel/time/timer.c:1736 [inline]
>>    run_timer_softirq+0xae/0x1a0 kernel/time/timer.c:1768
>>    __do_softirq+0x1f7/0xa91 kernel/softirq.c:298
>>    asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:706
>>    __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
>>    run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
>>    do_softirq_own_stack+0x9d/0xd0 arch/x86/kernel/irq_64.c:77
>>    invoke_softirq kernel/softirq.c:393 [inline]
>>    __irq_exit_rcu kernel/softirq.c:423 [inline]
>>    irq_exit_rcu+0x235/0x280 kernel/softirq.c:435
>>    sysvec_apic_timer_interrupt+0x51/0xf0 arch/x86/kernel/apic/apic.c:1091
>>    asm_sysvec_apic_timer_interrupt+0x12/0x20
>>    arch/x86/include/asm/idtentry.h:581
>>    unwind_next_frame+0x139a/0x1f90 arch/x86/kernel/unwind_orc.c:607
>>    arch_stack_walk+0x81/0xf0 arch/x86/kernel/stacktrace.c:25
>>    stack_trace_save+0x8c/0xc0 kernel/stacktrace.c:123
>>    kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
>>    kasan_set_track mm/kasan/common.c:56 [inline]
>>    __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461
>>    slab_post_alloc_hook mm/slab.h:518 [inline]
>>    slab_alloc mm/slab.c:3312 [inline]
>>    kmem_cache_alloc+0x13a/0x3a0 mm/slab.c:3482
>>    __d_alloc+0x2a/0x950 fs/dcache.c:1709
>>    d_alloc+0x4a/0x230 fs/dcache.c:1788
>>    d_alloc_parallel+0xe9/0x18e0 fs/dcache.c:2540
>>    lookup_open.isra.0+0x9ac/0x1350 fs/namei.c:3030
>>    open_last_lookups fs/namei.c:3177 [inline]
>>    path_openat+0x96d/0x2730 fs/namei.c:3365
>>    do_filp_open+0x17e/0x3c0 fs/namei.c:3395
>>    do_sys_openat2+0x16d/0x420 fs/open.c:1168
>>    do_sys_open fs/open.c:1184 [inline]
>>    __do_sys_open fs/open.c:1192 [inline]
>>    __se_sys_open fs/open.c:1188 [inline]
>>    __x64_sys_open+0x119/0x1c0 fs/open.c:1188
>>    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> irq event stamp: 853
>> hardirqs last  enabled at (853): [<ffffffff87f733af>]
>> __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
>> hardirqs last  enabled at (853): [<ffffffff87f733af>]
>> _raw_spin_unlock_irq+0x1f/0x80 kernel/locking/spinlock.c:199
>> hardirqs last disabled at (852): [<ffffffff87f73764>]
>> __raw_spin_lock_irq include/linux/spinlock_api_smp.h:126 [inline]
>> hardirqs last disabled at (852): [<ffffffff87f73764>]
>> _raw_spin_lock_irq+0xa4/0xd0 kernel/locking/spinlock.c:167
>> softirqs last  enabled at (0): [<ffffffff8144c929>]
>> copy_process+0x1a99/0x6920 kernel/fork.c:2018
>> softirqs last disabled at (0): [<0000000000000000>] 0x0
>>
>> other info that might help us debug this:
>>   Possible unsafe locking scenario:
>>
>>         CPU0
>>         ----
>>    lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
>>    <Interrupt>
>>      lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
>>
>>   *** DEADLOCK ***
>>
>> 3 locks held by syz-executor675/31233:
>>   #0: ffff88809f104f40 (&hdev->req_lock){+.+.}-{3:3}, at:
>> hci_dev_do_close+0xf5/0x1080 net/bluetooth/hci_core.c:1720
>>   #1: ffff88809f104078 (&hdev->lock){+.+.}-{3:3}, at:
>> hci_dev_do_close+0x253/0x1080 net/bluetooth/hci_core.c:1757
>>   #2: ffffffff8a9188c8 (hci_cb_list_lock){+.+.}-{3:3}, at:
>> hci_disconn_cfm include/net/bluetooth/hci_core.h:1435 [inline]
>>   #2: ffffffff8a9188c8 (hci_cb_list_lock){+.+.}-{3:3}, at:
>> hci_conn_hash_flush+0xc7/0x220 net/bluetooth/hci_conn.c:1557
>>
>> stack backtrace:
>> CPU: 1 PID: 31233 Comm: syz-executor675 Not tainted 5.9.0-rc4-syzkaller
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>>   __dump_stack lib/dump_stack.c:77 [inline]
>>   dump_stack+0x198/0x1fd lib/dump_stack.c:118
>>   print_usage_bug kernel/locking/lockdep.c:4020 [inline]
>>   valid_state kernel/locking/lockdep.c:3361 [inline]
>>   mark_lock_irq kernel/locking/lockdep.c:3560 [inline]
>>   mark_lock.cold+0x7a/0x7f kernel/locking/lockdep.c:4006
>>   mark_usage kernel/locking/lockdep.c:3923 [inline]
>>   __lock_acquire+0x876/0x5570 kernel/locking/lockdep.c:4380
>>   lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006
>>   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>>   _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>>   spin_lock include/linux/spinlock.h:354 [inline]
>>   sco_conn_del+0x128/0x270 net/bluetooth/sco.c:176
>>   sco_disconn_cfm net/bluetooth/sco.c:1178 [inline]
>>   sco_disconn_cfm+0x62/0x80 net/bluetooth/sco.c:1171
>>   hci_disconn_cfm include/net/bluetooth/hci_core.h:1438 [inline]
>>   hci_conn_hash_flush+0x114/0x220 net/bluetooth/hci_conn.c:1557
>>   hci_dev_do_close+0x5c6/0x1080 net/bluetooth/hci_core.c:1770
>>   hci_unregister_dev+0x1bd/0xe30 net/bluetooth/hci_core.c:3790
>>   vhci_release+0x70/0xe0 drivers/bluetooth/hci_vhci.c:340
>>   __f put+0x285/0x920 fs/file_table.c:281
>>   task_work_run+0xdd/0x190 kernel/task_work.c:141
>>   exit_task_work include/linux/task_work.h:25 [inline]
>>   do_exit+0xb7d/0x29f0 kernel/exit.c:806
>>   do_group_exit+0x125/0x310 kernel/exit.c:903
>>   get_signal+0x428/0x1f00 kernel/signal.c:2757
>>   arch_do_signal+0x82/0x2520 arch/x86/kernel/signal.c:811
>>   exit_to_user_mode_loop kernel/entry/common.c:159 [inline]
>>   exit_to_user_mode_prepare+0x1ae/0x200 kernel/entry/common.c:190
>>   syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265
>>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> RIP: 0033:0x447279
>>
>> Reported-by: syzbot+65684128cd7c35bc66a1@...kaller.appspotmail.com
>> Signed-off-by: Yanfei Xu <yanfei.xu@...driver.com>
>> ---
>>   net/bluetooth/sco.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
>> index dcf7f96ff417..559b883c815f 100644
>> --- a/net/bluetooth/sco.c
>> +++ b/net/bluetooth/sco.c
>> @@ -173,10 +173,10 @@ static void sco_conn_del(struct hci_conn *hcon, int err)
>>   
>>   	if (sk) {
>>   		sock_hold(sk);
>> -		bh_lock_sock(sk);
>> +		lock_sock(sk);
>>   		sco_sock_clear_timer(sk);
>>   		sco_chan_del(sk, err);
>> -		bh_unlock_sock(sk);
>> +		release_sock(sk);
>>   		sco_sock_kill(sk);
>>   		sock_put(sk);
>>   	}
>> -- 
>> 2.18.2
> 
> 
> --- a/net/bluetooth/sco.c
> +++ b/net/bluetooth/sco.c
> @@ -80,10 +80,10 @@ static void sco_sock_timeout(struct time
>   
>   	BT_DBG("sock %p state %d", sk, sk->sk_state);
>   
> -	bh_lock_sock(sk);
> +	lock_sock(sk);
>   	sk->sk_err = ETIMEDOUT;
>   	sk->sk_state_change(sk);
> -	bh_unlock_sock(sk);
> +	unlock_sock(sk);
>   
>   	sco_sock_kill(sk);
>   	sock_put(sk);
> 
Hi Hillf,

Thanks for your reply! But I don't clearly understand what you mean.

After your change, If sco_conn_del() have got the lock and then run into 
sco_sock_timeout which is in BH, the potential deadlock is still exsit.

As the function define, use bh_lock_sock in sco_sock_timeout(BH context) 
is right. The root cause is prevent from locking in BH after we've got 
the lock in sco_conn_del, isn't it?

/* BH context may only use the following locking interface. */
#define bh_lock_sock(__sk)      spin_lock(&((__sk)->sk_lock.slock))


Regards,
Yanfei




Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ