linux-kernel - Re: [PATCH] nfc: llcp: Fix race in handling llcp


Message-ID: <188f255ca50e0e7a46e0fd139982e6ee3652bd7f.camel@redhat.com>
Date:   Thu, 01 Dec 2022 12:25:13 +0100
From:   Paolo Abeni <pabeni@...hat.com>
To:     Wang ShaoBo <bobo.shaobowang@...wei.com>
Cc:     liwei391@...wei.com, sameo@...ux.intel.com, kuba@...nel.org,
        davem@...emloft.net, syzkaller-bugs@...glegroups.com,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nfc: llcp: Fix race in handling llcp_devices

On Tue, 2022-11-29 at 17:44 +0800, Wang ShaoBo wrote:
> There are multiple paths that operate on the llcp_devices list without
> protection:
> 
>          CPU0                        CPU1
> 
> nfc_unregister_device()        nfc_register_device()
>  nfc_llcp_unregister_device()    nfc_llcp_register_device() //no lock
>     ...                            list_add(local->list, llcp_devices)
>     local_release()
>       list_del(local->list)
> 
>         CPU2
> ...
>  nfc_llcp_find_local()
>    list_for_each_entry(,&llcp_devices,)
> 
> So a race condition is reached if any two of the three occur
> simultaneously, as in the following crash report. Although there is no
> reproduction script in syzbot currently, our artificially constructed
> test cases can also reproduce it:
> 
> list_del corruption. prev->next should be ffff888060ce7000, but was ffff88802a0ad000. (prev=ffffffff8e536240)
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:59!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> CPU: 0 PID: 16622 Comm: syz-executor.5 Not tainted 6.1.0-rc6-next-20221125-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
> RIP: 0010:__list_del_entry_valid.cold+0x12/0x72 lib/list_debug.c:59
> Code: f0 ff 0f 0b 48 89 f1 48 c7 c7 60 96 a6 8a 4c 89 e6 e8 4b 29 f0 ff 0f 0b 4c 89 e1 48 89 ee 48 c7 c7 c0 98 a6 8a e8 37 29 f0 ff <0f> 0b 48 89 ee 48 c7 c7 a0 97 a6 8a e8 26 29 f0 ff 0f 0b 4c 89 e2
> RSP: 0018:ffffc900151afd58 EFLAGS: 00010282
> RAX: 000000000000006d RBX: 0000000000000001 RCX: 0000000000000000
> RDX: ffff88801e7eba80 RSI: ffffffff8166001c RDI: fffff52002a35f9d
> RBP: ffff888060ce7000 R08: 000000000000006d R09: 0000000000000000
> R10: 0000000080000000 R11: 0000000000000000 R12: ffffffff8e536240
> R13: ffff88801f3f3000 R14: ffff888060ce1000 R15: ffff888079d855f0
> FS:  0000555556f57400(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f095d5ad988 CR3: 000000002155a000 CR4: 00000000003506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  __list_del_entry include/linux/list.h:134 [inline]
>  list_del include/linux/list.h:148 [inline]
>  local_release net/nfc/llcp_core.c:171 [inline]
>  kref_put include/linux/kref.h:65 [inline]
>  nfc_llcp_local_put net/nfc/llcp_core.c:181 [inline]
>  nfc_llcp_local_put net/nfc/llcp_core.c:176 [inline]
>  nfc_llcp_unregister_device+0xb8/0x260 net/nfc/llcp_core.c:1619
>  nfc_unregister_device+0x196/0x330 net/nfc/core.c:1179
>  virtual_ncidev_close+0x52/0xb0 drivers/nfc/virtual_ncidev.c:163
>  __fput+0x27c/0xa90 fs/file_table.c:320
>  task_work_run+0x16f/0x270 kernel/task_work.c:179
>  resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
>  exit_to_user_mode_prepare+0x23c/0x250 kernel/entry/common.c:203
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>  do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> This patch adds a dedicated mutex, llcp_devices_list_lock, to ensure
> the llcp_devices list is handled safely.

Why a mutex instead of a spinlock? All the critical sections are very
small (both code- and time-wise), while the list of callers reaching
that code is quite large, making it hard to check that each of them is
really in process context.

Please switch to a spinlock instead.
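
As an illustration only, a minimal userspace sketch of that approach
might look like the following. It is not the actual patch. The
spin_lock()/spin_unlock() helpers here are C11 stand-ins for the kernel
primitives, and struct llcp_local, llcp_devices_lock, llcp_register(),
llcp_unregister() and llcp_find() are hypothetical names used for this
sketch. It shows the three list operations from the race diagram (add,
delete, lookup), each taking the same non-sleeping lock.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's spinlock_t (assumption: real code
 * would use DEFINE_SPINLOCK() with spin_lock()/spin_unlock()). */
struct spinlock { atomic_flag held; };

void spin_lock(struct spinlock *l)
{
	/* Busy-wait; never sleeps, so it does not require process context. */
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;
}

void spin_unlock(struct spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical, stripped-down stand-in for the per-device LLCP state. */
struct llcp_local {
	int dev_idx;
	struct llcp_local *prev, *next;
};

/* Circular list head, analogous to llcp_devices; an empty list points
 * at itself. */
static struct llcp_local llcp_devices = { -1, &llcp_devices, &llcp_devices };
static struct spinlock llcp_devices_lock = { ATOMIC_FLAG_INIT };

/* Register path: insert at the head of the list under the lock. */
void llcp_register(struct llcp_local *local)
{
	spin_lock(&llcp_devices_lock);
	local->next = llcp_devices.next;
	local->prev = &llcp_devices;
	llcp_devices.next->prev = local;
	llcp_devices.next = local;
	spin_unlock(&llcp_devices_lock);
}

/* Release path: unlink under the same lock. */
void llcp_unregister(struct llcp_local *local)
{
	spin_lock(&llcp_devices_lock);
	local->prev->next = local->next;
	local->next->prev = local->prev;
	local->next = local->prev = NULL;
	spin_unlock(&llcp_devices_lock);
}

/* Lookup path: walk the list under the lock. A real implementation would
 * also have to take a reference before unlocking; that is omitted here. */
struct llcp_local *llcp_find(int dev_idx)
{
	struct llcp_local *pos, *found = NULL;

	spin_lock(&llcp_devices_lock);
	for (pos = llcp_devices.next; pos != &llcp_devices; pos = pos->next) {
		if (pos->dev_idx == dev_idx) {
			found = pos;
			break;
		}
	}
	spin_unlock(&llcp_devices_lock);
	return found;
}
```

Because each critical section is only a few pointer updates or a short
list walk, holding a spinlock across it is cheap, and callers do not
need to be audited for whether they are allowed to sleep.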

Cheers,

Paolo