linux-kernel - Re: [PATCH net V2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister

Message-ID: <57eae113432e286b7e279102220c21fcf0bd1306.camel@redhat.com>
Date:   Tue, 26 Apr 2022 13:17:21 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     Duoming Zhou <duoming@....edu.cn>, krzysztof.kozlowski@...aro.org,
        linux-kernel@...r.kernel.org
Cc:     davem@...emloft.net, gregkh@...uxfoundation.org,
        alexander.deucher@....com, broonie@...nel.org,
        akpm@...ux-foundation.org, netdev@...r.kernel.org, linma@....edu.cn
Subject: Re: [PATCH net V2] nfc: nfcmrvl: main: reorder destructive
 operations in nfcmrvl_nci_unregister_dev to avoid bugs

On Mon, 2022-04-25 at 17:58 +0800, Duoming Zhou wrote:
> There are destructive operations such as nfcmrvl_fw_dnld_abort and
> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware,
> gpio and so on could be destructed while the upper layer functions such as
> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads
> to double-free, use-after-free and null-ptr-deref bugs.
> 
> There are three situations that could lead to double-free bugs.
> 
> The first situation is shown below:
> 
>    (Thread 1)                 |      (Thread 2)
> nfcmrvl_fw_dnld_start         |
>  ...                          |  nfcmrvl_nci_unregister_dev
>  release_firmware()           |   nfcmrvl_fw_dnld_abort
>   kfree(fw) //(1)             |    fw_dnld_over
>                               |     release_firmware
>   ...                         |      kfree(fw) //(2)
>                               |     ...
> 
> The second situation is shown below:
> 
>    (Thread 1)                 |      (Thread 2)
> nfcmrvl_fw_dnld_start         |
>  ...                          |
>  mod_timer                    |
>  (wait a time)                |
>  fw_dnld_timeout              |  nfcmrvl_nci_unregister_dev
>    fw_dnld_over               |   nfcmrvl_fw_dnld_abort
>     release_firmware          |    fw_dnld_over
>      kfree(fw) //(1)          |     release_firmware
>      ...                      |      kfree(fw) //(2)
> 
> The third situation is shown below:
> 
>        (Thread 1)               |       (Thread 2)
> nfcmrvl_nci_recv_frame          |
>  if(..->fw_download_in_progress)|
>   nfcmrvl_fw_dnld_recv_frame    |
>    queue_work                   |
>                                 |
> fw_dnld_rx_work                 | nfcmrvl_nci_unregister_dev
>  fw_dnld_over                   |  nfcmrvl_fw_dnld_abort
>   release_firmware              |   fw_dnld_over
>    kfree(fw) //(1)              |    release_firmware
>                                 |     kfree(fw) //(2)
> 
> The firmware struct is deallocated in position (1) and deallocated
> in position (2) again.
> 
> The crash trace triggered by POC is like below:
> 
> [  122.640457] BUG: KASAN: double-free or invalid-free in fw_dnld_over+0x28/0xf0
> [  122.640457] Call Trace:
> [  122.640457]  <TASK>
> [  122.640457]  kfree+0xb0/0x330
> [  122.640457]  fw_dnld_over+0x28/0xf0
> [  122.640457]  nfcmrvl_nci_unregister_dev+0x61/0x70
> [  122.640457]  nci_uart_tty_close+0x87/0xd0
> [  122.640457]  tty_ldisc_kill+0x3e/0x80
> [  122.640457]  tty_ldisc_hangup+0x1b2/0x2c0
> [  122.640457]  __tty_hangup.part.0+0x316/0x520
> [  122.640457]  tty_release+0x200/0x670
> [  122.640457]  __fput+0x110/0x410
> [  122.640457]  task_work_run+0x86/0xd0
> [  122.640457]  exit_to_user_mode_prepare+0x1aa/0x1b0
> [  122.640457]  syscall_exit_to_user_mode+0x19/0x50
> [  122.640457]  do_syscall_64+0x48/0x90
> [  122.640457]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  122.640457] RIP: 0033:0x7f68433f6beb
> 
> What's more, there are also use-after-free and null-ptr-deref bugs
> in nfcmrvl_fw_dnld_start. If we deallocate firmware struct, gpio or
> set null to the members of priv->fw_dnld in nfcmrvl_nci_unregister_dev,
> then, we dereference firmware, gpio or the members of priv->fw_dnld in
> nfcmrvl_fw_dnld_start, the UAF or NPD bugs will happen.
> 
> This patch reorders destructive operations after nci_unregister_device
> to avoid the double-free, UAF and NPD bugs, as nci_unregister_device
> is well synchronized and won't return if there is a running routine.
> This was mentioned in commit 3e3b5dfcd16a ("NFC: reorder the logic in
> nfc_{un,}register_device").

It looks like the above is not enough to close all the possible races.
Specifically, fw_dnld_timeout() and fw_dnld_rx_work() may still race
against each other.

I *think* that the approach you already suggested here:

https://lore.kernel.org/netdev/1d34425a0ea8a553a66dcf4f22ca55cc920dbb42.1649913521.git.duoming@zju.edu.cn/

should be safer, but you would have to protect every modification of
fw_dnld->fw with the same spinlock.
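
To illustrate the idea only (this is a userspace sketch, not the driver
code and not the patch linked above): every path that can finish the
download claims the firmware pointer under one lock and clears it, so only
one of them ever frees it. The pthread mutex stands in for the kernel
spinlock, and fw_dnld_sketch, fw_dnld_over_sketch and run_race_demo are
hypothetical names made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the driver's fw_dnld state. */
struct fw_dnld_sketch {
	pthread_mutex_t lock;	/* stands in for the kernel spinlock */
	void *fw;		/* stands in for fw_dnld->fw */
	int release_count;	/* number of times the buffer was really freed */
};

/*
 * Analogue of fw_dnld_over(): take ownership of the pointer under the
 * lock, clear it, and free it only after dropping the lock.
 */
static void fw_dnld_over_sketch(struct fw_dnld_sketch *d)
{
	void *fw;

	pthread_mutex_lock(&d->lock);
	fw = d->fw;
	d->fw = NULL;		/* later callers see NULL and free nothing */
	if (fw)
		d->release_count++;
	pthread_mutex_unlock(&d->lock);

	free(fw);		/* free(NULL) is a no-op */
}

static void *racer(void *arg)
{
	fw_dnld_over_sketch(arg);
	return NULL;
}

/*
 * Run three concurrent callers, standing in for the timeout handler,
 * the rx work item and the abort path. Returns the number of real frees.
 */
int run_race_demo(void)
{
	struct fw_dnld_sketch d = { .fw = NULL, .release_count = 0 };
	pthread_t t[3];
	int i, rc;

	pthread_mutex_init(&d.lock, NULL);
	d.fw = malloc(64);
	assert(d.fw != NULL);

	for (i = 0; i < 3; i++) {
		rc = pthread_create(&t[i], NULL, racer, &d);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < 3; i++)
		pthread_join(t[i], NULL);

	pthread_mutex_destroy(&d.lock);
	return d.release_count;
}
```

Whichever thread wins the lock frees the buffer; the other two find the
pointer already cleared. In the driver the same discipline would also have
to cover the places that assign fw_dnld->fw, which this sketch leaves out.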

@Lin Ma: I see you don't like the spinlock solution, but this other
option looks racy. Do you have other suggestions? (and/or would you
reconsider the spinlock?)

Thanks!

Paolo