lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 17 May 2022 17:28:51 +0200
From:   Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>
To:     duoming@....edu.cn
Cc:     linux-kernel@...r.kernel.org, kuba@...nel.org, davem@...emloft.net,
        edumazet@...gle.com, pabeni@...hat.com, gregkh@...uxfoundation.org,
        alexander.deucher@....com, broonie@...nel.org,
        netdev@...r.kernel.org
Subject: Re: [PATCH net v2] NFC: hci: fix sleep in atomic context bugs in
 nfc_hci_hcp_message_tx

On 17/05/2022 17:25, duoming@....edu.cn wrote:
> Hello,
> 
> On Tue, 17 May 2022 13:42:41 +0200 Krzysztof wrote:
> 
>> On 17/05/2022 12:55, Duoming Zhou wrote:
>>> There are sleep in atomic context bugs when the request to secure
>>> element of st21nfca is timeout. The root cause is that kzalloc and
>>> alloc_skb with GFP_KERNEL parameter and mutex_lock are called in
>>> st21nfca_se_wt_timeout which is a timer handler. The call tree shows
>>> the execution paths that could lead to bugs:
>>>
>>>    (Interrupt context)
>>> st21nfca_se_wt_timeout
>>>   nfc_hci_send_event
>>>     nfc_hci_hcp_message_tx
>>>       kzalloc(..., GFP_KERNEL) //may sleep
>>>       alloc_skb(..., GFP_KERNEL) //may sleep
>>>       mutex_lock() //may sleep
>>>
>>> This patch changes allocation mode of kzalloc and alloc_skb from
>>> GFP_KERNEL to GFP_ATOMIC and changes mutex_lock to spin_lock in
>>> order to prevent atomic context from sleeping.
>>>
>>> Fixes: 2130fb97fecf ("NFC: st21nfca: Adding support for secure element")
>>> Signed-off-by: Duoming Zhou <duoming@....edu.cn>
>>> ---
>>> Changes in v2:
>>>   - Change mutex_lock to spin_lock.
>>>
>>>  include/net/nfc/hci.h |  3 ++-
>>>  net/nfc/hci/core.c    | 18 +++++++++---------
>>>  net/nfc/hci/hcp.c     | 10 +++++-----
>>>  3 files changed, 16 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/include/net/nfc/hci.h b/include/net/nfc/hci.h
>>> index 756c11084f6..8f66e6e6b91 100644
>>> --- a/include/net/nfc/hci.h
>>> +++ b/include/net/nfc/hci.h
>>> @@ -103,7 +103,8 @@ struct nfc_hci_dev {
>>>  
>>>  	bool shutting_down;
>>>  
>>> -	struct mutex msg_tx_mutex;
>>> +	/* The spinlock is used to protect resources related with hci message TX */
>>> +	spinlock_t msg_tx_spin;
>>>  
>>>  	struct list_head msg_tx_queue;
>>>  
>>> diff --git a/net/nfc/hci/core.c b/net/nfc/hci/core.c
>>> index ceb87db57cd..fa22f9fe5fc 100644
>>> --- a/net/nfc/hci/core.c
>>> +++ b/net/nfc/hci/core.c
>>> @@ -68,7 +68,7 @@ static void nfc_hci_msg_tx_work(struct work_struct *work)
>>>  	struct sk_buff *skb;
>>>  	int r = 0;
>>>  
>>> -	mutex_lock(&hdev->msg_tx_mutex);
>>> +	spin_lock(&hdev->msg_tx_spin);
>>>  	if (hdev->shutting_down)
>>>  		goto exit;
>>
>> How did you test your patch?
>>
>> Did you check, really check, that this can be an atomic (non-sleeping)
>> section?
>>
>> I have doubts because I found at least one path leading to device_lock
>> (which is a mutex) called within your new code.
> 
> The nfc_hci_hcp_message_tx() is called by both process context(hci_dev_up and so on)
> and interrupt context(st21nfca_se_wt_timeout()). The process context(hci_dev_up and so on)
> calls device_lock, but I think calling spin_lock() within device_lock() is ok. There is
> no device_lock() called within spin_lock(). 

There is.

nfc_hci_failure -> spin lock -> nfc_driver_failure -> nfc_targets_found
-> device_lock

I found it just by a very quick look, so I suspect there are several
other places, not really checked.

Best regards,
Krzysztof

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ