[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <68ccef70-ef30-8f53-6ec5-17ce5815089c@linaro.org>
Date: Tue, 17 May 2022 17:28:51 +0200
From: Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>
To: duoming@....edu.cn
Cc: linux-kernel@...r.kernel.org, kuba@...nel.org, davem@...emloft.net,
edumazet@...gle.com, pabeni@...hat.com, gregkh@...uxfoundation.org,
alexander.deucher@....com, broonie@...nel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH net v2] NFC: hci: fix sleep in atomic context bugs in
nfc_hci_hcp_message_tx
On 17/05/2022 17:25, duoming@....edu.cn wrote:
> Hello,
>
> On Tue, 17 May 2022 13:42:41 +0200 Krzysztof wrote:
>
>> On 17/05/2022 12:55, Duoming Zhou wrote:
>>> There are sleep in atomic context bugs when the request to secure
>>> element of st21nfca is timeout. The root cause is that kzalloc and
>>> alloc_skb with GFP_KERNEL parameter and mutex_lock are called in
>>> st21nfca_se_wt_timeout which is a timer handler. The call tree shows
>>> the execution paths that could lead to bugs:
>>>
>>> (Interrupt context)
>>> st21nfca_se_wt_timeout
>>> nfc_hci_send_event
>>> nfc_hci_hcp_message_tx
>>> kzalloc(..., GFP_KERNEL) //may sleep
>>> alloc_skb(..., GFP_KERNEL) //may sleep
>>> mutex_lock() //may sleep
>>>
>>> This patch changes allocation mode of kzalloc and alloc_skb from
>>> GFP_KERNEL to GFP_ATOMIC and changes mutex_lock to spin_lock in
>>> order to prevent atomic context from sleeping.
>>>
>>> Fixes: 2130fb97fecf ("NFC: st21nfca: Adding support for secure element")
>>> Signed-off-by: Duoming Zhou <duoming@....edu.cn>
>>> ---
>>> Changes in v2:
>>> - Change mutex_lock to spin_lock.
>>>
>>> include/net/nfc/hci.h | 3 ++-
>>> net/nfc/hci/core.c | 18 +++++++++---------
>>> net/nfc/hci/hcp.c | 10 +++++-----
>>> 3 files changed, 16 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/include/net/nfc/hci.h b/include/net/nfc/hci.h
>>> index 756c11084f6..8f66e6e6b91 100644
>>> --- a/include/net/nfc/hci.h
>>> +++ b/include/net/nfc/hci.h
>>> @@ -103,7 +103,8 @@ struct nfc_hci_dev {
>>>
>>> bool shutting_down;
>>>
>>> - struct mutex msg_tx_mutex;
>>> + /* The spinlock is used to protect resources related with hci message TX */
>>> + spinlock_t msg_tx_spin;
>>>
>>> struct list_head msg_tx_queue;
>>>
>>> diff --git a/net/nfc/hci/core.c b/net/nfc/hci/core.c
>>> index ceb87db57cd..fa22f9fe5fc 100644
>>> --- a/net/nfc/hci/core.c
>>> +++ b/net/nfc/hci/core.c
>>> @@ -68,7 +68,7 @@ static void nfc_hci_msg_tx_work(struct work_struct *work)
>>> struct sk_buff *skb;
>>> int r = 0;
>>>
>>> - mutex_lock(&hdev->msg_tx_mutex);
>>> + spin_lock(&hdev->msg_tx_spin);
>>> if (hdev->shutting_down)
>>> goto exit;
>>
>> How did you test your patch?
>>
>> Did you check, really check, that this can be an atomic (non-sleeping)
>> section?
>>
>> I have doubts because I found at least one path leading to device_lock
>> (which is a mutex) called within your new code.
>
> The nfc_hci_hcp_message_tx() is called by both process context(hci_dev_up and so on)
> and interrupt context(st21nfca_se_wt_timeout()). The process context(hci_dev_up and so on)
> calls device_lock, but I think calling spin_lock() within device_lock() is ok. There is
> no device_lock() called within spin_lock().
There is.
nfc_hci_failure -> spin lock -> nfc_driver_failure -> nfc_targets_found
-> device_lock
I found it just by a very quick look, so I suspect there are several
other places, not really checked.
Best regards,
Krzysztof
Powered by blists - more mailing lists