Message-ID: <d37ca6d9414720b2355d552fa8b68629@codeaurora.org>
Date: Fri, 14 Feb 2020 17:18:20 +0530
From: gubbaven@...eaurora.org
To: Stephen Boyd <swboyd@...omium.org>
Cc: johan.hedberg@...il.com, marcel@...tmann.org, mka@...omium.org,
linux-kernel@...r.kernel.org, linux-bluetooth@...r.kernel.org,
robh@...nel.org, hemantg@...eaurora.org,
linux-arm-msm@...r.kernel.org, bgodavar@...eaurora.org,
tientzu@...omium.org, seanpaul@...omium.org, rjliao@...eaurora.org,
yshavit@...gle.com
Subject: Re: [PATCH v3] Bluetooth: hci_qca: Bug fixes while collecting
controller memory dump
Hi Stephen,
On 2020-02-14 07:52, Stephen Boyd wrote:
> Quoting Venkata Lakshmi Narayana Gubba (2020-02-13 07:56:04)
>> This patch will fix the below issues
>> 1.Fixed race conditions while accessing memory dump state flags.
>
> What sort of race condition?
[Venkat]:
The race is between qca_hw_error() and qca_controller_memdump(), which
both access the memory dump buffers. In the timeout scenario,
qca_hw_error() frees the memory dump buffers while
qca_controller_memdump() might still be accessing the same buffers.
The mutex is added to serialize these two paths and avoid this situation.
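For illustration only, here is a minimal user-space sketch of the locking pattern described above. It is not the driver code: it uses pthreads instead of the kernel mutex API, and every name in it (struct memdump_ctx, memdump_start(), memdump_append(), memdump_abort()) is hypothetical. The append path stands in for qca_controller_memdump() and the abort path for the timeout handling in qca_hw_error().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical states, loosely modelled on the memdump state flags. */
enum memdump_state {
	MEMDUMP_IDLE,
	MEMDUMP_COLLECTING,
	MEMDUMP_COLLECTED,
	MEMDUMP_TIMEOUT,
};

/* Hypothetical context: one lock guards the buffer and the state. */
struct memdump_ctx {
	pthread_mutex_t lock;
	enum memdump_state state;
	unsigned char *buf;
	size_t size;
	size_t received;
};

/* Allocate the dump buffer and enter the collecting state. */
int memdump_start(struct memdump_ctx *ctx, size_t size)
{
	assert(ctx != NULL && size > 0);
	if (pthread_mutex_init(&ctx->lock, NULL) != 0)
		return -1;
	ctx->buf = malloc(size);
	if (!ctx->buf)
		return -1;
	ctx->size = size;
	ctx->received = 0;
	ctx->state = MEMDUMP_COLLECTING;
	return 0;
}

/* Collector path: copy a chunk only while the buffer is still valid. */
int memdump_append(struct memdump_ctx *ctx, const void *data, size_t len)
{
	int ret = -1;

	pthread_mutex_lock(&ctx->lock);
	if (ctx->buf && ctx->state == MEMDUMP_COLLECTING &&
	    len <= ctx->size - ctx->received) {
		memcpy(ctx->buf + ctx->received, data, len);
		ctx->received += len;
		if (ctx->received == ctx->size)
			ctx->state = MEMDUMP_COLLECTED;
		ret = 0;
	}
	pthread_mutex_unlock(&ctx->lock);
	return ret;
}

/* Error/timeout path: free the buffer unless collection has completed. */
void memdump_abort(struct memdump_ctx *ctx)
{
	pthread_mutex_lock(&ctx->lock);
	if (ctx->state != MEMDUMP_COLLECTED) {
		free(ctx->buf);
		ctx->buf = NULL;
		ctx->received = 0;
		ctx->state = MEMDUMP_TIMEOUT;
	}
	pthread_mutex_unlock(&ctx->lock);
}
```

Because both paths take the same lock and the collector rechecks the buffer pointer and state after acquiring it, a chunk that arrives after the abort is rejected instead of being written into freed memory.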
>
>> 2.Updated with actual context of timer in hci_memdump_timeout()
>
> What does this mean?
[Venkat]:
I will update the commit text and post it in the next patch set.
>
>> 3.Updated injecting hardware error event if the dumps failed to
>> receive.
>> 4.Once timeout is triggered, stopping the memory dump collections.
>>
>> Possible scenarios while collecting memory dump:
>>
>> Scenario 1:
>>
>> Memdump event from firmware
>> Some number of memdump events with seq #
>> Hw error event
>> Reset
>>
>> Scenario 2:
>>
>> Memdump event from firmware
>> Some number of memdump events with seq #
>> Timeout schedules hw_error_event if hw error event is not received
>> already
>> hw_error_event clears the memdump activity
>> reset
>>
>> Scenario 3:
>>
>> hw_error_event sends memdump command to firmware and waits for
>> completion
>> Some number of memdump events with seq #
>> hw error event
>> reset
>>
>> Fixes: d841502c79e3 ("Bluetooth: hci_qca: Collect controller memory
>> dump during SSR")
>> Reported-by: Abhishek Pandit-Subedi <abhishekpandit@...omium.org>
>> Signed-off-by: Venkata Lakshmi Narayana Gubba
>> <gubbaven@...eaurora.org>
>> ---
> [...]
>> @@ -1449,6 +1465,23 @@ static void qca_hw_error(struct hci_dev *hdev,
>> u8 code)
>> bt_dev_info(hdev, "waiting for dump to complete");
>> qca_wait_for_dump_collection(hdev);
>> }
>> +
>> + if (qca->memdump_state != QCA_MEMDUMP_COLLECTED) {
>> + bt_dev_err(hu->hdev, "clearing allocated memory due to
>> memdump timeout");
>> + mutex_lock(&qca->hci_memdump_lock);
>
> Why is a mutex needed? Are crashes happening in parallel? It would be
> nice if the commit text mentioned why the mutex is added so that the
> reader doesn't have to figure it out.
>
[Venkat]: Explained in my answer above.
>> + if (qca_memdump)
>> + memdump_buf = qca_memdump->memdump_buf_head;
Regards,
Lakshmi Narayana.