linux-kernel - Re: [PATCH v3] Bluetooth: hci_qca: Bug fixes while collecting controller memory dump

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <d37ca6d9414720b2355d552fa8b68629@codeaurora.org>
Date:   Fri, 14 Feb 2020 17:18:20 +0530
From:   gubbaven@...eaurora.org
To:     Stephen Boyd <swboyd@...omium.org>
Cc:     johan.hedberg@...il.com, marcel@...tmann.org, mka@...omium.org,
        linux-kernel@...r.kernel.org, linux-bluetooth@...r.kernel.org,
        robh@...nel.org, hemantg@...eaurora.org,
        linux-arm-msm@...r.kernel.org, bgodavar@...eaurora.org,
        tientzu@...omium.org, seanpaul@...omium.org, rjliao@...eaurora.org,
        yshavit@...gle.com
Subject: Re: [PATCH v3] Bluetooth: hci_qca: Bug fixes while collecting
 controller memory dump

Hi Stephen,
On 2020-02-14 07:52, Stephen Boyd wrote:
> Quoting Venkata Lakshmi Narayana Gubba (2020-02-13 07:56:04)
>> This patch will fix the below issues
>>    1.Fixed race conditions while accessing memory dump state flags.
> 
> What sort of race condition?
[Venkat]:
To avoid race condition between qca_hw_error() and 
qca_controller_memdump() while accessing memory buffer, mutex is added.
In timeout scenario, qca_hw_error() frees memory dump buffers and 
qca_controller_memdump() might still access same memory buffers.
We can avoid this situation by using mutex.
> 
>>    2.Updated with actual context of timer in hci_memdump_timeout()
> 
> What does this mean?
[Venkat]:
I will update commit text and post in next patch set.
> 
>>    3.Updated injecting hardware error event if the dumps failed to 
>> receive.
>>    4.Once timeout is triggered, stopping the memory dump collections.
>> 
>> Possible scenarios while collecting memory dump:
>> 
>> Scenario 1:
>> 
>> Memdump event from firmware
>> Some number of memdump events with seq #
>> Hw error event
>> Reset
>> 
>> Scenario 2:
>> 
>> Memdump event from firmware
>> Some number of memdump events with seq #
>> Timeout schedules hw_error_event if hw error event is not received 
>> already
>> hw_error_event clears the memdump activity
>> reset
>> 
>> Scenario 3:
>> 
>> hw_error_event sends memdump command to firmware and waits for 
>> completion
>> Some number of memdump events with seq #
>> hw error event
>> reset
>> 
>> Fixes: d841502c79e3 ("Bluetooth: hci_qca: Collect controller memory 
>> dump during SSR")
>> Reported-by: Abhishek Pandit-Subedi <abhishekpandit@...omium.org>
>> Signed-off-by: Venkata Lakshmi Narayana Gubba 
>> <gubbaven@...eaurora.org>
>> ---
> [...]
>> @@ -1449,6 +1465,23 @@ static void qca_hw_error(struct hci_dev *hdev, 
>> u8 code)
>>                 bt_dev_info(hdev, "waiting for dump to complete");
>>                 qca_wait_for_dump_collection(hdev);
>>         }
>> +
>> +       if (qca->memdump_state != QCA_MEMDUMP_COLLECTED) {
>> +               bt_dev_err(hu->hdev, "clearing allocated memory due to 
>> memdump timeout");
>> +               mutex_lock(&qca->hci_memdump_lock);
> 
> Why is a mutex needed? Are crashes happening in parallel? It would be
> nice if the commit text mentioned why the mutex is added so that the
> reader doesn't have to figure it out.
> 
[Venkat]:Explained in above answer.
>> +               if (qca_memdump)
>> +                       memdump_buf = qca_memdump->memdump_buf_head;

Regards,
Lakshmi Narayana.