[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <dadf1e3b-7fa5-a218-4695-f617f639d878@huawei.com>
Date: Thu, 12 Oct 2023 20:38:46 +0800
From: Yicong Yang <yangyicong@...wei.com>
To: Robin Murphy <robin.murphy@....com>,
Jijie Shao <shaojijie@...wei.com>, <will@...nel.org>,
<jonathan.cameron@...wei.com>, <mark.rutland@....com>,
<yangyicong@...ilicon.com>
CC: <chenhao418@...wei.com>, <shenjian15@...wei.com>,
<wangjie125@...wei.com>, <liuyonglong@...wei.com>,
<linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH drivers/perf: hisi:] drivers/perf: hisi: fix NULL pointer
issue when uninstall hns3 pmu driver
Hi Robin,
On 2023/10/12 0:03, Robin Murphy wrote:
> On 11/10/2023 9:37 am, Jijie Shao wrote:
>>
>> on 2023/10/10 17:32, Yicong Yang wrote:
>>> Hi Jijie,
>>>
>>> On 2023/10/9 18:50, Jijie Shao wrote:
>>>> From: Hao Chen <chenhao418@...wei.com>
>>>>
>>>> When uninstall hns3 pmu driver, it will call cpuhp_state_remove_instance()
>>>> and then callback function hns3_pmu_offline_cpu() is called, it may cause
>>>> NULL pointer call trace when other driver is installing or uninstalling
>>>> concurrently.
>>>>
>>> More information about the calltrace you've met and how to reproduce this?
>>> I'm not sure why other drivers are involved.
>>>
>>>> As John Garry's opinion, cpuhp_state_remove_instance() is used for shared
>>>> interrupt, and using cpuhp_state_remove_instance_nocalls() is fine for PCIe
>>>> or HNS3 pmu.
>>>>
>>> I'm a bit confused here. We need to update the using CPU and migrate the perf
>>> context as well as the interrupt affinity in cpuhp::teardown() callback, so
>>> it make sense to not call this on driver detachment. But I cannot figure
>>> out why this is related to the shared interrupt, more details?
>>>
>> ok,I will send v2 to add more details.
>
> This shouldn't have anything to do with concurrency or shared interrupts or anything else. It's simply that we should clearly not attempt to migrate a PMU context (via invoking the hotplug callbacks) *after* the relevant PMU has already been unregistered, since that's liable to lead to some kind of use-after-free, and at best it's just a pointless waste of time anyway - if we've got to the point of unbinding the driver (or failing to probe at all), there should definitely not be any active events or other PMU state that needs updating.
>
Thanks for the clarification. I think this is the root reason
of the problem met on the hns3 pmu.
Thanks.
Powered by blists - more mailing lists