lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0052b62b-aafe-e2eb-6d66-4ad0178bdae1@loongson.cn>
Date: Wed, 24 Jul 2024 11:09:38 +0800
From: Hongchen Zhang <zhanghongchen@...ngson.cn>
To: Ethan Zhao <haifeng.zhao@...ux.intel.com>,
 Markus Elfring <Markus.Elfring@....de>, Bjorn Helgaas <bhelgaas@...gle.com>
Cc: Alex Belits <abelits@...vell.com>,
 "Peter Zijlstra (Intel)" <peterz@...radead.org>,
 Nitesh Narayan Lal <nitesh@...hat.com>,
 Frederic Weisbecker <frederic@...nel.org>, linux-pci@...r.kernel.org,
 linux-kernel@...r.kernel.org, loongarch@...ts.linux.dev,
 stable@...r.kernel.org, Huacai Chen <chenhuacai@...ngson.cn>
Subject: Re: [PATCH v3] PCI: pci_call_probe: call local_pci_probe() when
 selected cpu is offline

Hi Ethan,

On 2024/7/24 上午10:47, Ethan Zhao wrote:
> On 7/24/2024 9:58 AM, Hongchen Zhang wrote:
>> Hi Ethan,
>> On 2024/7/22 PM 3:39, Ethan Zhao wrote:
>>>
>>> On 6/13/2024 3:42 PM, Hongchen Zhang wrote:
>>>> Call work_on_cpu(cpu, fn, arg) in pci_call_probe() while the argument
>>>> @cpu is a offline cpu would cause system stuck forever.
>>>>
>>>> This can be happen if a node is online while all its CPUs are
>>>> offline (We can use "maxcpus=1" without "nr_cpus=1" to reproduce it).
>>>>
>>>> So, in the above case, let pci_call_probe() call local_pci_probe()
>>>> instead of work_on_cpu() when the best selected cpu is offline.
>>>>
>>>> Fixes: 69a18b18699b ("PCI: Restrict probe functions to housekeeping 
>>>> CPUs")
>>>> Cc: <stable@...r.kernel.org>
>>>> Signed-off-by: Huacai Chen <chenhuacai@...ngson.cn>
>>>> Signed-off-by: Hongchen Zhang <zhanghongchen@...ngson.cn>
>>>> ---
>>>> v2 -> v3: Modify commit message according to Markus's suggestion
>>>> v1 -> v2: Add a method to reproduce the problem
>>>> ---
>>>>   drivers/pci/pci-driver.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>>>> index af2996d0d17f..32a99828e6a3 100644
>>>> --- a/drivers/pci/pci-driver.c
>>>> +++ b/drivers/pci/pci-driver.c
>>>> @@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver 
>>>> *drv, struct pci_dev *dev,
>>>>           free_cpumask_var(wq_domain_mask);
>>>>       }
>>>> -    if (cpu < nr_cpu_ids)
>>>
>>> Why not choose the right cpu to callwork_on_cpu() ? the one that is 
>>> online. Thanks, Ethan
>> Yes, let housekeeping_cpumask() return online cpu is a good idea, but 
>> it may be changed by command line. so the simplest way is to call 
>> local_pci_probe when the best selected cpu is offline.
> 
> Hmm..... housekeeping_cpumask() should never return offline CPU, so
> I guess you didn't hit issue with the CPU isolation, but the following
> code seems not good.
The issue is the dev node is online but the best selected cpu is 
offline, so it seems that there is no better way to directly set the cpu 
to nr_cpu_ids.
> 
> ...
> 
> if (node < 0 || node >= MAX_NUMNODES || !node_online(node) ||
>          pci_physfn_is_probed(dev)) {
>          cpu = nr_cpu_ids;
>      } else {
> 
> ....
> 
> perhaps you could change the logic there and fix it  ?
> 
> Thanks
> Ethan
> 
> 
> 
>>>
>>>> +    if ((cpu < nr_cpu_ids) && cpu_online(cpu))
>>>>           error = work_on_cpu(cpu, local_pci_probe, &ddi);
>>>>       else
>>>>           error = local_pci_probe(&ddi);
>>
>>


-- 
Best Regards
Hongchen Zhang


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ