Message-ID: <91cf33b4-7f67-4f3a-b095-e8f04d8c18e9@linux.alibaba.com>
Date: Fri, 24 Oct 2025 11:38:10 +0800
From: Shuai Xue <xueshuai@...ux.alibaba.com>
To: Lukas Wunner <lukas@...ner.de>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org,
 linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
 kbusch@...nel.org, sathyanarayanan.kuppuswamy@...ux.intel.com,
 mahesh@...ux.ibm.com, oohall@...il.com, Jonathan.Cameron@...wei.com,
 terry.bowman@....com, tianruidong@...ux.alibaba.com
Subject: Re: [PATCH v6 4/5] PCI/ERR: Use pcie_aer_is_native() to check for
 native AER control



On 2025/10/24 11:14, Lukas Wunner wrote:
> On Fri, Oct 24, 2025 at 11:09:25AM +0800, Shuai Xue wrote:
>> On 2025/10/23 18:29, Lukas Wunner wrote:
>>> On Mon, Oct 20, 2025 at 10:45:31PM +0800, Shuai Xue wrote:
>>>> From the PCIe spec, bits 0-2 are logged for functions supporting
>>>> Advanced Error Handling.
>>>>
>>>> I am not sure if we should clear bit 3, and also bit 6 (Emergency Power
>>>> Reduction Detected), in case of an AER error.
>>>
>>> AFAIUI, bits 0 to 3 are what the PCIe r7.0 sec 6.2.1 calls
>>> "baseline capability" error reporting.  They're supported
>>> even if AER is not supported.
>>>
>>> Bit 6 has nothing to do with this AFAICS.
>>
>> Per PCIe r7.0 section 7.5.3.5:
>>
>>    **For Functions supporting Advanced Error Handling**, errors are logged
>>    in this register regardless of the settings of the Uncorrectable Error
>>    Mask register. Default value of this bit is 0b.
>>
>> From this, it's clear that bits 0 to 2 are not logged unless AER is supported.
> 
> No.  It just means that if AER is supported, the Uncorrectable Error Mask
> register has no bearing on whether the bits in the Device Status register
> are set.  It does not mean that the bits are only set if AER is supported.
> 

Thank you for pointing that out. I now understand that my interpretation
was incorrect.

As such, I will drop this patch, which introduced the dev->aer_cap check.

The remaining question is whether it would make more sense to rename
pcie_clear_device_status() to pci_clear_device_error_status() and refine
its behavior by adding a mask specifically for bits 0 to 3. Here’s an
example of the proposed change:

-void pcie_clear_device_status(struct pci_dev *dev)
+void pci_clear_device_error_status(struct pci_dev *dev)
 {
         u16 sta;

         pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
-        pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
+        /* clear error-related bits: 0-3 */
+        pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta & 0xF);
 }

Renaming the function to pci_clear_device_error_status() better
reflects its focus on clearing error-related bits, and introducing
the mask ensures that only those bits (0-3) are cleared, rather than
writing back every bit that was read from the register. What do you
think about these changes?
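
To illustrate the effect of the mask, here is a minimal userspace
sketch, not kernel code. The four PCI_EXP_DEVSTA_* values match
include/uapi/linux/pci_regs.h; DEVSTA_ERR_MASK, rw1c_write() and
clear_error_status() are hypothetical names used only for this
example. As a simplification, the model treats every bit as
write-1-to-clear, whereas on real hardware bits 4 and 5 are
read-only.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_DEVSTA_CED   0x0001  /* Correctable Error Detected */
#define PCI_EXP_DEVSTA_NFED  0x0002  /* Non-Fatal Error Detected */
#define PCI_EXP_DEVSTA_FED   0x0004  /* Fatal Error Detected */
#define PCI_EXP_DEVSTA_URD   0x0008  /* Unsupported Request Detected */

/* Hypothetical name for the proposed mask (bits 0-3) */
#define DEVSTA_ERR_MASK (PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED | \
                         PCI_EXP_DEVSTA_FED | PCI_EXP_DEVSTA_URD)

/* The named mask is the same value as the literal 0xF in the diff */
static_assert(DEVSTA_ERR_MASK == 0xF, "mask must cover bits 0-3");

/*
 * Hypothetical model of a write-1-to-clear register: every bit
 * written as 1 is cleared, every bit written as 0 is left alone.
 */
static uint16_t rw1c_write(uint16_t reg, uint16_t val)
{
        return reg & (uint16_t)~val;
}

/* Models the proposed write of (sta & 0xF): only bits 0-3 are cleared */
static uint16_t clear_error_status(uint16_t sta)
{
        return rw1c_write(sta, sta & DEVSTA_ERR_MASK);
}
```

In this model, a register value of 0x004B (bits 0, 1, 3 and 6 set)
becomes 0x0040 with the mask, so bit 6 stays set, while writing the
unmasked value back would clear it as well.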

Thanks.
Shuai
