Message-ID: <a63012d4-0c98-4022-8183-5a3488ca66e9@csgroup.eu>
Date: Thu, 2 Oct 2025 12:06:37 +0200
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@...ux.intel.com>,
Breno Leitao <leitao@...ian.org>, Mahesh J Salgaonkar
<mahesh@...ux.ibm.com>, Oliver O'Halloran <oohall@...il.com>,
Bjorn Helgaas <bhelgaas@...gle.com>, Jon Pan-Doh <pandoh@...gle.com>
Cc: linuxppc-dev@...ts.ozlabs.org, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, kernel-team@...a.com, stable@...r.kernel.org
Subject: Re: [PATCH RESEND] PCI/AER: Check for NULL aer_info before
ratelimiting in pci_print_aer()
On 29/09/2025 at 17:10, Sathyanarayanan Kuppuswamy wrote:
>
> On 9/29/25 2:15 AM, Breno Leitao wrote:
>> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
>> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
>> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
>> does not rate limit, given this is fatal.
>>
>> This prevents a kernel crash triggered by dereferencing a NULL pointer
>> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
>> AER info. This change aligns pci_print_aer() with
>> pci_dev_aer_stats_incr(), which already performs this NULL check.
>>
>> Cc: stable@...r.kernel.org
>> Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
>> Signed-off-by: Breno Leitao <leitao@...ian.org>
>> ---
>> - This problem is still happening upstream, and unfortunately no
>> action was taken in the previous discussion.
>> - Link to previous post:
>> https://lore.kernel.org/r/20250804-aer_crash_2-v1-1-fd06562c18a4@debian.org
>> ---
>
> Although we haven't identified the path that triggers this issue, adding
> this check is harmless.
Is it really harmless?

The purpose of the function is to ratelimit logs. By returning 1 when
dev->aer_info is NULL, it says: don't ratelimit. Doesn't that open the
door to a denial of service by flooding the logs?
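
To make the concern concrete, here is a rough userspace sketch of one
possible alternative: falling back to a shared ratelimit state when
aer_info is NULL instead of never ratelimiting. This is not kernel code
and not a proposed patch. Every sketch_* name is hypothetical, the
ratelimiter is a simplified stand-in for __ratelimit(), and the interval
and burst values are only illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a ratelimit state: at most `burst` events
 * per `interval` ticks. Hypothetical, not the kernel structure. */
struct sketch_ratelimit {
	unsigned long interval;
	unsigned int burst;
	unsigned long window_start;
	unsigned int printed;
};

/* Hypothetical per-device AER info holding the two ratelimit states. */
struct sketch_aer_info {
	struct sketch_ratelimit correctable_ratelimit;
	struct sketch_ratelimit nonfatal_ratelimit;
};

struct sketch_dev {
	struct sketch_aer_info *aer_info;	/* may be NULL */
};

enum sketch_severity { SKETCH_CORRECTABLE, SKETCH_NONFATAL, SKETCH_FATAL };

/* Returns true if the event may be logged, false if it is suppressed. */
static bool sketch_ratelimit_ok(struct sketch_ratelimit *rs, unsigned long now)
{
	if (now - rs->window_start >= rs->interval) {
		/* A new window starts: reset the counter. */
		rs->window_start = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/* Assumption: one shared fallback state for devices without aer_info. */
static struct sketch_ratelimit sketch_fallback = { .interval = 5, .burst = 10 };

/* Returns 1 to log the error, 0 to suppress it. */
static int sketch_aer_ratelimit(struct sketch_dev *dev,
				enum sketch_severity severity,
				unsigned long now)
{
	if (severity == SKETCH_FATAL)
		return 1;	/* fatal errors are never ratelimited */

	if (!dev->aer_info)	/* no per-device state: use the fallback */
		return sketch_ratelimit_ok(&sketch_fallback, now);

	if (severity == SKETCH_NONFATAL)
		return sketch_ratelimit_ok(&dev->aer_info->nonfatal_ratelimit, now);

	return sketch_ratelimit_ok(&dev->aer_info->correctable_ratelimit, now);
}
```

With this shape, a device without aer_info still avoids the NULL
dereference, but a burst of non-fatal errors is capped instead of being
logged without limit. Whether a shared fallback is acceptable depends on
which path leaves aer_info NULL, which is not yet identified.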
Christophe
>
> Reviewed-by: Kuppuswamy Sathyanarayanan
> <sathyanarayanan.kuppuswamy@...ux.intel.com>
>
>
>
>> drivers/pci/pcie/aer.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index e286c197d7167..55abc5e17b8b1 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
>>  static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
>>  {
>> +	if (!dev->aer_info)
>> +		return 1;
>> +
>>  	switch (severity) {
>>  	case AER_NONFATAL:
>>  		return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
>>
>> ---
>> base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
>> change-id: 20250801-aer_crash_2-b21cc2ef0d00
>>
>> Best regards,
>> --
>> Breno Leitao <leitao@...ian.org>
>>