lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3eca0594-ebe4-4b34-8463-555136199dc1@gmail.com>
Date: Thu, 7 Aug 2025 08:46:10 +0800
From: Ethan Zhao <etzhao1900@...il.com>
To: Breno Leitao <leitao@...ian.org>
Cc: Mahesh J Salgaonkar <mahesh@...ux.ibm.com>,
 Oliver O'Halloran <oohall@...il.com>, Bjorn Helgaas <bhelgaas@...gle.com>,
 Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>,
 Jon Pan-Doh <pandoh@...gle.com>, linuxppc-dev@...ts.ozlabs.org,
 linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH] PCI/AER: Check for NULL aer_info before ratelimiting in
 pci_print_aer()



On 8/6/2025 4:45 PM, Breno Leitao wrote:
> Hello Ethan,
> 
> On Wed, Aug 06, 2025 at 09:55:05AM +0800, Ethan Zhao wrote:
>> On 8/4/2025 5:17 PM, Breno Leitao wrote:
>>> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
>>> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
>>> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
>>> does not rate limit, given this is fatal.
>>>
>>> This prevents a kernel crash triggered by dereferencing a NULL pointer
>>> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
>>> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
>>> which already performs this NULL check.
>>>
>> The enqueue side has lock to protect the ring, but the dequeue side no lock
>> held.
>>
>> The kfifo_get in
>> static void aer_recover_work_func(struct work_struct *work)
>> {
>> ...
>> while (kfifo_get(&aer_recover_ring, &entry)) {
>> ...
>> }
>> should be replaced by
>> kfifo_out_spinlocked()
> 
> The design seems not to need the lock on the reader side. There is just
> one reader, which is the aer_recover_work. aer_recover_work runs
> aer_recover_work_func(). So, if we just have one reader, we do not need
> to protect the kfifo by spinlock, right?

Not exactly,
If the writer and reader are serialized, no lock is needed. However, 
here the writer kfifo_in_spinlocked() and the system work queue task 
aer_recover_work() cannot guarantee serialized execution.

@Bjorn, help to check it out.


Thanks,
Ethan>
> In fact, the code documents it in the aer_recover_ring_lock.
> 
> 	/*
> 	* Mutual exclusion for writers of aer_recover_ring, reader side don't
> 	* need lock, because there is only one reader and lock is not needed
> 	* between reader and writer.
> 	*/
> 	static DEFINE_SPINLOCK(aer_recover_ring_lock);


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ