Date:   Tue, 15 Mar 2022 10:26:46 -0700
From:   Sathyanarayanan Kuppuswamy 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Eric Badger <ebadger@...estorage.com>
Cc:     Bjorn Helgaas <bhelgaas@...gle.com>,
        Russell Currey <ruscur@...sell.cc>,
        Oliver OHalloran <oohall@...il.com>, linux-pci@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] PCI/AER: Handle Multi UnCorrectable/Correctable errors
 properly



On 3/15/22 10:14 AM, Eric Badger wrote:
>>   # Prep injection data for a correctable error.
>>   $ cd /sys/kernel/debug/apei/einj
>>   $ echo 0x00000040 > error_type
>>   $ echo 0x4 > flags
>>   $ echo 0x891000 > param4
>>
>>   # Root Error Status is initially clear
>>   $ setpci -s <Dev ID> ECAP0001+0x30.w
>>   0000
>>
>>   # Inject one error
>>   $ echo 1 > error_inject
>>
>>   # Interrupt received
>>   pcieport <Dev ID>: AER: Root Error Status 0001
>>
>>   # Inject another error (within 5 seconds)
>>   $ echo 1 > error_inject
>>
>>   # No interrupt received, but "multiple ERR_COR" is now set
>>   $ setpci -s <Dev ID> ECAP0001+0x30.w
>>   0003
>>
>>   # Wait for a while, then clear ERR_COR. A new interrupt immediately
>>     fires.
>>   $ setpci -s <Dev ID> ECAP0001+0x30.w=0x1
>>   pcieport <Dev ID>: AER: Root Error Status 0002
>>
>> Currently, the above issue has been only reproduced in the ICL server
>> platform.
>>
>> [Eric: proposed reproducing steps]
> Hmm, this differs from the procedure I described on v1, and I don't
> think will work as described here.

I tried to modify the steps so that the issue reproduces without
returning IRQ_NONE for all cases (which would break the functionality),
but I think I did not correct the last few steps.

How about replacing the last 3 steps with the following?

  # Inject another error (within 5 seconds)
  $ echo 1 > error_inject

  # You will get a new IRQ with only the "multiple ERR_COR" bit set
  pcieport <Dev ID>: AER: Root Error Status 0002
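
For anyone reading the status values in these steps, here is a rough
sketch (not part of the patch) of how the low four bits of Root Error
Status decode. The bit positions follow the PCI_ERR_ROOT_* masks in the
kernel headers; the helper name decode_root_status and its output labels
are made up for illustration, and it expects the bare hex value as
printed by setpci or the AER log line.

```shell
#!/bin/sh
# Hypothetical helper: decode the low 4 bits of the AER Root Error Status
# register (the value read at ECAP0001+0x30 above).
# Argument: bare hex value without a 0x prefix, e.g. 0003.
decode_root_status() {
    status=$(( 0x$1 ))
    out=""
    # Bit 0: ERR_COR received
    [ $(( status & 0x1 )) -ne 0 ] && out="$out ERR_COR"
    # Bit 1: multiple ERR_COR received
    [ $(( status & 0x2 )) -ne 0 ] && out="$out MULTI_ERR_COR"
    # Bit 2: ERR_FATAL/NONFATAL received
    [ $(( status & 0x4 )) -ne 0 ] && out="$out ERR_UNCOR"
    # Bit 3: multiple ERR_FATAL/NONFATAL received
    [ $(( status & 0x8 )) -ne 0 ] && out="$out MULTI_ERR_UNCOR"
    out="${out# }"
    echo "${out:-none}"
}
```

With this, 0001 reads as ERR_COR only, 0003 as ERR_COR plus the
"multiple" bit, and 0002 as the "multiple" bit alone, which is the state
the second interrupt reports.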

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
