lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 13 Jan 2016 09:20:04 +0100
From:	Hannes Reinecke <hare@...e.de>
To:	Babu Moger <babu.moger@...cle.com>,
	Bjorn Helgaas <helgaas@...nel.org>
Cc:	Alexander Duyck <alexander.duyck@...il.com>,
	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3] PCI VPD access fixes

Hi Babu,

On 01/12/2016 11:15 PM, Babu Moger wrote:
>
>
> On 1/12/2016 11:23 AM, Babu Moger wrote:
>> Hannes,
>> These patches cause some other issues. I still have to figure
>> out what is going on here. I have seen same problem before also with your
>> older patches. That is the reason I thought setting the vpd length to 0
>> is an easy solution.  Here are is what panic stack looks like. I am
>> looking at it now. Let me know if you see anything obvious here.
>>
>> megasas: 06.807.10.00-rc1
>> megasas: Waiting for FW to come to ready state
>> megasas: FW now in Ready state
>> NON-RESUMABLE ERROR: Reporting on cpu 3
>> NON-RESUMABLE ERROR: TPC [0x0000000000730990] <msix_capability_init+0x250/0x360>
>> NON-RESUMABLE ERROR: RAW [0003100000000001:0006ccae1076de96:0000000202000080:ffffffffffffffff
>> NON-RESUMABLE ERROR:      0000000800030000:0000000100000000:0000000000000000:0000000000000000]
>> NON-RESUMABLE ERROR: handle [0x0003100000000001] stick [0x0006ccae1076de96]
>> NON-RESUMABLE ERROR: type [precise nonresumable]
>> NON-RESUMABLE ERROR: attrs [0x02000080] < ASI sp-faulted priv >
>> NON-RESUMABLE ERROR: raddr [0xffffffffffffffff]
>> NON-RESUMABLE ERROR: insn effective address [0x000008510079700c]
>> NON-RESUMABLE ERROR: size [0x8]
>> NON-RESUMABLE ERROR: asi [0x00]
>> CPU: 3 PID: 325 Comm: modprobe Tainted: G            E   4.1.12-uek4-bm #5
>> task: fff8000ff37bda00 ti: fff8000ff2940000 task.ti: fff8000ff2940000
>> TSTATE: 0000000011001607 TPC: 0000000000730990 TNPC: 0000000000730994 Y: 00000000    Tainted: G            E
>> TPC: <msix_capability_init+0x250/0x360>
>> g0: 0000000000000000 g1: fff8000ff33109c0 g2: 0000000000000000 g3: 000008510079700c
>> g4: fff8000ff37bda00 g5: fff8000ffe58a000 g6: fff8000ff2940000 g7: 0000085100797000
>> o0: fff8000ffd60d000 o1: 000000000000ffff o2: 000000000000ffff o3: 000000000111e678
>> o4: 000000000000001e o5: 0000000000000000 sp: fff8000ff2942711 ret_pc: 0000000000730940
>> RPC: <msix_capability_init+0x200/0x360>
>> l0: fff8000ffd60d888 l1: fff8000ff28123ec l2: fff8000ffd60d888 l3: 00000000000080d0
>> l4: 00000000010eace0 l5: 0000000000000001 l6: 0080000000000000 l7: 8000000000000000
>> i0: fff8000ffd60d000 i1: fff8000ff28123e4 i2: 0000000000000001 i3: 0000000000000000
>> i4: 0000000000000000 i5: 0000085100797000 i6: fff8000ff29427d1 i7: 0000000000730c08
>> I7: <pci_enable_msix+0x168/0x1c0>
>> Call Trace:
>>   [0000000000730c08] pci_enable_msix+0x168/0x1c0
>>   [0000000000730c80] pci_enable_msix_range+0x20/0x60
>>   [00000000100d7554] megasas_init_fw+0x374/0xb40 [megaraid_sas]
>>   [00000000100d8c80] megasas_probe_one+0x4a0/0x9a0 [megaraid_sas]
>>   [0000000000721e74] local_pci_probe+0x34/0xa0
>>   [0000000000721f88] pci_call_probe+0xa8/0xe0
>>   [0000000000722270] pci_device_probe+0x50/0x80
>>   [000000000079b660] really_probe+0x140/0x420
>>   [000000000079b984] driver_probe_device+0x44/0xa0
>>   [000000000079ba68] __driver_attach+0x88/0xa0
>>   [000000000079986c] bus_for_each_dev+0x6c/0xa0
>>   [000000000079b1fc] driver_attach+0x1c/0x40
>>   [000000000079a2bc] bus_add_driver+0x17c/0x220
>>   [000000000079c234] driver_register+0x74/0x120
>>   [000000000072235c] __pci_register_driver+0x3c/0x60
>>   [00000000100f00ac] megasas_init+0xac/0x1dc [megaraid_sas]
>> Kernel panic - not syncing: Non-resumable error.
>> CPU: 3 PID: 325 Comm: modprobe Tainted: G            E   4.1.12-uek4-bm #5
>> Call Trace:
>>   [00000000009c8314] panic+0xb4/0x248
>>   [0000000000429638] sun4v_nonresum_error+0xb8/0xe0
>>   [000000000040744c] sun4v_nonres_mondo+0xc8/0xd8
>>   [0000000000730990] msix_capability_init+0x250/0x360
>>   [0000000000730c08] pci_enable_msix+0x168/0x1c0
>>   [0000000000730c80] pci_enable_msix_range+0x20/0x60
>>   [00000000100d7554] megasas_init_fw+0x374/0xb40 [megaraid_sas]
>>   [00000000100d8c80] megasas_probe_one+0x4a0/0x9a0 [megaraid_sas]
>>   [0000000000721e74] local_pci_probe+0x34/0xa0
>>   [0000000000721f88] pci_call_probe+0xa8/0xe0
>>   [0000000000722270] pci_device_probe+0x50/0x80
>>   [000000000079b660] really_probe+0x140/0x420
>>   [000000000079b984] driver_probe_device+0x44/0xa0
>>   [000000000079ba68] __driver_attach+0x88/0xa0
>>   [000000000079986c] bus_for_each_dev+0x6c/0xa0
>>   [000000000079b1fc] driver_attach+0x1c/0x40
>> Press Stop-A (L1-A) to return to the boot prom
>> ---[ end Kernel panic - not syncing: Non-resumable error.
>>
>>
>>
>> On 1/12/2016 8:42 AM, Hannes Reinecke wrote:
>>> Hi all,
>>>
>>> the current PCI VPD page access assumes that the entire possible VPD
>>> data is readable. However, the spec only guarantees a VPD data up to
>>> the 'end' marker, with everything beyond that being undefined.
>>> This causes a system lockup on certain devices.
>>>
>>> With this patch we calculate the actual VPD size, or set it to '0'
>>> if no valid VPD data could be read.
>>>
>>> Hannes Reinecke (3):
>>>    pci: Update VPD definitions
>>>    pci: Update VPD size with correct length
>>>    pci: set VPD size to '0' if PCI_VPD_FLAGS_VPD_REF_F0 is set
>>>
>>>   drivers/pci/access.c    | 68 ++++++++++++++++++++++++++++++++++++++++++++++++-
>>>   drivers/pci/pci-sysfs.c | 20 +++++++++------
>>>   include/linux/pci.h     | 27 ++++++++++++++++++--
>>>   3 files changed, 104 insertions(+), 11 deletions(-)
>>>
>
> Hannes, I think the logic to to get the PCI vpd length seems fine. I see that
> the function pci_vpd_pci22_size returns zero seeing the invalid tag. That looks
> good. However, this card(LSI Logic / Symbios Logic MegaRAID SAS 2108 controllers)
> locks-up when we attempt to read pci vpd area. So when the driver tries to
> initialize the card it times out and panics eventually.
>
WHAT? Which brain-dead engineer designed _that_?

> I tried to set up a new flag(PCI_DEV_FLAGS_VPD_ZERO) for this device using
>   DECLARE_PCI_FIXUP_CLASS_EARLY.
>
> Changes in pci_vpd_pci22_init
>
>   if ((dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) ||
>                  (dev->dev_flags & PCI_DEV_FLAGS_VPD_ZERO))
>                  vpd->base.len = 0;
>          else
>                  vpd->base.len = pci_vpd_pci22_size(dev);
>          return 0;
>
> Changes in driver/quirk.c
> static void quirk_set_vpd_sero(struct pci_dev *dev)
> {
>          dump_stack();
>          dev->dev_flags |= PCI_DEV_FLAGS_VPD_ZERO;
> }
>
> DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_LSI_LOGIC, 0x0060,
>                  PCI_CLASS_STORAGE_RAID, 8, quirk_set_vpd_sero);
>
> Even this did not help. I don't know sequence of these flags setting.
> When we are inside the function pci_vpd_pci22_init none of these flags
> are set up.  I need to look at these early quirks again.
>
Yes, of course it didn't.
Thing is, setting the attribute size to '0' doesn't mean there is no 
data, it just means "we don't know how much data we have".
IE you can still read from that attribute (cf patch 3/3 in that 
series) and the mentioned stall is triggered.

Actually, I've been thinking about this, too; with my current 
patchset we don't make any difference between 'no data' (eg when 
reading past the end) and 'invalid data' (eg when the card returns 
garbage or worse)

So I guess I should modify my patch to return -EINVAL if the data is 
garbage, and integrate your patchset, too.

Let me see ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@...e.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ