Date:	Mon, 12 Jan 2015 16:20:51 +0100
From:	Andreas Hartmann <andihartmann@...enet.de>
To:	Alex Williamson <alex.williamson@...hat.com>,
	Bjorn Helgaas <bhelgaas@...gle.com>
CC:	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/4] PCI: quirk Atheros AR93xx to avoid bus reset

Alex Williamson wrote:
> On Thu, 2015-01-08 at 09:07 -0700, Bjorn Helgaas wrote:
>> On Fri, Nov 21, 2014 at 11:24:27AM -0700, Alex Williamson wrote:
>>> Reports against the TL-WDN4800 card indicate that PCI bus reset of
>>> this Atheros device cause system lock-ups and resets.  I've also
>>> been able to confirm this behavior on multiple systems.  The device
>>> never returns from reset and attempts to access config space of the
>>> device after reset result in hangs.  Blacklist bus reset for the
>>> device to avoid this issue.
>>>
>>> Reported-by: Andreas Hartmann <andihartmann@...enet.de>
>>> Signed-off-by: Alex Williamson <alex.williamson@...hat.com>
>>> Tested-by: Andreas Hartmann <andihartmann@...enet.de>
>>
>> If I understand correctly, these two (patches 3 & 4) fix a v3.14 regression
>> caused by 425c1b223dac ("PCI: Add Virtual Channel to save/restore support").
>>
>> If so, these should go to for-linus for v3.19.  What about patches 1 & 2?
>> Do they fix a regression?  Is there a pointer to a bugzilla or problem
>> report about that issue?
>>
>> I don't understand the connection between 425c1b223dac and
>> PCI_DEV_FLAGS_NO_BUS_RESET, because 425c1b223dac doesn't seem to do any
>> resets.  Is that the wrong commit, or can you outline the connection for
>> me?
> 
> TBH, I don't have a lot of faith in associating this to 425c1b223dac,
> I'm not sure how Andreas' bisect landed there. 

Because reverting that commit made it work again :-)

See also:
http://thread.gmane.org/gmane.linux.kernel.pci/35170/focus=35984

Kernels 3.10, 3.12 and 3.13 worked fine for me. 3.14 is the first
kernel that hangs the machine at startup of the VM. The userland
(qemu) didn't change in between.

Therefore, from my point of view, it is a regression, because things
worked before 3.14.

Besides that: there is no doubt that resetting this card is a problem.
But the difference between >= 3.14 and < 3.14 is that < 3.14 worked
nevertheless. The patch 425c1b223dac456d00a61fd6b451b6d1cf00d065
obviously changed something, though I can't say what. Therefore, the
quirk patch is definitely required, because things work completely fine
again w/ this patch.

"Working" means for me here: I was able to start (and use) the VM w/o
crashing the machine, and this is no longer possible w/ unpatched 3.14+.
Yes, w/ 3.12 I wasn't able to restart the VM (it then crashed the
machine), but w/ 3.10 even this was possible.
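
To illustrate what the quirk changes in practice, here is a minimal
userspace sketch (not kernel code). It assumes the reset path refuses a
bus reset when the device carries the no-bus-reset flag, and that the
caller then falls back to a PM reset. All model_* / MODEL_* / RESET_*
names and the -ENOTTY return value are assumptions made for
illustration, and the "prefer bus reset" switch only models Alex's
hypothesis quoted below.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace model only: these are NOT the kernel's definitions. */
enum model_dev_flags {
	MODEL_NO_PM_RESET  = 1 << 0,	/* stands in for a no-PM-reset quirk flag */
	MODEL_NO_BUS_RESET = 1 << 1,	/* stands in for PCI_DEV_FLAGS_NO_BUS_RESET */
};

enum model_reset_method { RESET_NONE, RESET_PM, RESET_BUS };

struct model_pci_dev {
	unsigned short vendor;
	unsigned short device;
	unsigned int dev_flags;
};

/* Assumed behaviour: a flagged device is never bus-reset; report "unsupported". */
static int model_bus_reset(const struct model_pci_dev *dev)
{
	assert(dev != NULL);
	if (dev->dev_flags & MODEL_NO_BUS_RESET)
		return -ENOTTY;
	return 0;	/* pretend the reset was issued */
}

/*
 * Pick a reset method. With prefer_bus set (the hypothesised newer userspace
 * behaviour) a bus reset is tried first; otherwise a PM reset is tried first.
 */
static enum model_reset_method
model_pick_reset(const struct model_pci_dev *dev, int prefer_bus)
{
	if (prefer_bus && model_bus_reset(dev) == 0)
		return RESET_BUS;
	if (!(dev->dev_flags & MODEL_NO_PM_RESET))
		return RESET_PM;
	if (!prefer_bus && model_bus_reset(dev) == 0)
		return RESET_BUS;
	return RESET_NONE;
}
```

In this model an unflagged device with prefer_bus set gets RESET_BUS
(the case that hangs my machine), while the same device with
MODEL_NO_BUS_RESET set falls back to RESET_PM.
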


> IME, this device cannot,
> and has never been able to handle a bus reset.  A simple setpci
> experiment on the commandline can confirm this.  What I think happened
> is that with the PCI bus reset infrastructure we added, we switched QEMU
> to prefer PCI bus resets over things like PM D3hot->D0 resets.  So it's
> just more prolific use of bus resets by userspace.
> 
> There's also no regression in 1 & 2, PM reset has never done anything
> useful on those devices.  Thanks,
> 
> Alex
> 
>>> ---
>>>
>>>  drivers/pci/quirks.c |   14 ++++++++++++++
>>>  1 file changed, 14 insertions(+)
>>>
>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>> index 561e10d..ebbd5b4 100644
>>> --- a/drivers/pci/quirks.c
>>> +++ b/drivers/pci/quirks.c
>>> @@ -3029,6 +3029,20 @@ static void quirk_no_pm_reset(struct pci_dev *dev)
>>>  DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_ATI, PCI_ANY_ID,
>>>  			       PCI_CLASS_DISPLAY_VGA, 8, quirk_no_pm_reset);
>>>  
>>> +static void quirk_no_bus_reset(struct pci_dev *dev)
>>> +{
>>> +	dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
>>> +}
>>> +
>>> +/*
>>> + * Atheros AR93xx chips do not behave after a bus reset.  The device will
>>> + * throw a Link Down error on AER capable system and regardless of AER,
>>> + * config space of the device is never accessible again and typically
>>> + * causes the system to hang or reset when access is attempted.
>>> + * http://www.spinics.net/lists/linux-pci/msg34797.html
>>> + */
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0030, quirk_no_bus_reset);
>>> +
>>>  #ifdef CONFIG_ACPI
>>>  /*
>>>   * Apple: Shutdown Cactus Ridge Thunderbolt controller.
>>>
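
For readers unfamiliar with the fixup mechanism used in the patch
above, the following is a hypothetical userspace model of how a header
fixup entry could be matched against a device by vendor and device ID.
It is a sketch under stated assumptions, not the kernel implementation:
the model_* / MODEL_* names are invented, the 0xffff wildcard stands in
for PCI_ANY_ID, and 0x168c is my assumption for the value of
PCI_VENDOR_ID_ATHEROS. Only the device ID 0x0030 is taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ANY_ID            0xffffu	/* wildcard, standing in for PCI_ANY_ID */
#define MODEL_VENDOR_ATHEROS    0x168cu	/* assumed value of PCI_VENDOR_ID_ATHEROS */
#define MODEL_FLAG_NO_BUS_RESET (1u << 0)	/* stands in for PCI_DEV_FLAGS_NO_BUS_RESET */

struct model_dev {
	unsigned int vendor;
	unsigned int device;
	unsigned int flags;
};

struct model_fixup {
	unsigned int vendor;
	unsigned int device;
	void (*hook)(struct model_dev *dev);
};

/* Same shape as quirk_no_bus_reset() in the patch: just set a flag. */
static void model_quirk_no_bus_reset(struct model_dev *dev)
{
	dev->flags |= MODEL_FLAG_NO_BUS_RESET;
}

/* One table entry per fixup declaration. */
static const struct model_fixup model_fixups[] = {
	{ MODEL_VENDOR_ATHEROS, 0x0030, model_quirk_no_bus_reset },
};

/* Run every hook whose vendor/device pair matches (or is a wildcard). */
static void model_apply_fixups(struct model_dev *dev)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < sizeof(model_fixups) / sizeof(model_fixups[0]); i++) {
		const struct model_fixup *f = &model_fixups[i];

		if ((f->vendor == MODEL_ANY_ID || f->vendor == dev->vendor) &&
		    (f->device == MODEL_ANY_ID || f->device == dev->device))
			f->hook(dev);
	}
}
```

Calling model_apply_fixups() on a device with IDs 0x168c:0x0030 sets the
flag; a device with any other device ID is left untouched, which is why
the quirk is limited to this one chip.
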

