Date:	Sat, 20 Sep 2014 20:41:46 +0200
From:	Dirk Gouders <dirk@...ders.net>
To:	Bjorn Helgaas <bhelgaas@...gle.com>
Cc:	Yinghai Lu <yinghai@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andreas Noever <andreas.noever@...il.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	"linux-pci\@vger.kernel.org" <linux-pci@...r.kernel.org>
Subject: Re: [BUG] Bisected Problem with LSI PCI FC Adapter

Bjorn Helgaas <bhelgaas@...gle.com> writes:

> On Sat, Sep 13, 2014 at 09:41:34PM +0200, Dirk Gouders wrote:
>> So, I did some tests on the VX50, which probably wasn't the worst idea,
>> because it behaves differently from the test machine.
>> 
>> Summary:
>> 
>> 1) Bjorn's back pocket patch works on the VX50.
>> 
>>    On the test machine it causes a trace that involves mount_root.
>>    I tried to use netconsole, but it complained that the interface was
>>    not ready.
>
> OK, that's good.  I put this revert patch in for-linus for v3.17.  I regard
> this as a temporary fix, not the real solution.  My guess is the test
> machine doesn't boot because you're missing a driver, so not related to the
> revert patch.

I assumed my limit-host-bridge-aperture-and-ignore-bridges-patch on top
of your patch caused this, so I took a closer look.

Your patch works fine with current rc5+ on the test machine -- with and
without my additional patch.

rc2 plus "make oldconfig" somehow resulted in a root partition that
couldn't be mounted.  With rc5+ everything is fine again, without my
touching the configuration.

Various other test results from today (VX50) will be appended to bugzilla
in a few moments.

Dirk

>> 3) Reset with Bjorn's commands
>> 
>>    DEV=00:0e.0
>>    setpci -s$DEV BRIDGE_CONTROL.W=0x0040
>>    sleep 1
>>    setpci -s$DEV BRIDGE_CONTROL.W=0x0000
>>    sleep 1
>>    echo 1 > /sys/bus/pci/rescan
>> 
>>    lets the FC adapter appear, but there are errors that I cannot really
>>    interpret.
>> 
>> 4) Reset with Yinghai's patches and 
>> 
>>    echo 1 > /sys/bus/pci/devices/0000\:00\:0e.0/pcie_link_disable
>>    echo 0 > /sys/bus/pci/devices/0000\:00\:0e.0/pcie_link_disable
>>    echo 1 > /sys/bus/pci/rescan
>> 
>>    gives a similar result to 3).
>
> Resetting the device or simply disabling and re-enabling the link was
> enough to fix things from the device's perspective.  In both cases, it
> responded as one would expect:
>
>   pci_scan_child_bus: pci_bus 0000:06: scanning bus
>   pci 0000:06:00.0: [1000:0646] type 00 class 0x0c0400
>   pci 0000:06:00.0: reg 0x10: [io  0x0000-0x00ff] 
>   pci 0000:06:00.0: reg 0x14: [mem 0x00000000-0x00003fff 64bit]
>   pci 0000:06:00.0: reg 0x1c: [mem 0x00000000-0x0000ffff 64bit]
>   pci 0000:06:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
>
> Linux tried to assign MMIO space to the device, but failed:
>
>   pci 0000:06:00.0: BAR 6: assigned [mem 0xd4200000-0xd42fffff pref]
>   pci 0000:06:00.0: BAR 3: no space for [mem size 0x00010000 64bit]
>   pci 0000:06:00.0: BAR 3: failed to assign [mem size 0x00010000 64bit]
>   pci 0000:06:00.0: BAR 1: no space for [mem size 0x00004000 64bit]
>   pci 0000:06:00.0: BAR 1: failed to assign [mem size 0x00004000 64bit]
>
> The upstream bridge windows are:
>
>   pci 0000:00:0e.0: PCI bridge to [bus 06]	# was originally to bus 0a
>   pci 0000:00:0e.0:   bridge window [io  0x3000-0x3fff] 
>   pci 0000:00:0e.0:   bridge window [mem 0xd4200000-0xd42fffff]
>
> So the ROM BAR (reg 0x30/BAR 6) takes up the whole window, leaving nothing
> for BARs 1 and 3.  This is something that Linux could do better.  For
> example, we could assign normal BARs first, followed by ROM BARs, since the
> normal ones are more important.  It's possible we could also try to expand
> the bridge window so all the BARs would fit.
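
To make the ordering point concrete, here is a toy sketch (not the kernel's
resource assignment code) that places the three memory BARs into the 1 MiB
bridge window from the log above, once with the ROM BAR first and once with
the normal BARs first.  The sizes and the window come from the quoted log;
the first-fit strategy, the function names and the fact that it ignores the
prefetchable attribute are my own simplifying assumptions.

```python
# Toy first-fit model, for illustration only; names are hypothetical and
# this is not how the kernel actually sorts or assigns resources.

def align_up(addr, align):
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)

def assign(window_start, window_end, bars):
    """Place (name, size) pairs in the given order.

    Each BAR is naturally aligned to its size.  Returns a dict mapping
    the BAR name to its start address, or None if it did not fit.
    """
    cursor = window_start
    placed = {}
    for name, size in bars:
        start = align_up(cursor, size)
        if start + size - 1 <= window_end:
            placed[name] = start
            cursor = start + size
        else:
            placed[name] = None
    return placed

# Bridge window and BAR sizes taken from the log.
WINDOW = (0xd4200000, 0xd42fffff)
ROM = ("BAR 6", 0x100000)
BAR1 = ("BAR 1", 0x4000)
BAR3 = ("BAR 3", 0x10000)

rom_first = assign(*WINDOW, [ROM, BAR3, BAR1])
normal_first = assign(*WINDOW, [BAR3, BAR1, ROM])

for label, result in (("ROM first", rom_first), ("normal first", normal_first)):
    for name, start in result.items():
        where = "no space" if start is None else hex(start)
        print(f"{label}: {name}: {where}")
```

In this model the ROM-first order reproduces the failure in the log (BAR 6
fills the window, BARs 1 and 3 get nothing), while the normal-first order
places BARs 3 and 1 and leaves only the ROM BAR unassigned.  Fitting all
three would still need a larger bridge window.
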
>
> In any case, resetting the device is not a simple fix all by itself.  So
> our possibilities are:
>
>   1) Quirk to work around _CRS bug.  This works but requires us to maintain
>      CPU-specific code that I really don't want.
>
>   2) Reset device when changing bus number.  This works from the device
>      point of view, but would require additional Linux changes.
>
>   3) Revert 1820ffdccb9b.  This works but is ugly because we're ignoring
>      some of what _CRS tells us.
>
> Bjorn