Message-ID: <936d1790c94f4b9c884bc79819b8b777@svr-chch-ex1.atlnz.lc>
Date:   Mon, 24 Jun 2019 04:08:20 +0000
From:   Chris Packham <Chris.Packham@...iedtelesis.co.nz>
To:     Thomas Petazzoni <thomas.petazzoni@...tlin.com>
CC:     Jason Cooper <jason@...edaemon.net>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Kirkwood PCI Express and bridges

Hi Thomas,

On 21/06/19 6:17 PM, Thomas Petazzoni wrote:
> Hello Chris,
> 
> On Fri, 21 Jun 2019 04:03:27 +0000
> Chris Packham <Chris.Packham@...iedtelesis.co.nz> wrote:
> 
>> I'm in the process of updating the kernel version used on our products
>> from 4.4 -> 5.1.
>>
>> We have one product that uses a Kirkwood CPU, IDT PCI bridge and Marvell
>> Switch ASIC. The Switch ASIC presents as multiple PCI devices.
>>
>> The hardware setup looks like this
>>                                        __________
>> [ Kirkwood ] --- [ IDT 5T5 ] ---+---  |          |
>>                                 +---  |  Switch  |
>>                                 +---  |          |
>>                                 +---  |__________|
>>
>> On the 4.4 based kernel things are fine
>>
>> [root@...lus flash]# lspci -t
>> -[0000:00]---01.0-[01-06]----00.0-[02-06]--+-02.0-[03]----00.0
>>                                              +-03.0-[04]----00.0
>>                                              +-04.0-[05]----00.0
>>                                              \-05.0-[06]----00.0
>>
>> But on the 5.1 based kernel things get a little weird
>>
>> [root@...lus flash]# lspci -t
>> -[0000:00]---01.0-[01-06]--+-00.0-[02-06]--
>>                              +-01.0
>>                              +-02.0-[02-06]--
>>                              +-03.0-[02-06]--
>>                              +-04.0-[02-06]--
>>                              +-05.0-[02-06]--
>>                              +-06.0-[02-06]--
>>                              +-07.0-[02-06]--
>>                              +-08.0-[02-06]--
>>                              +-09.0-[02-06]--
>>                              +-0a.0-[02-06]--
>>                              +-0b.0-[02-06]--
>>                              +-0c.0-[02-06]--
>>                              +-0d.0-[02-06]--
>>                              +-0e.0-[02-06]--
>>                              +-0f.0-[02-06]--
>>                              +-10.0-[02-06]--
>>                              +-11.0-[02-06]--
>>                              +-12.0-[02-06]--
>>                              +-13.0-[02-06]--
>>                              +-14.0-[02-06]--
>>                              +-15.0-[02-06]--
>>                              +-16.0-[02-06]--
>>                              +-17.0-[02-06]--
>>                              +-18.0-[02-06]--
>>                              +-19.0-[02-06]--
>>                              +-1a.0-[02-06]--
>>                              +-1b.0-[02-06]--
>>                              +-1c.0-[02-06]--
>>                              +-1d.0-[02-06]--
>>                              +-1e.0-[02-06]--
>>                              \-1f.0-[02-06]--+-02.0-[03]----00.0
>>                                              +-03.0-[04]----00.0
>>                                              +-04.0-[05]----00.0
>>                                              \-05.0-[06]----00.0
>>
>>
>> I'll start bisecting to see where things started going wrong. I just
>> wondered if this rings any bells for anyone.
> 
> I am almost sure that the culprit is
> 1f08673eef1236f7d02d93fcf596bb8531ef0d12 ("PCI: mvebu: Convert to PCI
> emulated bridge config space").

The problem seems to pre-date this commit. I've gone back as far as 4.18 
and the problem still exists (in fact there are more duplicate devices). 
I'll keep going back (unfortunately, because our platform is out of 
tree, it's not a simple bisect).
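
To make it quicker to compare kernels while stepping back, here is a rough sketch of a check that could be run over saved `lspci -t` output. It assumes Python 3 is available on the host doing the comparison, and `duplicate_bus_ranges` is a hypothetical helper name, not part of pciutils. It only flags bus ranges that more than one bridge claims, which is the symptom in the 5.1 tree above; it does not diagnose the cause.

```python
import re
from collections import Counter

# Matches "<dev>.<fn>-[<secondary>]" or "<dev>.<fn>-[<secondary>-<subordinate>]"
# as printed by `lspci -t`. The root "-[0000:00]" is not preceded by dev.fn,
# so it is deliberately not matched.
BRIDGE_RE = re.compile(
    r"([0-9a-f]{2}\.[0-7])-\[([0-9a-f]{2}(?:-[0-9a-f]{2})?)\]"
)


def duplicate_bus_ranges(lspci_tree):
    """Return {bus_range: count} for ranges claimed by more than one bridge."""
    counts = Counter(rng for _dev, rng in BRIDGE_RE.findall(lspci_tree))
    return {rng: n for rng, n in counts.items() if n > 1}


if __name__ == "__main__":
    import sys

    # Usage (assumed): lspci -t | python3 check_tree.py
    dups = duplicate_bus_ranges(sys.stdin.read())
    for rng, n in sorted(dups.items()):
        print(f"bus range [{rng}] claimed by {n} bridges")
    sys.exit(1 if dups else 0)
```

An empty result would match the 4.4 output, where each bus range appears once; the non-zero exit status is only there so the check can be scripted while testing each kernel version.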

> I still think it makes sense to share the bridge emulation code between
> the mvebu and aardvark drivers, but this sharing has required making
> the code very different, with lots of subtle differences in behavior in
> how registers are emulated.

Agreed. Bugs love to hide in duplicated code.

I will admit to being ignorant about the need for an emulated bridge. I 
know it has something to do with the type of transaction used for the 
downstream devices. I also know that these systems won't work without an 
emulated bridge.

> Unfortunately, I don't have access to one of these complicated PCI
> setup with a HW switch on the way, so I couldn't test this kind of
> setups.
> 
> Do you mind helping with figuring out what the issues are ? That would
> be really nice.

No problem. As I said, I'll keep going back until I find the point where 
the behaviour turns bad for me. I suspect we might find other problems 
along the way.
