[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <33844c9a-6721-b2fb-5514-8f04501e990e@linux.ibm.com>
Date: Thu, 9 Jun 2022 13:50:45 -0700
From: Tyrel Datwyler <tyreld@...ux.ibm.com>
To: "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
Bjorn Helgaas <helgaas@...nel.org>,
Pali Rohár <pali@...nel.org>
Cc: linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
Paul Mackerras <paulus@...ba.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
linuxppc-dev@...ts.ozlabs.org,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Guowen Shan <gshan@...hat.com>
Subject: Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI
domain
On 6/9/22 13:21, Guilherme G. Piccoli wrote:
> First of all, thanks for looping me Bjorn! Much appreciated.
> I'm also CCing Ben and Gavin, that know a lot of PPC PCI stuff.
>
>
> On 09/06/2022 16:34, Bjorn Helgaas wrote:
>> [...]
>>>>>>>>>>> Upgrading powerpc kernels from LTS 4.4 version (which does not
>>>>>>>>>>> contain mentioned commit) to new LTS versions brings a
>>>>>>>>>>> regression in domain assignment.
>>>>>>>>>>
>>>>>>>>>> Can you elaborate why it is a regression ?
>>>>>>>>>> 63a72284b159 That commit says 'no functionnal changes', I'm
>>>>>>>>>> having hard time understanding how a nochange can be a
>>>>>>>>>> regression.
>>>>>>>>>
>>>>>>>>> It is not 'no functional change'. That commit completely changed
>>>>>>>>> PCI domain assignment in a way that is incompatible with other
>>>>>>>>> architectures and also incompatible with the way how it was done
>>>>>>>>> prior that commit.
>>>>>>>>
>>>>>>>> I agree that the "no functional change" statement is incorrect.
>>>>>>>> However, for most powerpc platforms it ended up being simply a
>>>>>>>> cosmetic behavior change. As far as I can tell there is nothing
>>>>>>>> requiring domain ids to increase montonically from zero or that
>>>>>>>> each architecture is required to use the same domain numbering
>>>>>>>> scheme.
>
> Strongly agree here in both points: first, this was not a "no functional
> change" thing, and I apologize for adding this in the commit message.
> What I meant is that: despite changing the numbering, (as Tyrel said)
> nothing should require increasing monotonic mutable PCI domains. At
> least, I'm not aware of such requirement in any spec or even in the
> kernel and adjacent tooling.
>
>
>>>>>>> [...]
>>>>>>>> We could properly limit it to powernv and pseries by using
>>>>>>>> ibm,fw-phb-id instead of reg property in the look up that follows
>>>>>>>> a failed ibm,opal-phbid lookup. I think this is acceptable as long
>>>>>>>> as no other powerpc platforms have started using this behavior for
>>>>>>>> persistent naming.
>>>>>>>
>>>>>>> And what about setting that new config option to enabled by default
>>>>>>> for those series?
>
> I don't remember all the details about PPC dt, but it should already be
> restricted to pseries/powernv, right? At least, the commit has a comment:
>
> /* If not pseries nor powernv, or if fixed PHB numbering tried to add
> * the same PHB number twice, then fallback to dynamic PHB numbering.*/
>
> If this is *not* restricted to these 2 platforms, I agree with Pali's
> approach, although I'd consider the correct is to keep the persistent
> domain scheme for both pnv and pseries, as it's working like this for 5
> years and counting, and this *does* prevent a lot of issues with PCI
> hotplugging in PPC servers.
I mentioned this in a previous post, but it is clear the Author's intent was for
this only to apply to powernv and pseries platforms. However, it only really
checks for powernv, and if that fails it does a read the reg property for the
domain which works for and PPC platform. If we really only want this on powernv
and pseries and revert all other PPC platforms back we can fix this with a
pseries check instead of a config property. Using ibm,fw-phb-id instead of reg
property if ibm,opal-phbid lookup fails does the trick.
-Tyrel
>
>
>>> [...]
>>>> I forgot to ask before about the actual regression here. The commit
>>>> log says domain numbers are different, but I don't know the connection
>>>> from there to something failing. I assume there's some script or
>>>> config file that depends on specific domain numbers? And that
>>>> dependency is (hopefully) powerpc-specific?
>>>
>>> You assume correct. For example this is the way how OpenWRT handles PCI
>>> devices (but not only OpenWRT). This OpenWRT case is not
>>> powerpc-specific but generic to all architectures. This is just one
>>> example.
>>
>> So basically everybody uses D/b/d/f for persistent names. That's ...
>> well, somewhat stable for things soldered down or in a motherboard
>> slot, but a terrible idea for things that can be hot-plugged.
>>
>> Even for more core things, it's possible for firmware to change bus
>> numbering between boots. For example, if a complicated hierarchy is
>> cold-plugged into one slot, firmware is likely to assign different bus
>> numbers on the next boot to make room for it. Obviously this can also
>> happen as a hot-add, and Linux needs the flexibility to do similar
>> renumbering then, although we don't support it yet.
>>
>> It looks like 63a72284b159 was intended to make domain numbers *more*
>> consistent, so it's ironic that this actually broke something by
>> changing domain numbers. Maybe there's a way to limit the scope of
>> 63a72284b159 so it avoids the breakage. I don't know enough about the
>> powerpc landscape to even guess at how.
>
> I don't considereit breaks the userspace since this is definitely no
> stable ABI (or if it is, I'm not aware and my apologies). If scripts
> rely on that, they are doing the wrong thing it seems.
>
> With that said, I'm definitely not against improving the situation with
> Pali's KConfig - just think that we somehow should keep the persistent
> behavior for powernv and pseries.
> Hopefully PPC folks has more to say on that!
> Cheers,
>
>
> Guilherme
Powered by blists - more mailing lists