[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b7dd1f8c-f1b0-af65-bd6d-914b7e42d75d@amd.com>
Date: Fri, 9 Dec 2022 11:37:32 -0800
From: Lizhi Hou <lizhi.hou@....com>
To: Rob Herring <robh@...nel.org>
CC: <linux-pci@...r.kernel.org>, <devicetree@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <frowand.list@...il.com>,
<helgaas@...nel.org>, <clement.leger@...tlin.com>,
<max.zhen@....com>, <sonal.santan@....com>, <larry.liu@....com>,
<brian.xu@....com>, <stefano.stabellini@...inx.com>,
<trix@...hat.com>
Subject: Re: [RESEND PATCH RFC V4 2/3] PCI: Create device tree node for
selected devices
On 12/7/22 14:44, Rob Herring wrote:
> On Mon, Dec 5, 2022 at 6:30 PM Lizhi Hou <lizhi.hou@....com> wrote:
>>
>> On 12/1/22 13:12, Rob Herring wrote:
>>> On Mon, Nov 21, 2022 at 08:43:03AM -0800, Lizhi Hou wrote:
>>>> The PCI endpoint device such as Xilinx Alveo PCI card maps the register
>>>> spaces from multiple hardware peripherals to its PCI BAR. Normally,
>>>> the PCI core discovers devices and BARs using the PCI enumeration process.
>>>> There is no infrastructure to discover the hardware peripherals that are
>>>> present in a PCI device, and which can be accessed through the PCI BARs.
>>>>
>>>> For Alveo PCI card, the card firmware provides a flattened device tree to
>>>> describe the hardware peripherals on its BARs. The Alveo card driver can
>>>> load this flattened device tree and leverage device tree framework to
>>>> generate platform devices for the hardware peripherals eventually.
>>>>
>>>> Apparently, the device tree framework requires a device tree node for the
>>>> PCI device. Thus, it can generate the device tree nodes for hardware
>>>> peripherals underneath. Because PCI is self discoverable bus, there might
>>>> not be a device tree node created for PCI devices. This patch is to add
>>>> support to generate device tree node for PCI devices.
>>>>
>>>> Added a kernel option. When the option is turned on, the kernel will
>>>> generate device tree nodes for PCI bridges unconditionally.
>>>>
>>>> Initially, the basic properties are added for the dynamically generated
>>>> device tree nodes.
>>>>
>>>> Signed-off-by: Lizhi Hou <lizhi.hou@....com>
>>>> Signed-off-by: Sonal Santan <sonal.santan@....com>
>>>> Signed-off-by: Max Zhen <max.zhen@....com>
>>>> Reviewed-by: Brian Xu <brian.xu@....com>
>>>> ---
>>>> drivers/pci/Kconfig | 12 ++
>>>> drivers/pci/Makefile | 1 +
>>>> drivers/pci/bus.c | 2 +
>>>> drivers/pci/msi/irqdomain.c | 6 +-
>>>> drivers/pci/of.c | 71 ++++++++++
>>>> drivers/pci/of_property.c | 256 ++++++++++++++++++++++++++++++++++++
>>>> drivers/pci/pci-driver.c | 3 +-
>>>> drivers/pci/pci.h | 19 +++
>>>> drivers/pci/remove.c | 1 +
>>>> 9 files changed, 368 insertions(+), 3 deletions(-)
>>>> create mode 100644 drivers/pci/of_property.c
>>>>
>>>> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
>>>> index 55c028af4bd9..126c31b79718 100644
>>>> --- a/drivers/pci/Kconfig
>>>> +++ b/drivers/pci/Kconfig
>>>> @@ -198,6 +198,18 @@ config PCI_HYPERV
>>>> The PCI device frontend driver allows the kernel to import arbitrary
>>>> PCI devices from a PCI backend to support PCI driver domains.
>>>>
>>>> +config PCI_DYNAMIC_OF_NODES
>>>> + bool "Device tree node for PCI devices"
>>> Create Devicetree nodes for PCI devices
>> Sure.
>>> But as I've said before, making this a config option doesn't really work
>>> except as something to experiment with. Once you add your driver and
>>> want to do a 'select PCI_DYNAMIC_OF_NODES', you've affected everyone
>>> else.
>> Do you mean we should remove PCI_DYNAMIC_OF_NODES and make
>> creating dynamic tree node default?
> No, I'm saying as long as it is a config option, it's not useful for
> more than experimentation. A distro kernel has to decide how to set a
> config option for *everyone*.
Ok. I will keep this option.
>
>> Based on the previous discussions, the approach I am implementing
>> is to create device tree nodes for all PCI bridges and devices defined
>> by pci quirks. I believe a PCI endpoint which is not defined by PCI quirks
>> should not to be affected because there is no device tree node is created
>> for it.
>>
>> Are you fine with this approach?
> How does that work? The quirks run when a device is discovered. At
> that time you've already discovered and probed everything upstream of
> the device. So the only thing controlling the upstream devices getting
> a DT node is the config option, right?
Yes, correct.
I meant that the endpoint devices which are not defined in quirks will
not have devicetree node generated. Thus, the behavior of those endpoint
device driver will not be affected.
>
>>>> + depends on OF
>>>> + select OF_DYNAMIC
>>>> + help
>>>> + This option enables support for generating device tree nodes for some
>>>> + PCI devices. Thus, the driver of this kind can load and overlay
>>>> + flattened device tree for its downstream devices.
>>>> +
>>>> + Once this option is selected, the device tree nodes will be generated
>>>> + for all PCI bridges.
>>>> +
>>>> choice
>>>> prompt "PCI Express hierarchy optimization setting"
>>>> default PCIE_BUS_DEFAULT
>>>> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
>>>> index 2680e4c92f0a..cc8b4e01e29d 100644
>>>> --- a/drivers/pci/Makefile
>>>> +++ b/drivers/pci/Makefile
>>>> @@ -32,6 +32,7 @@ obj-$(CONFIG_PCI_P2PDMA) += p2pdma.o
>>>> obj-$(CONFIG_XEN_PCIDEV_FRONTEND) += xen-pcifront.o
>>>> obj-$(CONFIG_VGA_ARB) += vgaarb.o
>>>> obj-$(CONFIG_PCI_DOE) += doe.o
>>>> +obj-$(CONFIG_PCI_DYNAMIC_OF_NODES) += of_property.o
>>>>
>>>> # Endpoint library must be initialized before its users
>>>> obj-$(CONFIG_PCI_ENDPOINT) += endpoint/
>>>> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
>>>> index 3cef835b375f..8507cc32b61d 100644
>>>> --- a/drivers/pci/bus.c
>>>> +++ b/drivers/pci/bus.c
>>>> @@ -316,6 +316,8 @@ void pci_bus_add_device(struct pci_dev *dev)
>>>> */
>>>> pcibios_bus_add_device(dev);
>>>> pci_fixup_device(pci_fixup_final, dev);
>>>> + if (pci_is_bridge(dev))
>>>> + of_pci_make_dev_node(dev);
>>>> pci_create_sysfs_dev_files(dev);
>>>> pci_proc_attach_device(dev);
>>>> pci_bridge_d3_update(dev);
>>>> diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
>>>> index e9cf318e6670..eeaf44169bfd 100644
>>>> --- a/drivers/pci/msi/irqdomain.c
>>>> +++ b/drivers/pci/msi/irqdomain.c
>>>> @@ -230,8 +230,10 @@ u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev)
>>>> pci_for_each_dma_alias(pdev, get_msi_id_cb, &rid);
>>>>
>>>> of_node = irq_domain_get_of_node(domain);
>>>> - rid = of_node ? of_msi_map_id(&pdev->dev, of_node, rid) :
>>>> - iort_msi_map_id(&pdev->dev, rid);
>>>> + if (of_node && !of_node_check_flag(of_node, OF_DYNAMIC))
>>>> + rid = of_msi_map_id(&pdev->dev, of_node, rid);
>>>> + else
>>>> + rid = iort_msi_map_id(&pdev->dev, rid);
>>> I have no idea if this works. It looks kind of broken already if
>>> !of_node calls iort_msi_map_id(). With a DT only system, I would think
>>> we'd always call of_msi_map_id(). Have you tested MSIs?
>>>
>>> With a mixed system, I have no idea what happens. That needs to be
>>> understood for MSI, DMA, and interrupts.
>> Yes, I tested MSI in VM.
>>
>> # cat
>> /sys/devices/platform/3f000000.pcie/pci0000:00/0000:00:02.0/0000:09:00.0/0000:0a:00.0/msi_irqs/29
>>
>> msi
>> # cat /proc/interrupts | grep 29
>> 29: 1 0 MSI 5242880 Edge pciehp
>>
>> The idea is to preserve the current behaviror.
>>
>> current: PCI device does not have dt node, thus iort_msi_map_id()
>> is called.
>>
>> modified code: PCI device has dt node but with OF_DYNAMIC flag,
>> thus iort_msi_map_id() is called.
>>
>> I was planning to take on of_msi_map_id() for dynamically generated dt node
>>
>> in future when we see a real use case?
>>
>>>> return rid;
>>>> }
>>>> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
>>>> index 196834ed44fe..fb60b04f0b93 100644
>>>> --- a/drivers/pci/of.c
>>>> +++ b/drivers/pci/of.c
>>>> @@ -469,6 +469,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
>>>> } else {
>>>> /* We found a P2P bridge, check if it has a node */
>>>> ppnode = pci_device_to_OF_node(ppdev);
>>>> + if (of_node_check_flag(ppnode, OF_DYNAMIC))
>>>> + ppnode = NULL;
>>>> }
>>>>
>>>> /*
>>>> @@ -599,6 +601,75 @@ int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge)
>>>> return pci_parse_request_of_pci_ranges(dev, bridge);
>>>> }
>>>>
>>>> +#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES)
>>>> +
>>>> +void of_pci_remove_node(struct pci_dev *pdev)
>>>> +{
>>>> + struct device_node *dt_node;
>>> node or np are the typical names.
>> Will fix this.
>>>> +
>>>> + dt_node = pci_device_to_OF_node(pdev);
>>>> + if (!dt_node || !of_node_check_flag(dt_node, OF_DYNAMIC))
>>>> + return;
>>>> + pdev->dev.of_node = NULL;
>>>> +
>>>> + of_destroy_node(dt_node);
>>>> +}
>>>> +
>>>> +void of_pci_make_dev_node(struct pci_dev *pdev)
>>>> +{
>>>> + struct device_node *parent, *dt_node = NULL;
>>>> + const char *pci_type = "dev";
>>>> + struct of_changeset *cset;
>>>> + const char *full_name;
>>>> + int ret;
>>>> +
>>>> + /*
>>>> + * If there is already a device tree node linked to this device,
>>>> + * return immediately.
>>>> + */
>>>> + if (pci_device_to_OF_node(pdev))
>>>> + return;
>>>> +
>>>> + /* Check if there is device tree node for parent device */
>>>> + if (!pdev->bus->self)
>>>> + parent = pdev->bus->dev.of_node;
>>>> + else
>>>> + parent = pdev->bus->self->dev.of_node;
>>>> + if (!parent)
>>>> + return;
>>>> +
>>>> + if (pci_is_bridge(pdev))
>>>> + pci_type = "pci";
>>> What's the node name if not a bridge? I don't see how that would work.
>>>
>>> It should depend on the device class if it has one.
>> pci_type is initialized with "dev"
>>
>> + const char *pci_type = "dev";
> I missed that...
>
>> Do you mean I should use class instead of pci_is_bridge()?
>>
>> if ((pdev->class >> 24) == PCI_BASE_CLASS_BRIDGE)
>> pci_type = "pci";
> Well, maybe as preparation to support other classes. If you had a UART
> for example, the node name is 'serial'. I don't think you need to
> worry about those ATM. We may need some way for the name to come from
> the driver as not all devices have a class. Yours for example, we'd
> want something like 'fpga@...' ideally. Maybe that's fine to leave as
> a known issue.
Got it. I will keep generic endpoint name 'dev' for now. Thanks.
Lizhi
>
> Rob
Powered by blists - more mailing lists