[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bba67239-fb4c-7ab5-c0e8-1df37b39bfac@opensource.wdc.com>
Date: Wed, 22 Feb 2023 07:01:34 +0900
From: Damien Le Moal <damien.lemoal@...nsource.wdc.com>
To: Rick Wertenbroek <rick.wertenbroek@...il.com>
Cc: alberto.dassatti@...g-vd.ch, xxm@...k-chips.com,
rick.wertenbroek@...g-vd.ch, Rob Herring <robh+dt@...nel.org>,
Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
Heiko Stuebner <heiko@...ech.de>,
Shawn Lin <shawn.lin@...k-chips.com>,
Lorenzo Pieralisi <lpieralisi@...nel.org>,
Krzysztof WilczyĆski <kw@...ux.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Jani Nikula <jani.nikula@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Rodrigo Vivi <rodrigo.vivi@...el.com>,
Mikko Kovanen <mikko.kovanen@...amobile.com>,
devicetree@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-rockchip@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org
Subject: Re: [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe
endpoint core set_msi()
On 2/22/23 01:37, Rick Wertenbroek wrote:
> On Tue, Feb 21, 2023 at 2:19 PM Rick Wertenbroek
> <rick.wertenbroek@...il.com> wrote:
>>
>> On Tue, Feb 21, 2023 at 11:55 AM Damien Le Moal
>> <damien.lemoal@...nsource.wdc.com> wrote:
>>>
>>> On 2/21/23 19:47, Rick Wertenbroek wrote:
>>>> On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
>>>> <damien.lemoal@...nsource.wdc.com> wrote:
>>>>>
>>>>> On 2/14/23 23:08, Rick Wertenbroek wrote:
>>>>>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
>>>>>> function (function number 0), therefore return -EINVAL if set_msi() is
>>>>>> called with a function number greater than 0.
>>>>>> The PCIe standard only allows the multi message capability (MMC) value
>>>>>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
>>>>>> called with a MMC value of over 0x5.
>>>>>>
>>>>>> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@...il.com>
>>>>>> ---
>>>>>> drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
>>>>>> 1 file changed, 10 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>>> index b7865a94e..80634b690 100644
>>>>>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
>>>>>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
>>>>>> struct rockchip_pcie *rockchip = &ep->rockchip;
>>>>>> u32 flags;
>>>>>>
>>>>>> + if (fn) {
>>>>>> + dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>
>>>>> Checking this here is late... Given that at most only one physical
>>>>> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
>>>>> Something like:
>>>>>
>>>>> err = of_property_read_u8(dev->of_node, "max-functions",
>>>>> &ep->epc->max_functions);
>>>>>
>>>>> if (err < 0 || ep->epc->max_functions > 1)
>>>>>
>>>>> ep->epc->max_functions = 1;
>>>>>
>>>>
>>>> Yes, this could be moved to the probe, thanks.
>>>>
>>>>> And all the macros with the (fn) argument could also be simplified
>>>>> (argument fn removed) since fn will always be 0.
>>>>
>>>> These functions cannot be simplified because they have to follow the signature
>>>> given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
>>>> function number as a parameter. If we change the function signature we won't
>>>> be able to assign these functions to the pc_epc_ops structure
>>>
>>> I was not suggesting to change the functions signature. I was suggesting
>>> dropping the fn argument for the *macros*, e.g.
>>>
>>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE
>>>
>>> since fn is always 0.
>>>
>>> That said, I am not entirely sure if the limit really is 1 function at most. The
>>> TRM seems to be suggesting that up to 4 functions can be supported...
>>>
>>> [...]
>>>
>>>>> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
>>>>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
>>>>> here all the time.
>>>>
>>>> Yes, this could be an improvement but this is the way it is written
>>>> everywhere in this
>>>> driver, I chose to keep it so as to remain coherent with the rest of the driver.
>>>> Cleaning this is not so important since this code will not be
>>>> rewritten / changed so
>>>> often. But I agree that it might be nicer. But, on the other side if
>>>> at some point
>>>> support for virtual functions would be added, the offsets would need
>>>> to be computed
>>>> based on the virtual function number and the code would be written
>>>> like it is now,
>>>> so I suggest keeping this the way it is for now.
>>>
>>> Yes, sure, this can be cleaned later.
>>>
>>> A more pressing problem is the lack of support for MSIX despite the fact that
>>> the controller supports that *and* advertize it as a capability. That is what
>>> was causing my problem with the Linux nvme driver and my prototype nvme epf
>>> function driver: the host driver was seeing MSIX support (1 vector supported by
>>> default), and so was allocating one MSIX for the device probe. But on the EP
>>> end, it is MSI or INTX only... Working on adding that to solve this issue.
>>>
>>
>> I have seen this too, the controller advertises the capability. However, the TRM
>> (section 17.5.9) says that MSI-X is not supported (MSI / INTx only as you said).
>> So the solution should be to modify the probe function of the endpoint
>> controller
>> so that the MSI-X capability would not be advertised, this should fix
>> your problem.
>>
>> I wonder if one could still implement MSI-X because from the endpoint we would
>> just need to implement it as a message (TLP) over PCIe (because the space for
>> the vectors is allocated and written, so that part should be ok). I am
>> not an expert
>> on MSI-X, but the reason the endpoint cannot send them could be because MSI-X
>> requires some fields in the PCIe header descriptor to be filled with values that
>> cannot be set through the "desc0-3" registers of the RK3399 PCIe endpoint core.
>>
>> Anyways, the endpoint should not advertise the MSI-X capabilities when it cannot
>> send such interrupts. Once this is fixed you should be able to have your NVMe
>> function running.
>>
>> Regards.
>> Rick
>>
>
> It is possible to disable MSI-X by skipping the MSI-X capability
> structure in the PCIe
> capabilities structures linked-list.
> The current linked list is MSI cap (0x90) -> MSI-X cap (0xb0) -> PCIe
> Device cap (0xc0)
> So we want to set MSI (0x90) -> PCIe Device cap (0xc0)
>
> This can be done by writing 0xc0 to bits 15-8 of 0xFDA0'0090 (MSI cap).
> I tested this quickly through devmem2 before loading the endpoint
> function driver
> and it fixes the issue of MSI-X capabilities being advertised to the host.
>
> In the driver the changes would look like this;
> In the probe function you can disable MSI-X as follows:
>
> @@ -631,6 +618,28 @@ static int rockchip_pcie_ep_probe(struct
> platform_device *pdev)
>
> ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;
>
> + /*
> + * Disable MSI-X because the controller is not capable of MSI-X
> + * This requires to skip the MSI-X capabilities entry in the
s/capabilities/capability
> + * chain of PCIe capabilities, we get the next pointer from the
> + * MSI-X entry and set that in the MSI capability entry, this way
> + * the MSI-X entry is skipped (left out of the linked-list)
> + */
> + cfg_msi = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
> + ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
> +
> + cfg_msi &= ~ROCKCHIP_PCIE_EP_MSI_CP1_MASK;
> +
> + cfg_msix_cp = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
> + ROCKCHIP_PCIE_EP_MSIX_CAP_REG) &
> ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK;
> +
> + cfg_msi |= cfg_msix_cp;
> +
> + rockchip_pcie_write(rockchip, cfg_msi,
> + PCIE_EP_CONFIG_BASE + ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
> +
> rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE,
> PCIE_CLIENT_CONFIG);
>
> return 0;
> err_epc_mem_exit:
> pci_epc_mem_exit(epc);
>
> In the pcie-rockchip.h add the following #defines:
>
> @@ -216,21 +227,28 @@
> #define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4
> #define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90
> +#define ROCKCHIP_PCIE_EP_MSI_CP1_OFFSET 8
> +#define ROCKCHIP_PCIE_EP_MSI_CP1_MASK GENMASK(15, 8)
> +#define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET 20
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_MASK GENMASK(22, 20)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24)
> +#define ROCKCHIP_PCIE_EP_MSIX_CAP_REG 0xb0
> +#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_OFFSET 8
> +#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK GENMASK(15, 8)
> #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1
> #define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3
>
> I will add this to the next version of the patch set.
> Thank you Damien for pointing this out ! This should solve the issues
> you have with
> your NVMe endpoint function regarding MSI-X interrupts.
OK. I replied to your previous mail with the same idea. Looks good :)
Will test that instead of my dirty hack that puts 0 in the MSIX capability ID.
>
> Regards
> Rick
>
>>
>>> --
>>> Damien Le Moal
>>> Western Digital Research
>>>
--
Damien Le Moal
Western Digital Research
Powered by blists - more mailing lists