[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7714f272-3862-84ac-306d-86363a1c4880@gmail.com>
Date: Wed, 10 Mar 2021 23:22:57 +0300
From: Dmitry Osipenko <digetx@...il.com>
To: Nicolin Chen <nicoleotsuka@...il.com>, joro@...tes.org,
thierry.reding@...il.com, will@...nel.org,
guillaume.tucker@...labora.com
Cc: vdumpa@...dia.com, jonathanh@...dia.com,
linux-tegra@...r.kernel.org, iommu@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] iommu/tegra-smmu: Fix mc errors on tegra124-nyan
10.03.2021 22:13, Dmitry Osipenko пишет:
> 19.02.2021 01:07, Nicolin Chen пишет:
>> Commit 25938c73cd79 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()")
>> removed certain hack in the tegra_smmu_probe() by relying on IOMMU core to
>> of_xlate SMMU's SID per device, so as to get rid of tegra_smmu_find() and
>> tegra_smmu_configure() that are typically done in the IOMMU core also.
>>
>> This approach works for both existing devices that have DT nodes and other
>> devices (like PCI device) that don't exist in DT, on Tegra210 and Tegra3
>> upon testing. However, Page Fault errors are reported on tegra124-Nyan:
>>
>> tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
>> EMEM address decode error (SMMU translation error [--S])
>> tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
>> Page fault (SMMU translation error [--S])
>>
>> After debugging, I found that the mentioned commit changed some function
>> callback sequence of tegra-smmu's, resulting in enabling SMMU for display
>> client before display driver gets initialized. I couldn't reproduce exact
>> same issue on Tegra210 as Tegra124 (arm-32) differs at arch-level code.
>>
>> Actually this Page Fault is a known issue, as on most of Tegra platforms,
>> display gets enabled by the bootloader for the splash screen feature, so
>> it keeps filling the framebuffer memory. A proper fix to this issue is to
>> 1:1 linear map the framebuffer memory to IOVA space so the SMMU will have
>> the same address as the physical address in its page table. Yet, Thierry
>> has been working on the solution above for a year, and it hasn't merged.
>>
>> Therefore, let's partially revert the mentioned commit to fix the errors.
>>
>> The reason why we do a partial revert here is that we can still set priv
>> in ->of_xlate() callback for PCI devices. Meanwhile, devices existing in
>> DT, like display, will go through tegra_smmu_configure() at the stage of
>> bus_set_iommu() when SMMU gets probed(), as what it did before we merged
>> the mentioned commit.
>>
>> Once we have the linear map solution for framebuffer memory, this change
>> can be cleaned away.
>>
>> [Big thank to Guillaume who reported and helped debugging/verification]
>>
>> Fixes: 25938c73cd79 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()")
>> Reported-by: Guillaume Tucker <guillaume.tucker@...labora.com>
>> Signed-off-by: Nicolin Chen <nicoleotsuka@...il.com>
>> ---
>>
>> Guillaume, would you please give a "Tested-by" to this change? Thanks!
>>
>> drivers/iommu/tegra-smmu.c | 72 +++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 71 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
>> index 4a3f095a1c26..97eb62f667d2 100644
>> --- a/drivers/iommu/tegra-smmu.c
>> +++ b/drivers/iommu/tegra-smmu.c
>> @@ -798,10 +798,70 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
>> return SMMU_PFN_PHYS(pfn) + SMMU_OFFSET_IN_PAGE(iova);
>> }
>>
>> +static struct tegra_smmu *tegra_smmu_find(struct device_node *np)
>> +{
>> + struct platform_device *pdev;
>> + struct tegra_mc *mc;
>> +
>> + pdev = of_find_device_by_node(np);
>> + if (!pdev)
>> + return NULL;
>> +
>> + mc = platform_get_drvdata(pdev);
>> + if (!mc)
>> + return NULL;
>> +
>> + return mc->smmu;
>> +}
>> +
>> +static int tegra_smmu_configure(struct tegra_smmu *smmu, struct device *dev,
>> + struct of_phandle_args *args)
>> +{
>> + const struct iommu_ops *ops = smmu->iommu.ops;
>> + int err;
>> +
>> + err = iommu_fwspec_init(dev, &dev->of_node->fwnode, ops);
>> + if (err < 0) {
>> + dev_err(dev, "failed to initialize fwspec: %d\n", err);
>> + return err;
>> + }
>> +
>> + err = ops->of_xlate(dev, args);
>> + if (err < 0) {
>> + dev_err(dev, "failed to parse SW group ID: %d\n", err);
>> + iommu_fwspec_free(dev);
>> + return err;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> static struct iommu_device *tegra_smmu_probe_device(struct device *dev)
>> {
>> - struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
>> + struct device_node *np = dev->of_node;
>> + struct tegra_smmu *smmu = NULL;
>> + struct of_phandle_args args;
>> + unsigned int index = 0;
>> + int err;
>> +
>> + while (of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
>> + &args) == 0) {
>> + smmu = tegra_smmu_find(args.np);
>> + if (smmu) {
>> + err = tegra_smmu_configure(smmu, dev, &args);
>> + of_node_put(args.np);
>>
>> + if (err < 0)
>> + return ERR_PTR(err);
>> +
>> + break;
>> + }
>> +
>> + of_node_put(args.np);
>> + index++;
>> + }
>> +
>> + smmu = dev_iommu_priv_get(dev);
>> if (!smmu)
>> return ERR_PTR(-ENODEV);
>>
>> @@ -1028,6 +1088,16 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>> if (!smmu)
>> return ERR_PTR(-ENOMEM);
>>
>> + /*
>> + * This is a bit of a hack. Ideally we'd want to simply return this
>> + * value. However the IOMMU registration process will attempt to add
>> + * all devices to the IOMMU when bus_set_iommu() is called. In order
>> + * not to rely on global variables to track the IOMMU instance, we
>> + * set it here so that it can be looked up from the .probe_device()
>> + * callback via the IOMMU device's .drvdata field.
>> + */
>> + mc->smmu = smmu;
>> +
>> size = BITS_TO_LONGS(soc->num_asids) * sizeof(long);
>>
>> smmu->asids = devm_kzalloc(dev, size, GFP_KERNEL);
>>
>
> I found that this patch introduced a serious regression on Tegra30 using
> today's linux-next. Tegra30 has two 3d h/w blocks connected in SLI and
> only one of the blocks is now attached to IOMMU domain, meaning that GPU
> is unusable now. All 3d, 2d and display devices share the same "DRM"
> group on Tegra30.
>
> Nicolin, please let me know if have any suggestions. I may take a closer
> look a day later, for now I'll just revert this patch locally. Thanks in
> advance.
>
Actually, this was easy to fix:
diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 97eb62f667d2..639d5ceab60b 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -853,8 +853,6 @@ static struct iommu_device
*tegra_smmu_probe_device(struct device *dev)
if (err < 0)
return ERR_PTR(err);
-
- break;
}
of_node_put(args.np);
Nicolin, please make a proper patch. You may add my reported-by and
tested-by.
Powered by blists - more mailing lists