Message-ID: <20240816141926.GA24676@willie-the-truck>
Date: Fri, 16 Aug 2024 15:19:26 +0100
From: Will Deacon <will@...nel.org>
To: Nicolin Chen <nicolinc@...dia.com>
Cc: robin.murphy@....com, joro@...tes.org, jgg@...dia.com,
thierry.reding@...il.com, vdumpa@...dia.com, jonathanh@...dia.com,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
linux-arm-kernel@...ts.infradead.org, linux-tegra@...r.kernel.org
Subject: Re: [PATCH v11 8/9] iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV

On Tue, Aug 06, 2024 at 07:11:53PM -0700, Nicolin Chen wrote:
> From: Nate Watterson <nwatterson@...dia.com>
>
> NVIDIA's Tegra241 SoC has CMDQ-Virtualization (CMDQV) hardware, extending
> the standard ARM SMMU v3 IP to support multiple VCMDQs with virtualization
> capabilities. In terms of command queue, they are very like a standard SMMU
> CMDQ (or ECMDQs), but only support CS_NONE in the CS field of CMD_SYNC.
>
> Add a new tegra241-cmdqv driver, and insert its structure pointer into the
> existing arm_smmu_device, and then add related function calls in the SMMUv3
> driver to interact with the CMDQV driver.
>
> In the CMDQV driver, add a minimal part for the in-kernel support: reserve
> VINTF0 for in-kernel use, and assign some of the VCMDQs to the VINTF0, and
> select one VCMDQ based on the current CPU ID to execute supported commands.
> This multi-queue design for in-kernel use gives some limited improvements:
> up to 20% reduction of invalidation time was measured by a multi-threaded
> DMA unmap benchmark, compared to a single queue.
>
> The other part of the CMDQV driver will be user-space support that lets a
> hypervisor running on the host OS talk to the driver for virtualization
> use cases, allowing VMs to use VCMDQs without trappings, i.e. no VM Exits.
> This is designed based on IOMMUFD, and its RFC series is also under review.
> It will provide a guest OS a bigger improvement: 70% to 90% reductions of
> TLB invalidation time were measured by DMA unmap tests running in a guest,
> compared to nested SMMU CMDQ (with trappings).
>
> As the initial version, the CMDQV driver only supports ACPI configurations.
>
> Signed-off-by: Nate Watterson <nwatterson@...dia.com>
> Reviewed-by: Jason Gunthorpe <jgg@...dia.com>
> Co-developed-by: Nicolin Chen <nicolinc@...dia.com>
> Signed-off-by: Nicolin Chen <nicolinc@...dia.com>
> ---
> MAINTAINERS | 1 +
> drivers/iommu/Kconfig | 11 +
> drivers/iommu/arm/arm-smmu-v3/Makefile | 1 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 +
> .../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 868 ++++++++++++++++++
> 5 files changed, 890 insertions(+)
> create mode 100644 drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
[...]
> +struct arm_smmu_device *
> +tegra241_cmdqv_acpi_dsdt_probe(struct arm_smmu_device *smmu,
> +			       struct acpi_iort_node *node)
> +{
> +	struct resource *res;
> +	int irq;
> +
> +	/* Keep the pointer smmu intact if !res */
> +	res = tegra241_cmdqv_find_acpi_resource(smmu, node, &irq);
> +	if (!res)
> +		return smmu;

Given that this probing code will end up running on non-tegra hardware
when CONFIG_TEGRA241_CMDQV is enabled, please can you move the common
part into the main driver?

Will