[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240702174152.GA4740@willie-the-truck>
Date: Tue, 2 Jul 2024 18:41:53 +0100
From: Will Deacon <will@...nel.org>
To: Nicolin Chen <nicolinc@...dia.com>
Cc: robin.murphy@....com, joro@...tes.org, jgg@...dia.com,
thierry.reding@...il.com, vdumpa@...dia.com, jonathanh@...dia.com,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
linux-arm-kernel@...ts.infradead.org, linux-tegra@...r.kernel.org
Subject: Re: [PATCH v9 5/6] iommu/arm-smmu-v3: Add in-kernel support for
NVIDIA Tegra241 (Grace) CMDQV
On Wed, Jun 12, 2024 at 02:45:32PM -0700, Nicolin Chen wrote:
> From: Nate Watterson <nwatterson@...dia.com>
>
> NVIDIA's Tegra241 Soc has a CMDQ-Virtualization (CMDQV) hardware, extending
> the standard ARM SMMU v3 IP to support multiple VCMDQs with virtualization
> capabilities. In terms of command queue, they are very like a standard SMMU
> CMDQ (or ECMDQs), but only support CS_NONE in the CS field of CMD_SYNC.
>
> Add a new tegra241-cmdqv driver, and insert its structure pointer into the
> existing arm_smmu_device, and then add related function calls in the SMMUv3
> driver to interact with the CMDQV driver.
>
> In the CMDQV driver, add a minimal part for the in-kernel support: reserve
> VINTF0 for in-kernel use, and assign some of the VCMDQs to the VINTF0, and
> select one VCMDQ based on the current CPU ID to execute supported commands.
> This multi-queue design for in-kernel use gives some limited improvements:
> up to 20% reduction of invalidation time was measured by a multi-threaded
> DMA unmap benchmark, compared to a single queue.
>
> The other part of the CMDQV driver will be user-space support that gives a
> hypervisor running on the host OS to talk to the driver for virtualization
> use cases, allowing VMs to use VCMDQs without trappings, i.e. no VM Exits.
> This is designed based on IOMMUFD, and its RFC series is also under review.
> It will provide a guest OS a bigger improvement: 70% to 90% reductions of
> TLB invalidation time were measured by DMA unmap tests running in a guest,
> compared to nested SMMU CMDQ (with trappings).
>
> However, it is very important for this in-kernel support to get merged and
> installed to VMs running on Grace-powered servers as soon as possible. So,
> later those servers would only need to upgrade their host kernels for the
> user-space support.
^^^ This is a weird paragraph to put in the commit message.
>
> As the initial version, the CMDQV driver only supports ACPI configurations.
>
> Signed-off-by: Nate Watterson <nwatterson@...dia.com>
> Reviewed-by: Jason Gunthorpe <jgg@...dia.com>
> Co-developed-by: Nicolin Chen <nicolinc@...dia.com>
> Signed-off-by: Nicolin Chen <nicolinc@...dia.com>
> ---
> MAINTAINERS | 1 +
> drivers/iommu/Kconfig | 11 +
> drivers/iommu/arm/arm-smmu-v3/Makefile | 1 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 52 +-
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 50 ++
> .../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 842 ++++++++++++++++++
> 6 files changed, 945 insertions(+), 12 deletions(-)
> create mode 100644 drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index aacccb376c28..ecf7af1b2df8 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -22078,6 +22078,7 @@ M: Thierry Reding <thierry.reding@...il.com>
> R: Krishna Reddy <vdumpa@...dia.com>
> L: linux-tegra@...r.kernel.org
> S: Supported
> +F: drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> F: drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> F: drivers/iommu/tegra*
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index c04584be3089..e009387d3cba 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -423,6 +423,17 @@ config ARM_SMMU_V3_KUNIT_TEST
> Enable this option to unit-test arm-smmu-v3 driver functions.
>
> If unsure, say N.
> +
> +config TEGRA241_CMDQV
> + bool "NVIDIA Tegra241 CMDQ-V extension support for ARM SMMUv3"
> + depends on ACPI
> + help
> + Support for NVIDIA CMDQ-Virtualization extension for ARM SMMUv3. The
> + CMDQ-V extension is similar to v3.3 ECMDQ for multi command queues
> + support, except with virtualization capabilities.
> +
> + Say Y here if your system is NVIDIA Tegra241 (Grace) or it has the same
> + CMDQ-V extension.
> endif
>
> config S390_IOMMU
> diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile
> index 014a997753a8..55201fdd7007 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/Makefile
> +++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
> @@ -2,6 +2,7 @@
> obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o
> arm_smmu_v3-objs-y += arm-smmu-v3.o
> arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o
> +arm_smmu_v3-objs-$(CONFIG_TEGRA241_CMDQV) += tegra241-cmdqv.o
> arm_smmu_v3-objs := $(arm_smmu_v3-objs-y)
>
> obj-$(CONFIG_ARM_SMMU_V3_KUNIT_TEST) += arm-smmu-v3-test.o
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index ba0e24d5ffbf..430e84fe3679 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -334,6 +334,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>
> static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
> {
> + if (arm_smmu_has_tegra241_cmdqv(smmu))
> + return tegra241_cmdqv_get_cmdq(smmu);
> +
> return &smmu->cmdq;
Hardcoding all these tegra-specific checks in the core driver is pretty
horrible :/
Instead, please can we do something similar to the SMMUv2 driver? That
is, tweak the probe routine to call something akin to the
arm_smmu_impl_init() function, which looks at the 'model' field pulled
out of the IORT and can then dispatch directly to a tegra-specific init
function (see, e.g. nvidia_smmu_impl_init() for SMMUv2).
>From there, you can both install function pointers into the
'arm_smmu_device' structure which can be used instead of having the the
'if (tegra)' checks in the main driver and you can also re-allocate the
structu to live inside a private structure instead of having the
backpointer.
Maybe those cmdq function pointers would be happy for other extensions
too (e.g. the ECMDQ stuff at [1]).
Will
[1] https://lore.kernel.org/r/20240425144152.52352-3-tanmay@marvell.com
Powered by blists - more mailing lists