linux-kernel - Re: [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20211220192714.GA27303@Asurada-Nvidia>
Date:   Mon, 20 Dec 2021 11:27:14 -0800
From:   Nicolin Chen <nicolinc@...dia.com>
To:     Robin Murphy <robin.murphy@....com>
CC:     <joro@...tes.org>, <will@...nel.org>, <nicoleotsuka@...il.com>,
        <thierry.reding@...il.com>, <vdumpa@...dia.com>,
        <nwatterson@...dia.com>, <jean-philippe@...aro.org>,
        <thunder.leizhen@...wei.com>, <chenxiang66@...ilicon.com>,
        <Jonathan.Cameron@...wei.com>, <yuzenghui@...wei.com>,
        <linux-kernel@...r.kernel.org>, <iommu@...ts.linux-foundation.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-tegra@...r.kernel.org>, <jgg@...dia.com>
Subject: Re: [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA
 Grace CMDQ-V

Hi Robin,

Thank you for the reply!

On Mon, Dec 20, 2021 at 06:42:26PM +0000, Robin Murphy wrote:
> On 2021-11-19 07:19, Nicolin Chen wrote:
> > From: Nate Watterson <nwatterson@...dia.com>
> > 
> > NVIDIA's Grace Soc has a CMDQ-Virtualization (CMDQV) hardware,
> > which extends the standard ARM SMMU v3 IP to support multiple
> > VCMDQs with virtualization capabilities. In-kernel of host OS,
> > they're used to reduce contention on a single queue. In terms
> > of command queue, they are very like the standard CMDQ/ECMDQs,
> > but only support CS_NONE in the CS field of CMD_SYNC command.
> > 
> > This patch adds a new nvidia-grace-cmdqv file and inserts its
> > structure pointer into the existing arm_smmu_device, and then
> > adds related function calls in the arm-smmu-v3 driver.
> > 
> > In the CMDQV driver itself, this patch only adds minimal part
> > for host kernel support. Upon probe(), VINTF0 is reserved for
> > in-kernel use. And some of the VCMDQs are assigned to VINTF0.
> > Then the driver will select one of VCMDQs in the VINTF0 based
> > on the CPU currently executing, to issue commands.
> 
> Is there a tangible difference to DMA API or VFIO performance?

Our testing environment is currently running on a single-core
CPU, so unfortunately we don't have a perf data at this point.

> [...]
> > +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > +{
> > +     struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
> > +     struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
> > +     u16 qidx;
> > +
> > +     /* Check error status of vintf0 */
> > +     if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> > +             return &smmu->cmdq;
> > +
> > +     /*
> > +      * Select a vcmdq to use. Here we use a temporal solution to
> > +      * balance out traffic on cmdq issuing: each cmdq has its own
> > +      * lock, if all cpus issue cmdlist using the same cmdq, only
> > +      * one CPU at a time can enter the process, while the others
> > +      * will be spinning at the same lock.
> > +      */
> > +     qidx = smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;
> 
> How does ordering work between queues? Do they follow a global order
> such that a sync on any queue is guaranteed to complete all prior
> commands on all queues?

CMDQV internal scheduler would insert a SYNC when (for example)
switching from VCMDQ0 to VCMDQ1 while last command in VCMDQ0 is
not SYNC. HW has a configuration bit in the register to disable
this feature, which is by default enabled.

> The challenge to make ECMDQ useful to Linux is how to make sure that all
> the commands expected to be within scope of a future CMND_SYNC plus that
> sync itself all get issued on the same queue, so I'd be mildly surprised
> if you didn't have the same problem.

PATCH-3 in this series actually helps align the command queues,
between issued commands and SYNC, if bool sync == true. Yet, if
doing something like issue->issue->issue_with_sync, it could be
tricker.

Thanks
Nic