Message-ID: <20210810183529.GC3296@willie-the-truck>
Date:   Tue, 10 Aug 2021 19:35:30 +0100
From:   Will Deacon <will@...nel.org>
To:     Zhen Lei <thunder.leizhen@...wei.com>
Cc:     Robin Murphy <robin.murphy@....com>,
        Joerg Roedel <joro@...tes.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        iommu <iommu@...ts.linux-foundation.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC 0/8] iommu/arm-smmu-v3: add support for ECMDQ
 register mode

On Sat, Jun 26, 2021 at 07:01:22PM +0800, Zhen Lei wrote:
> SMMU v3.3 added a new feature, the Enhanced Command Queue interface, which
> reduces contention when submitting commands to the SMMU. In this patch set,
> ECMDQ is the abbreviation of Enhanced Command Queue.
> 
> When the hardware supports ECMDQ and each core can exclusively use one ECMDQ,
> each core does not need to compete with other cores when using its own ECMDQ.
> This means that each core can insert commands in parallel. If each ECMDQ can
> execute commands in parallel, the overall performance may be better. However,
> our hardware currently does not support multiple ECMDQs executing commands in
> parallel.
> 
> In order to reuse existing code, I initially still called arm_smmu_cmdq_issue_cmdlist()
> to insert commands. Even so, there was a performance improvement of nearly 12%
> in strict mode.
> 
> The test environment is the EMU, which simulates the connection of the 200 Gbit/s NIC.
>
> Number of queues   passthrough   lazy   strict(ECMDQ)   strict(CMDQ)   improvement
>        6               188        180        162             145           11.7%
>        8               188        188        184             183           0.55%
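
For reference, the improvement column can be reproduced from the strict-mode
figures with the short sketch below. It assumes the percentage is the relative
gain of strict(ECMDQ) over strict(CMDQ); the helper name improvement_pct is
made up for illustration, and the units of the measurements are not stated in
the cover letter.

```python
# Sketch: recompute the quoted improvement percentages.
# Assumption: improvement = (ECMDQ - CMDQ) / CMDQ, in strict mode.

def improvement_pct(new, baseline):
    """Relative gain of `new` over `baseline`, as a percentage."""
    return (new - baseline) / baseline * 100.0

# number of queues -> (strict(ECMDQ), strict(CMDQ)), taken from the table
strict_results = {6: (162, 145), 8: (184, 183)}

for queues, (ecmdq, cmdq) in strict_results.items():
    print(f"{queues} queues: {improvement_pct(ecmdq, cmdq):.2f}% improvement")
```

This yields roughly 11.7% for 6 queues and 0.55% for 8 queues, matching the
quoted figures.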

Sorry, I don't quite follow the numbers here. Why does the number of queues
affect the classic "CMDQ" mode? We only have one queue there, right?

> In recent days, I implemented a new function that does not contend with other
> cores, to replace arm_smmu_cmdq_issue_cmdlist() when a core has its own ECMDQ.
> I'm guessing it might give better performance results. Because the EMU is too
> slow, it will take a while before the relevant data is available.

I'd certainly prefer to wait until we have something we know is
representative. However, I can take the first four prep patches now if you
respin the second one. At least that's then less for you to carry.

I'd also like review from the Arm side on this (and thank you for adopting
the architecture unlike others seem to have done judging by the patches
floating around).

Will
