lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 11 Aug 2021 10:07:27 +0800
From:   "Leizhen (ThunderTown)" <thunder.leizhen@...wei.com>
To:     Will Deacon <will@...nel.org>
CC:     Robin Murphy <robin.murphy@....com>,
        Joerg Roedel <joro@...tes.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        iommu <iommu@...ts.linux-foundation.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC 0/8] iommu/arm-smmu-v3: add support for ECMDQ register
 mode



On 2021/8/11 2:35, Will Deacon wrote:
> On Sat, Jun 26, 2021 at 07:01:22PM +0800, Zhen Lei wrote:
>> SMMU v3.3 added a new feature, which is Enhanced Command queue interface
>> for reducing contention when submitting Commands to the SMMU, in this
>> patch set, ECMDQ is the abbreviation of Enhanced Command Queue.
>>
>> When the hardware supports ECMDQ and each core can exclusively use one ECMDQ,
>> each core does not need to compete with other cores when using its own ECMDQ.
>> This means that each core can insert commands in parallel. If each ECMDQ can
>> execute commands in parallel, the overall performance may be better. However,
>> our hardware currently does not support multiple ECMDQ execute commands in
>> parallel.
>>
>> In order to reuse existing code, I originally still call arm_smmu_cmdq_issue_cmdlist()
>> to insert commands. Even so, however, there was a performance improvement of nearly 12%
>> in strict mode.
>>
>> The test environment is the EMU, which simulates the connection of the 200 Gbit/s NIC.
>> Number of queues:    passthrough   lazy   strict(ECMDQ)  strict(CMDQ)
>>       6                  188        180       162           145        --> 11.7% improvement
>>       8                  188        188       184           183        --> 0.55% improvement
> 
> Sorry, I don't quite follow the numbers here. Why does the number of queues
> affect the classic "CMDQ" mode? We only have one queue there, right?

These queues indicates the network concurrency, maybe I should use channels or threads.
6 means six threads are deployed on different cores using their own channels to send
and receive network packets.

> 
>> In recent days, I implemented a new function without competition with other
>> cores to replace arm_smmu_cmdq_issue_cmdlist() when a core can have an ECMDQ.
>> I'm guessing it might get better performance results. Because the EMU is too
>> slow, it will take a while before the relevant data is available.
> 
> I'd certainly prefer to wait until we have something we know is
> representative. 

Yes, it would be better to have an actual set of performance data. Now the EMU is
used to analyze hardware problems. This test has not been numbered yet.

> However, I can take the first four prep patches now if you
> respin the second one. At least that's then less for you to carry.

Great. Thank you. I will respin the second one.

> 
> I'd also like review from the Arm side on this (and thank you for adopting
> the architecture unlike others seem to have done judging by the patches
> floating around).
> 
> Will
> .
> 

Powered by blists - more mailing lists