Date: Wed, 11 Aug 2021 10:07:27 +0800
From: "Leizhen (ThunderTown)" <thunder.leizhen@...wei.com>
To: Will Deacon <will@...nel.org>
CC: Robin Murphy <robin.murphy@....com>, Joerg Roedel <joro@...tes.org>,
    linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
    iommu <iommu@...ts.linux-foundation.org>,
    linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC 0/8] iommu/arm-smmu-v3: add support for ECMDQ register mode

On 2021/8/11 2:35, Will Deacon wrote:
> On Sat, Jun 26, 2021 at 07:01:22PM +0800, Zhen Lei wrote:
>> SMMU v3.3 added a new feature, which is Enhanced Command queue interface
>> for reducing contention when submitting Commands to the SMMU, in this
>> patch set, ECMDQ is the abbreviation of Enhanced Command Queue.
>>
>> When the hardware supports ECMDQ and each core can exclusively use one ECMDQ,
>> each core does not need to compete with other cores when using its own ECMDQ.
>> This means that each core can insert commands in parallel. If each ECMDQ can
>> execute commands in parallel, the overall performance may be better. However,
>> our hardware currently does not support multiple ECMDQ execute commands in
>> parallel.
>>
>> In order to reuse existing code, I originally still call arm_smmu_cmdq_issue_cmdlist()
>> to insert commands. Even so, however, there was a performance improvement of nearly 12%
>> in strict mode.
>>
>> The test environment is the EMU, which simulates the connection of the 200 Gbit/s NIC.
>> Number of queues:  passthrough  lazy  strict(ECMDQ)  strict(CMDQ)
>>         6              188      180       162           145      --> 11.7% improvement
>>         8              188      188       184           183      --> 0.55% improvement
>
> Sorry, I don't quite follow the numbers here. Why does the number of queues
> affect the classic "CMDQ" mode? We only have one queue there, right?

These queues indicates the network concurrency, maybe I should use channels or
threads. 6 means six threads are deployed on different cores using their own
channels to send and receive network packets.
>
>> In recent days, I implemented a new function without competition with other
>> cores to replace arm_smmu_cmdq_issue_cmdlist() when a core can have an ECMDQ.
>> I'm guessing it might get better performance results. Because the EMU is too
>> slow, it will take a while before the relevant data is available.
>
> I'd certainly prefer to wait until we have something we know is
> representative.

Yes, it would be better to have an actual set of performance data. Now the EMU
is used to analyze hardware problems. This test has not been numbered yet.

> However, I can take the first four prep patches now if you
> respin the second one. At least that's then less for you to carry.

Great. Thank you. I will respin the second one.

>
> I'd also like review from the Arm side on this (and thank you for adopting
> the architecture unlike others seem to have done judging by the patches
> floating around).
>
> Will
> .
>
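
For reference, the improvement percentages in the quoted table are consistent with the relative gain of the strict(ECMDQ) column over the strict(CMDQ) column. A minimal sketch of that arithmetic follows, assuming the table values are throughput figures where higher is better (the thread does not state the unit). The function name relative_gain_percent is made up for illustration and is not part of the patch set.

```python
def relative_gain_percent(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, in percent (hypothetical helper)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline * 100.0


# (concurrency, strict(ECMDQ), strict(CMDQ)) copied from the quoted table.
# Assumption: both columns are throughput numbers, so larger means faster.
rows = [(6, 162, 145), (8, 184, 183)]

gains = {n: relative_gain_percent(ecmdq, cmdq) for n, ecmdq, cmdq in rows}

for n, gain in gains.items():
    print(f"{n} threads: {gain:.2f}% improvement")
```

With six threads the gain works out to roughly 11.7%, and with eight threads to roughly 0.55%, matching the figures quoted in the cover letter.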