Message-ID: <1534665071-7976-1-git-send-email-thunder.leizhen@huawei.com>
Date:   Sun, 19 Aug 2018 15:51:09 +0800
From:   Zhen Lei <thunder.leizhen@...wei.com>
To:     Robin Murphy <robin.murphy@....com>,
        Will Deacon <will.deacon@....com>,
        Joerg Roedel <joro@...tes.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        iommu <iommu@...ts.linux-foundation.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
CC:     Zhen Lei <thunder.leizhen@...wei.com>,
        LinuxArm <linuxarm@...wei.com>,
        Hanjun Guo <guohanjun@...wei.com>,
        Libin <huawei.libin@...wei.com>,
        "John Garry" <john.garry@...wei.com>
Subject: [PATCH v4 0/2] bugfix and optimization about CMD_SYNC

v3 -> v4:
1. Add a new function, arm_smmu_cmdq_build_sync_msi_cmd; it is only used to
   build a CMD_SYNC for CS=SIG_IRQ mode.
2. To show the effect of the optimization, I ran 5 tests for each case.
   Although the results fluctuate from run to run, they still make clear
   which cases are better.

Test command: fio -numjobs=8 -rw=randread -runtime=30 ... -bs=4k
Test Result: IOPS

Case 1: (without these patches)
675480
672055
665275
648610
661146

Case 2: (apply only a variant of patch 1 that moves arm_smmu_cmdq_build_cmd inside the lock)
688714
697355
632951
700540
678459

Case 3: (only apply patch 1)
721582
729226
689574
679710
727770

Case 4: (apply both patch 1 and patch 2)
734077
742868
738194
682544
740586
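
(For reference, averaging the five runs of each case gives roughly 664.5k,
679.6k, 709.6k and 727.7k IOPS for Cases 1-4, i.e. about a 9.5% gain for
Case 4 over the unpatched baseline.)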

v2 -> v3:
I have no data showing how much performance is lost when
arm_smmu_cmdq_build_cmd is called with the spinlock held, but it is clear
that performance must drop: the function does a memset and a complicated
switch..case, all inside the critical section.
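
For v4, this cost is kept out of the critical section by composing the
MSI-signalled CMD_SYNC with the small dedicated helper named in change 1
above, so only a few bitfield ORs need to happen with the lock held. A rough
sketch of the idea is below; the CMDQ_SYNC_0_* / CMDQ_SYNC_1_* macros and the
exact field layout are illustrative, not necessarily what the driver defines:

/*
 * Sketch only: compose a CMD_SYNC with CS=SIG_IRQ (MSI completion),
 * avoiding the memset and switch..case of arm_smmu_cmdq_build_cmd.
 * Field macro names here are illustrative.
 */
static inline
void arm_smmu_cmdq_build_sync_msi_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
{
	cmd[0]  = CMDQ_OP_CMD_SYNC;
	cmd[0] |= CMDQ_SYNC_0_CS_IRQ | CMDQ_SYNC_0_MSH_ISH | CMDQ_SYNC_0_MSIATTR_OIWB;
	cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
	cmd[1]  = ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
}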

v1 -> v2:
1. Move the call to arm_smmu_cmdq_build_cmd into the critical section,
   and keep the function itself unchanged.
2. Although patch 2 ensures that no two CMD_SYNCs will be adjacent,
   patch 1 is still needed, as the scenario below shows:

cpu0			cpu1			cpu2
msidata=0
			msidata=1
			insert cmd1
						insert a TLBI command
insert cmd0
			smmu execute cmd1
						smmu execute TLBI
smmu execute cmd0
			poll timeout, because msidata=1 is overwritten
			by cmd0's MSI write-back: VAL=0 while sync_idx=1.
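
Put another way, patch 1 must guarantee that msidata values reach the queue
in the order they are allocated. Below is a rough sketch of the issue path
under that rule; arm_smmu_cmdq_insert_cmd and arm_smmu_sync_poll_msi are
illustrative names rather than necessarily the driver's:

static int arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
{
	u64 cmd[CMDQ_ENT_DWORDS];
	struct arm_smmu_cmdq_ent ent = { .opcode = CMDQ_OP_CMD_SYNC };
	unsigned long flags;

	spin_lock_irqsave(&smmu->cmdq.lock, flags);
	/*
	 * Allocate msidata and insert the CMD_SYNC under the same lock,
	 * so a newer msidata can never be queued ahead of an older one
	 * and the MSI write-back value increases in queue order.
	 */
	ent.sync.msidata = ++smmu->sync_nr;
	arm_smmu_cmdq_build_sync_msi_cmd(cmd, &ent);
	arm_smmu_cmdq_insert_cmd(smmu, cmd);
	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);

	return arm_smmu_sync_poll_msi(smmu, ent.sync.msidata);
}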

Zhen Lei (2):
  iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
  iommu/arm-smmu-v3: avoid redundant CMD_SYNCs if possible

 drivers/iommu/arm-smmu-v3.c | 44 ++++++++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 12 deletions(-)

-- 
1.8.3

