Message-Id: <20220916113428.774061-1-yukuai1@huaweicloud.com>
Date:   Fri, 16 Sep 2022 19:34:23 +0800
From:   Yu Kuai <yukuai1@...weicloud.com>
To:     song@...nel.org, logang@...tatee.com, guoqing.jiang@...ux.dev,
        pmenzel@...gen.mpg.de
Cc:     linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org,
        yukuai3@...wei.com, yukuai1@...weicloud.com, yi.zhang@...wei.com
Subject: [PATCH v3 0/5] md/raid10: reduce lock contention for io

From: Yu Kuai <yukuai3@...wei.com>

Changes in v3:
 - split a patch from patch 1
 - only modify hot path in patch 3
 - wake up barrier if 'nr_pending' is decreased to 0 in
 wait_barrier_nolock(); otherwise raise_barrier() might hang (sketched below).
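
For context, here is a rough sketch of the lockless fast path this change
refers to. The helper name wait_barrier_nolock() and the field names follow
the patch descriptions; the simplified struct and the function body below
are illustrative only, not the actual diff:

    #include <linux/seqlock.h>
    #include <linux/atomic.h>
    #include <linux/wait.h>

    /* simplified stand-in for the relevant r10conf fields */
    struct r10conf_sketch {
            seqlock_t resync_lock;          /* patch 5: spinlock -> seqlock */
            int barrier;
            atomic_t nr_pending;
            wait_queue_head_t wait_barrier;
    };

    /* lockless fast path; returns true if an I/O reference was taken */
    static bool wait_barrier_nolock(struct r10conf_sketch *conf)
    {
            unsigned int seq = read_seqbegin(&conf->resync_lock);

            if (READ_ONCE(conf->barrier))
                    return false;

            atomic_inc(&conf->nr_pending);
            smp_mb__after_atomic();

            if (!READ_ONCE(conf->barrier) &&
                !read_seqretry(&conf->resync_lock, seq))
                    return true;

            /*
             * A barrier was raised concurrently, so back out.  raise_barrier()
             * may already be sleeping until nr_pending drops to 0, so whoever
             * makes it 0 must issue the wake_up(); otherwise raise_barrier()
             * would sleep forever.
             */
            if (atomic_dec_and_test(&conf->nr_pending))
                    wake_up(&conf->wait_barrier);

            return false;
    }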

Changes in v2:
 - add patch 1, as suggested by Logan Gunthorpe.
 - in patch 4, use write_seqlock/unlock instead of spin_lock/unlock inside
 wait_event(), which would confuse lockdep (see the write-side sketch below).
 - in patch 4, use read_seqbegin() to read the seqcount instead of the
 unusual raw_read_seqcount().
 - aarch64 test results differ from v1 because they were retested in a
 different environment.
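
To illustrate the lockdep point: the barrier write side holds the (now seq-)
lock across the state change and only drops it around the actual sleep via
wait_event_cmd(), so lockdep sees a normal acquire/release pair, while
read_seqbegin() on the reader side (as in the sketch above) pairs with it.
Again, this is a simplified guess based on the patch titles, not the actual
code, reusing the sketch struct above:

    /* sketch of raising the barrier under the seqlock */
    static void raise_barrier_sketch(struct r10conf_sketch *conf)
    {
            write_seqlock_irq(&conf->resync_lock);

            /* block any new I/O from starting */
            WRITE_ONCE(conf->barrier, conf->barrier + 1);

            /* wait for pending I/O to drain, dropping the lock while asleep */
            wait_event_cmd(conf->wait_barrier,
                           !atomic_read(&conf->nr_pending),
                           write_sequnlock_irq(&conf->resync_lock),
                           write_seqlock_irq(&conf->resync_lock));

            write_sequnlock_irq(&conf->resync_lock);
    }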

This patchset tries to avoid holding two locks unconditionally in the
hot path.
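
Judging by the patch titles, the two locks are conf->resync_lock, which
wait_barrier() takes for every I/O, and the wait-queue spinlock that wake_up()
takes in allow_barrier() even when nobody is sleeping. The seqlock read path
sketched above removes the first; for the second, a wq_has_sleeper() guard
along the lines below (an illustrative guess at patch 3, reusing the sketch
struct) skips the lock in the common case:

    static void wake_up_barrier(struct r10conf_sketch *conf)
    {
            /* only take the wait-queue lock if someone is actually waiting */
            if (wq_has_sleeper(&conf->wait_barrier))
                    wake_up(&conf->wait_barrier);
    }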

Test environment:

Architecture:
aarch64 Huawei KUNPENG 920
x86 Intel(R) Xeon(R) Platinum 8380

RAID10 initialization:
mdadm --create /dev/md0 --level 10 --bitmap none --raid-devices 4 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1

Test cmd:
(taskset -c 0-15) fio -name=0 -ioengine=libaio -direct=1 -group_reporting=1 -randseed=2022 -rwmixread=70 -refill_buffers -filename=/dev/md0 -numjobs=16 -runtime=60s -bs=4k -iodepth=256 -rw=randread

Test result:

aarch64:
before this patchset:           3.2 GiB/s
bind node before this patchset: 6.9 GiB/s
after this patchset:            7.9 GiB/s
bind node after this patchset:  8.0 GiB/s

x86 (bind node not tested yet):
before this patchset: 7.0 GiB/s
after this patchset:  9.3 GiB/s

Please note that on the aarch64 test machine, cross-node memory access
latency is much worse than local-node latency, which is why bandwidth is
much better when the test is bound to one node.

Yu Kuai (5):
  md/raid10: Factor out code from wait_barrier() to
    stop_waiting_barrier()
  md/raid10: don't modify 'nr_waiting' in wait_barrier() for the
    nowait case
  md/raid10: prevent unnecessary calls to wake_up() in fast path
  md/raid10: fix improper BUG_ON() in raise_barrier()
  md/raid10: convert resync_lock to use seqlock

 drivers/md/raid10.c | 149 ++++++++++++++++++++++++++++----------------
 drivers/md/raid10.h |   2 +-
 2 files changed, 96 insertions(+), 55 deletions(-)

-- 
2.31.1
