lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250722072431.610354-1-yukuai1@huaweicloud.com>
Date: Tue, 22 Jul 2025 15:24:25 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: dlemoal@...nel.org,
	hare@...e.de,
	tj@...nel.org,
	josef@...icpanda.com,
	axboe@...nel.dk,
	yukuai3@...wei.com
Cc: cgroups@...r.kernel.org,
	linux-block@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	yukuai1@...weicloud.com,
	yi.zhang@...wei.com,
	yangerkun@...wei.com,
	johnny.chenyi@...wei.com
Subject: [PATCH 0/6] blk-mq-sched: support request batch dispatching for sq elevator

From: Yu Kuai <yukuai3@...wei.com>

Currently, both mq-deadline and bfq have global spin lock that will be
grabbed inside elevator methods like dispatch_request, insert_requests,
and bio_merge. And the global lock is the main reason mq-deadline and
bfq can't scale very well.

For dispatch_request method, current behavior is dispatching one request at
a time. In the case of multiple dispatching contexts, This behavior, on the
one hand, introduce intense lock contention:

t1:                     t2:                     t3:
lock                    lock                    lock
// grab lock
ops.dispatch_request
unlock
                        // grab lock
                        ops.dispatch_request
                        unlock
                                                // grab lock
                                                ops.dispatch_request
                                                unlock

on the other hand, messing up the requests dispatching order:
t1:

lock
rq1 = ops.dispatch_request
unlock
                        t2:
                        lock
                        rq2 = ops.dispatch_request
                        unlock

lock
rq3 = ops.dispatch_request
unlock

                        lock
                        rq4 = ops.dispatch_request
                        unlock

//rq1,rq3 issue to disk
                        // rq2, rq4 issue to disk

In this case, the elevator dispatch order is rq 1-2-3-4, however,
such order in disk is rq 1-3-2-4, the order for rq2 and rq3 is inversed.

While dispatching request, blk_mq_get_disatpch_budget() and
blk_mq_get_driver_tag() must be called, and they are not ready to be
called inside elevator methods, hence introduce a new method like
dispatch_requests is not possible.

In conclusion, this set factor the global lock out of dispatch_request
method, and support request batch dispatch by calling the methods
multiple time while holding the lock.

nullblk setup:
modprobe null_blk nr_devices=0 &&
    udevadm settle &&
    cd /sys/kernel/config/nullb &&
    mkdir nullb0 &&
    cd nullb0 &&
    echo 0 > completion_nsec &&
    echo 512 > blocksize &&
    echo 0 > home_node &&
    echo 0 > irqmode &&
    echo 128 > submit_queues &&
    echo 1024 > hw_queue_depth &&
    echo 1024 > size &&
    echo 0 > memory_backed &&
    echo 2 > queue_mode &&
    echo 1 > power ||
    exit $?

Test script:
fio -filename=/dev/$disk -name=test -rw=randwrite -bs=4k -iodepth=32 \
  -numjobs=16 --iodepth_batch_submit=8 --iodepth_batch_complete=8 \
  -direct=1 -ioengine=io_uring -group_reporting -time_based -runtime=30

Test result: iops

|                 | deadline | bfq      |
| --------------- | -------- | -------- |
| before this set | 263k     | 124k     |
| after this set  | 475k     | 292k     |

Yu Kuai (6):
  mq-deadline: switch to use high layer elevator lock
  block, bfq: don't grab queue_lock from io path
  block, bfq: switch to use elevator lock
  elevator: factor elevator lock out of dispatch_request method
  blk-mq-sched: refactor __blk_mq_do_dispatch_sched()
  blk-mq-sched: support request batch dispatching for sq elevator

 block/bfq-cgroup.c   |   4 +-
 block/bfq-iosched.c  |  73 ++++++-------
 block/bfq-iosched.h  |   2 +-
 block/blk-ioc.c      |  43 +++++++-
 block/blk-mq-sched.c | 240 ++++++++++++++++++++++++++++++-------------
 block/blk-mq.h       |  21 ++++
 block/blk.h          |   2 +-
 block/elevator.c     |   1 +
 block/elevator.h     |   4 +-
 block/mq-deadline.c  |  58 +++++------
 10 files changed, 293 insertions(+), 155 deletions(-)

-- 
2.39.2


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ