Message-ID: <20250618050401.507344-1-xue01.he@samsung.com>
Date: Wed, 18 Jun 2025 13:04:01 +0800
From: hexue <xue01.he@...sung.com>
To: axboe@...nel.dk
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [RFC] block: The Effectiveness of Plug Optimization?

The plug mechanism batches and merges block I/O (bios) to reduce the frequency of
I/O submission and improve throughput. It greatly reduces seek overhead on HDDs and
plays a key role in optimizing I/O speed. However, as storage devices have become
faster, high-performance SSDs combined with asynchronous submission mechanisms such
as io_uring already achieve very high I/O processing rates, and the delay introduced
by flow control and bio merging may reduce throughput to some extent.
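
For reference, a minimal sketch of the batching pattern from a submitter's point of
view, using the standard blk_start_plug()/blk_finish_plug() API from
<linux/blkdev.h> (illustrative only, not code from this RFC):

#include <linux/blkdev.h>
#include <linux/bio.h>

/*
 * Requests built from bios submitted between blk_start_plug() and
 * blk_finish_plug() are held on a per-task plug list, where they can be
 * merged before the whole batch is flushed to the driver.
 */
static void submit_batch(struct bio **bios, int nr)
{
	struct blk_plug plug;
	int i;

	blk_start_plug(&plug);
	for (i = 0; i < nr; i++)
		submit_bio(bios[i]);
	blk_finish_plug(&plug);
}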

In my tests, plugging adds overhead for highly concurrent random I/O and 128K
sequential I/O on SSDs. It still provides some benefit for small-block (4K)
sequential I/O, which is of course the workload best suited to merging, but the
current plug logic does not distinguish between these usage scenarios.

I made aggressive modifications to the kernel code to disable the plug mechanism
during I/O submission. The following are the performance differences between the
unmodified kernel, disabling only merging, and completely disabling the plug (both
merging and flow control); a rough sketch of the "disable merge" case is shown
after the table:

------------------------------------------------------------------------------------
PCIe Gen4 SSD 
16GB Mem
Seq 128K
Random 4K
cmd: 
taskset -c 0 ./t/io_uring -b 131072 -d128 -c32 -s32 -R0 -p1 -F1 -B1 -n1 -r5 /dev/nvme0n1
taskset -c 0 ./t/io_uring -b 4096 -d128 -c32 -s32 -R1 -p1 -F1 -B1 -n1 -r5 /dev/nvme0n1
data unit: IOPS
------------------------------------------------------------------------------------
             Enable plug          Disable merge           Disable plug    delta (plug vs no plug)
Seq IO       50100                50133                   50125
Random IO    821K                 824K                    836K            -1.83%
------------------------------------------------------------------------------------
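
This is not the actual patch used for the numbers above, but one way to approximate
the "disable merge" column is to tag every submitted bio with REQ_NOMERGE (part of
REQ_NOMERGE_FLAGS), so the block layer skips merge attempts for it:

#include <linux/bio.h>
#include <linux/blk_types.h>

/*
 * Illustrative sketch only: REQ_NOMERGE is one of REQ_NOMERGE_FLAGS, so a
 * bio carrying it is never considered for merging, while plugging itself
 * (batching and flow control) stays in place.
 */
static void submit_bio_nomerge(struct bio *bio)
{
	bio->bi_opf |= REQ_NOMERGE;
	submit_bio(bio);
}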

I repeated the test on a faster setup (a PCIe Gen5 server with a PCIe Gen5 SSD) to
verify this and found that the gap widened further.

------------------------------------------------------------------------------------
             Enable plug          Disable merge           Disable plug    delta (plug vs no plug)
Seq IO       88938                89832                   89869
Random IO    1.02M                1.022M                  1.06M           -3.92%
------------------------------------------------------------------------------------

In the current kernel, there are flags (REQ_NOMERGE_FLAGS) that control whether an
I/O can be merged. The decision to plug, however, is based solely on whether batched
submission is in use (state->need_plug = max_ios > 2;). I am wondering whether this
criterion is still appropriate for high-speed SSDs.
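
As one possible sketch of "changing the criteria" (see the questions below), the
decision quoted above lives in io_uring's io_submit_state_start(). The per-ring
IORING_SETUP_NO_PLUG flag and the extra ctx argument below are hypothetical, shown
only to illustrate gating the plug on something besides the batch size:

#include <linux/io_uring_types.h>

/*
 * Hypothetical sketch, not existing kernel code: IORING_SETUP_NO_PLUG and
 * the ctx argument do not exist today.  The idea is simply that need_plug
 * could consider more than "are we batching more than two submissions".
 */
static void io_submit_state_start(struct io_submit_state *state,
				  struct io_ring_ctx *ctx,
				  unsigned int max_ios)
{
	state->plug_started = false;
	state->need_plug = max_ios > 2 &&
			   !(ctx->flags & IORING_SETUP_NO_PLUG);
	state->submit_nr = max_ios;
	/* remaining state initialization as in the current function */
}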

So the discussion points are:
	- Will plugging gradually become unnecessary as hardware devices get faster?
	- Would it be reasonable to make flow control an optional configuration, or to
	  change the criteria for deciding when to apply the plug?
	- Are there other thoughts on plugging worth discussing now?

Thanks,
Xue He

