lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <af6f6d3d-b6ea-be46-d907-73fa4aea7b80@kernel.dk>
Date:   Thu, 19 Jan 2023 21:12:42 -0700
From:   Jens Axboe <axboe@...nel.dk>
To:     Saeed Mirzamohammadi <saeed.mirzamohammadi@...cle.com>,
        io-uring@...r.kernel.org
Cc:     asml.silence@...il.com, linux-kernel@...r.kernel.org
Subject: Re: Phoronix pts fio io_uring test regression report on upstream v6.1
 and v5.15

On 1/19/23 2:36?PM, Saeed Mirzamohammadi wrote:
> Hello,
>  
> I'm reporting a performance regression after the commit below on
> phoronix pts/fio test and with the config that is added in the end of
> this email:
> 
> Link: https://lore.kernel.org/all/20210913131123.597544850@linuxfoundation.org/
> 
> commit 7b3188e7ed54102a5dcc73d07727f41fb528f7c8
> Author: Jens Axboe axboe@...nel.dk
> Date:   Mon Aug 30 19:37:41 2021 -0600
>  
>     io_uring: IORING_OP_WRITE needs hash_reg_file set
>  
> We observed regression on the latest v6.1.y and v5.15.y upstream
> kernels (Haven't tested other stable kernels). We noticed that
> performance regression improved 45% after the revert of the commit
> above.
>  
> All of the benchmarks below have experienced around ~45% regression.
> phoronix-pts-fio-1.15.0-RandomWrite-EngineIO_uring-BufferedNo-DirectYes-BlockSize4KB-MB-s_xfs
> phoronix-pts-fio-1.15.0-SequentialWrite-EngineIO_uring-BufferedNo-DirectYes-BlockSize4KB-MB-s_xfs
> phoronix-pts-fio-1.15.0-SequentialWrite-EngineIO_uring-BufferedYes-DirectNo-BlockSize4KB-MB-s_xfs
>  
> We tend to see this regression on 4KB BlockSize tests.
>  
> We tried out changing force_async but that has no effect on the
> result. Also, backported a modified version of the patch mentioned
> here (https://lkml.org/lkml/2022/7/20/854) but that didn't affect
> performance.
>  
> Do you have any suggestions on any fixes or what else we can try to
> narrow down the issue?

This is really mostly by design - the previous approach of not hashing
buffered writes on regular files would cause a lot of inode lock
contention due to lots of threads hammering on that.

That said, for XFS, we don't need to serialize on O_DIRECT writes. Don't
think we currently have a way to detect this as it isn't really
advertised. Something like the below might work, with the caveat that
this is totally untested.


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 595a5bcf46b9..85fdc6f2efa4 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1171,7 +1171,8 @@ xfs_file_open(
 {
 	if (xfs_is_shutdown(XFS_M(inode->i_sb)))
 		return -EIO;
-	file->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC;
+	file->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC |
+			FMODE_ODIRECT_PARALLEL;
 	return generic_file_open(inode, file);
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c1769a2c5d70..8541b9e53c2d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -166,6 +166,9 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 /* File supports DIRECT IO */
 #define	FMODE_CAN_ODIRECT	((__force fmode_t)0x400000)
 
+/* File supports parallel O_DIRECT writes */
+#define	FMODE_ODIRECT_PARALLEL	((__force fmode_t)0x800000)
+
 /* File was opened by fanotify and shouldn't generate fanotify events */
 #define FMODE_NONOTIFY		((__force fmode_t)0x4000000)
 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e680685e8a00..1409f6f69b13 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -424,7 +424,12 @@ static void io_prep_async_work(struct io_kiocb *req)
 		req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
 
 	if (req->flags & REQ_F_ISREG) {
-		if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
+		bool should_hash = def->hash_reg_file;
+
+		if (should_hash && (req->file->f_flags & O_DIRECT) &&
+		    (req->file->f_mode & FMODE_ODIRECT_PARALLEL))
+			should_hash = false;
+		if (should_hash || (ctx->flags & IORING_SETUP_IOPOLL))
 			io_wq_hash_work(&req->work, file_inode(req->file));
 	} else if (!req->file || !S_ISBLK(file_inode(req->file)->i_mode)) {
 		if (def->unbound_nonreg_file)

-- 
Jens Axboe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ