linux-ext4 - Re: task hung in ext4

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ab4f311b-9700-4d3d-8f2e-09ccbcfb3df5@kernel.dk>
Date:   Tue, 24 Oct 2023 12:35:26 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Dave Chinner <david@...morbit.com>,
        Andres Freund <andres@...razel.de>
Cc:     Theodore Ts'o <tytso@....edu>,
        Thorsten Leemhuis <regressions@...mhuis.info>,
        Shreeya Patel <shreeya.patel@...labora.com>,
        linux-ext4@...r.kernel.org,
        Ricardo Cañuelo <ricardo.canuelo@...labora.com>,
        gustavo.padovan@...labora.com, zsm@...gle.com, garrick@...gle.com,
        Linux regressions mailing list <regressions@...ts.linux.dev>,
        io-uring@...r.kernel.org
Subject: Re: task hung in ext4_fallocate #2

On 10/24/23 8:30 AM, Jens Axboe wrote:
> I don't think this is related to the io-wq workers doing non-blocking
> IO. The callback is eventually executed by the task that originally
> submitted the IO, which is the owner and not the async workers. But...
> If that original task is blocked in eg fallocate, then I can see how
> that would potentially be an issue.
> 
> I'll take a closer look.

I think the best way to fix this is likely to have inode_dio_wait() be
interruptible, and return -ERESTARTSYS if it should be restarted. Now
the below is obviously not a full patch, but I suspect it'll make ext4
and xfs tick, because they should both be affected.

Andres, any chance you can throw this into the testing mix?


diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 202c76996b62..0d946b6d36fe 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4747,7 +4747,9 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	}
 
 	/* Wait all existing dio workers, newcomers will block on i_rwsem */
-	inode_dio_wait(inode);
+	ret = inode_dio_wait(inode);
+	if (ret)
+		goto out;
 
 	ret = file_modified(file);
 	if (ret)
diff --git a/fs/inode.c b/fs/inode.c
index 84bc3c76e5cc..c4eca812b16b 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2417,17 +2417,24 @@ EXPORT_SYMBOL(inode_owner_or_capable);
 /*
  * Direct i/o helper functions
  */
-static void __inode_dio_wait(struct inode *inode)
+static int __inode_dio_wait(struct inode *inode)
 {
 	wait_queue_head_t *wq = bit_waitqueue(&inode->i_state, __I_DIO_WAKEUP);
 	DEFINE_WAIT_BIT(q, &inode->i_state, __I_DIO_WAKEUP);
+	int ret = 0;
 
 	do {
-		prepare_to_wait(wq, &q.wq_entry, TASK_UNINTERRUPTIBLE);
-		if (atomic_read(&inode->i_dio_count))
-			schedule();
+		prepare_to_wait(wq, &q.wq_entry, TASK_INTERRUPTIBLE);
+		if (!atomic_read(&inode->i_dio_count))
+			break;
+		schedule();
+		if (signal_pending(current)) {
+			ret = -ERESTARTSYS;
+			break;
+		}
 	} while (atomic_read(&inode->i_dio_count));
 	finish_wait(wq, &q.wq_entry);
+	return ret;
 }
 
 /**
@@ -2440,10 +2447,11 @@ static void __inode_dio_wait(struct inode *inode)
  * Must be called under a lock that serializes taking new references
  * to i_dio_count, usually by inode->i_mutex.
  */
-void inode_dio_wait(struct inode *inode)
+int inode_dio_wait(struct inode *inode)
 {
 	if (atomic_read(&inode->i_dio_count))
-		__inode_dio_wait(inode);
+		return __inode_dio_wait(inode);
+	return 0;
 }
 EXPORT_SYMBOL(inode_dio_wait);
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 203700278ddb..8ea0c414b173 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -936,7 +936,9 @@ xfs_file_fallocate(
 	 * the on disk and in memory inode sizes, and the operations that follow
 	 * require the in-memory size to be fully up-to-date.
 	 */
-	inode_dio_wait(inode);
+	error = inode_dio_wait(inode);
+	if (error)
+		goto out_unlock;
 
 	/*
 	 * Now AIO and DIO has drained we flush and (if necessary) invalidate
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4a40823c3c67..7dff3167cb0c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2971,7 +2971,7 @@ static inline ssize_t blockdev_direct_IO(struct kiocb *iocb,
 }
 #endif
 
-void inode_dio_wait(struct inode *inode);
+int inode_dio_wait(struct inode *inode);
 
 /**
  * inode_dio_begin - signal start of a direct I/O requests

-- 
Jens Axboe