[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220823080106.194739462@linuxfoundation.org>
Date: Tue, 23 Aug 2022 10:26:00 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable@...r.kernel.org, Logan Gunthorpe <logang@...tatee.com>,
Christoph Hellwig <hch@....de>, Song Liu <song@...nel.org>,
Jens Axboe <axboe@...nel.dk>, Sasha Levin <sashal@...nel.org>
Subject: [PATCH 5.15 201/244] md: Notify sysfs sync_completed in md_reap_sync_thread()
From: Logan Gunthorpe <logang@...tatee.com>
[ Upstream commit 9973f0fa7d20269fe6fefe6333997fb5914449c1 ]
The mdadm test 07layouts randomly produces a kernel hung task deadlock.
The deadlock is caused by the suspend_lo/suspend_hi files being set by
the mdadm background process during reshape and not being cleared
because the process hangs. (Leaving aside the issue of the fragility of
freezing kernel tasks by buggy userspace processes...)
When the background mdadm process hangs it, is waiting (without a
timeout) on a change to the sync_completed file signalling that the
reshape has completed. The process is woken up a couple times when
the reshape finishes but it is woken up before MD_RECOVERY_RUNNING
is cleared so sync_completed_show() reports 0 instead of "none".
To fix this, notify the sysfs file in md_reap_sync_thread() after
MD_RECOVERY_RUNNING has been cleared. This wakes up mdadm and causes
it to continue and write to suspend_lo/suspend_hi to allow IO to
continue.
Signed-off-by: Logan Gunthorpe <logang@...tatee.com>
Reviewed-by: Christoph Hellwig <hch@....de>
Signed-off-by: Song Liu <song@...nel.org>
Signed-off-by: Jens Axboe <axboe@...nel.dk>
Signed-off-by: Sasha Levin <sashal@...nel.org>
---
drivers/md/md.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4bfaf7d4977d..33946adb0d6f 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9467,6 +9467,7 @@ void md_reap_sync_thread(struct mddev *mddev)
wake_up(&resync_wait);
/* flag recovery needed just to double check */
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+ sysfs_notify_dirent_safe(mddev->sysfs_completed);
sysfs_notify_dirent_safe(mddev->sysfs_action);
md_new_event(mddev);
if (mddev->event_work.func)
--
2.35.1
Powered by blists - more mailing lists