Date: Wed, 8 Jun 2022 10:27:56 -0600
From: Logan Gunthorpe <logang@...tatee.com>
To: linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
	Song Liu <song@...nel.org>
Cc: Christoph Hellwig <hch@...radead.org>,
	Donald Buczek <buczek@...gen.mpg.de>,
	Guoqing Jiang <guoqing.jiang@...ux.dev>,
	Xiao Ni <xni@...hat.com>,
	Stephen Bates <sbates@...thlin.com>,
	Martin Oliveira <Martin.Oliveira@...eticom.com>,
	David Sloan <David.Sloan@...eticom.com>,
	Logan Gunthorpe <logang@...tatee.com>,
	Christoph Hellwig <hch@....de>
Subject: [PATCH v4 11/11] md: Notify sysfs sync_completed in md_reap_sync_thread()

The mdadm test 07layouts randomly produces a kernel hung task deadlock.
The deadlock is caused by the suspend_lo/suspend_hi files being set by
the mdadm background process during reshape and not being cleared
because the process hangs. (Leaving aside the issue of the fragility of
freezing kernel tasks by buggy userspace processes...)

When the background mdadm process hangs, it is waiting (without a
timeout) on a change to the sync_completed file signalling that the
reshape has completed. The process is woken up a couple of times when
the reshape finishes, but it is woken up before MD_RECOVERY_RUNNING is
cleared, so sync_completed_show() reports 0 instead of "none".

To fix this, notify the sysfs file in md_reap_sync_thread() after
MD_RECOVERY_RUNNING has been cleared. This wakes up mdadm and causes it
to continue and write to suspend_lo/suspend_hi to allow IO to continue.

Signed-off-by: Logan Gunthorpe <logang@...tatee.com>
Reviewed-by: Christoph Hellwig <hch@....de>
---
 drivers/md/md.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index ec46b83adf29..6e583f41caff 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9476,6 +9476,7 @@ void md_reap_sync_thread(struct mddev *mddev, bool reconfig_mutex_held)
 	wake_up(&resync_wait);
 	/* flag recovery needed just to double check */
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
 	md_new_event();
 	if (mddev->event_work.func)
-- 
2.30.2
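
For readers unfamiliar with the userspace side of this race, here is a rough sketch of how a process might wait for sync_completed to read "none". It is not mdadm's actual code: the helper names attr_is_none() and wait_for_sync_none(), the buffer size, and the return-value convention are all made up for illustration. The only assumption about the kernel is the usual sysfs polling convention (read the attribute, then poll() for POLLPRI | POLLERR, then seek back to 0 and re-read).

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: re-read the attribute from offset 0.
 * Returns 1 if it reads "none", 0 for any other value, -1 on I/O error.
 */
int attr_is_none(int fd)
{
	char buf[64];
	ssize_t n;

	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	n = read(fd, buf, sizeof(buf) - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	buf[strcspn(buf, "\n")] = '\0';	/* strip the trailing newline */
	return strcmp(buf, "none") == 0;
}

/*
 * Hypothetical helper (not taken from mdadm): wait until the attribute
 * at `path` reads "none".  `timeout_ms` applies to each poll() call;
 * pass -1 to wait forever, which is the mode that can hang.
 * Returns 0 once "none" is seen, 1 on timeout, -1 on error.
 */
int wait_for_sync_none(const char *path, int timeout_ms)
{
	struct pollfd pfd;
	int fd, ret;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	for (;;) {
		/* Check the value first; sysfs needs a read before poll(). */
		ret = attr_is_none(fd);
		if (ret != 0) {
			ret = ret > 0 ? 0 : -1;
			break;
		}

		/* Any other value: sleep until the kernel notifies the file. */
		pfd.fd = fd;
		pfd.events = POLLPRI | POLLERR;
		pfd.revents = 0;
		ret = poll(&pfd, 1, timeout_ms);
		if (ret <= 0) {
			ret = ret == 0 ? 1 : -1;	/* timeout or poll error */
			break;
		}
	}

	close(fd);
	return ret;
}
```

With a loop of this shape and timeout_ms set to -1, a notification that arrives while MD_RECOVERY_RUNNING is still set only makes the waiter re-read a value other than "none" and go back to sleep. If no further notification follows, it sleeps forever, which is the kind of hang the commit message describes and the extra sysfs_notify_dirent_safe() call is meant to avoid.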