Message-ID: <18619.20642.697691.541752@notabene.brown>
Date:	Mon, 1 Sep 2008 12:17:06 +1000
From:	Neil Brown <neilb@...e.de>
To:	Alistair John Strachan <alistair@...zero.co.uk>
Cc:	linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
	"Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: md (regression): reboot/shutdown hangs

On Thursday August 28, alistair@...zero.co.uk wrote:
> Hi Neil,
> 
> Commit 2b25000bf5157c28d8591f03f0575248a8cbd900 ("Restore force switch of md 
> array to readonly at reboot time.") causes a reboot/shutdown to hang 
> indefinitely on my box. Reverting this single commit makes the problem go 
> away. It was first released with 2.6.27-rc3, I believe, and so this is a 
> regression vs 2.6.26 (Rafael CCed).
> 
> I think the problem might be because my rootfs is on a RAID5 and my distro 
> fails to stop it completely before halt/reboot.
> 
> Please let me know if there's any more information you need from me.

Thanks for the report.

I'm having trouble figuring out why this ever worked.  I must be
missing something.

I can only reproduce a hang when calling reboot when a sync is needed.
I dirty a file and then 
   reboot -f -n

This will always have hung, except in the window between an earlier
commit, which broke something, and the commit that you mention, which
fixed it.

This is because the reboot calls do_md_stop while holding the mddev
lock, and do_md_stop calls invalidate_partition.  If this finds any dirty
data to flush, the writeout will (most likely) need to mark the
superblock as dirty first, which cannot happen while the mddev lock is
held.  So we get a deadlock.
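
The lock ordering described above can be illustrated with a small userspace sketch. This is not kernel code: mddev_lock, mark_superblock_dirty(), flush_dirty_data() and stop_array() are hypothetical stand-ins, and a pthread mutex with trylock replaces the real mddev lock so that the "would block forever" case is reported as EBUSY instead of actually hanging. It also assumes, for simplicity, that the superblock update is attempted from the same thread that holds the lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the per-array mddev lock. */
static pthread_mutex_t mddev_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Models the superblock update that writeout needs before it can proceed.
 * It must take mddev_lock. trylock is used so a would-be deadlock is
 * reported as EBUSY rather than blocking forever.
 */
static int mark_superblock_dirty(void)
{
    int rc = pthread_mutex_trylock(&mddev_lock);
    if (rc != 0)
        return rc;              /* lock already held: would deadlock */
    pthread_mutex_unlock(&mddev_lock);
    return 0;
}

/* Models the flush step: only dirty data forces a superblock update. */
static int flush_dirty_data(int have_dirty_data)
{
    if (!have_dirty_data)
        return 0;               /* nothing to write, lock never needed */
    return mark_superblock_dirty();
}

/*
 * Models the stop path running with the lock held. With flush_first set,
 * it flushes while holding the lock, as the reboot path did.
 */
static int stop_array(int have_dirty_data, int flush_first)
{
    int rc = pthread_mutex_lock(&mddev_lock);
    assert(rc == 0);
    rc = 0;
    if (flush_first)
        rc = flush_dirty_data(have_dirty_data);
    pthread_mutex_unlock(&mddev_lock);
    return rc;
}
```

In this model stop_array(1, 1) reports EBUSY (dirty data plus a flush under the lock), while stop_array(0, 1) and stop_array(1, 0) both succeed. That matches the observation that the hang needs a pending sync and goes away once the flush is dropped from the stop path.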

The call to invalidate_partition should not be needed in any case except a
reboot, and in that case you really don't want it (if you wanted to
sync, you would have done that first).
So I plan to remove it.  With it gone I cannot reproduce a hang.  If
you can, I would love to hear about it.

Thanks,
NeilBrown

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 8cfadc5..4790c83 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3841,8 +3841,6 @@ static int do_md_stop(mddev_t * mddev, int mode, int is_open)
 
 		del_timer_sync(&mddev->safemode_timer);
 
-		invalidate_partition(disk, 0);
-
 		switch(mode) {
 		case 1: /* readonly */
 			err  = -ENXIO;