Message-ID: <20090720132456.GA29023@osiris.boeblingen.de.ibm.com>
Date:	Mon, 20 Jul 2009 15:24:56 +0200
From:	Heiko Carstens <heiko.carstens@...ibm.com>
To:	Alan Jenkins <alan-jenkins@...fmail.co.uk>
Cc:	"Rafael J. Wysocki" <rjw@...k.pl>, linux-kernel@...r.kernel.org,
	Hans-Joachim Picht <hans@...ux.vnet.ibm.com>,
	pm list <linux-pm@...ts.linux-foundation.org>,
	Arnd Schneider <arnd@...ibm.com>
Subject: Re: PM/hibernate swapfile regression

On Fri, Jul 17, 2009 at 02:08:46PM +0100, Alan Jenkins wrote:
> Rafael J. Wysocki wrote:
> > On Tuesday 14 July 2009, Heiko Carstens wrote:
> >   
> >> We've seen this bug:
> >>
> >> Jul  8 13:16:02 h05lp03 kernel: BUG: sleeping function called from invalid context at /home/autobuild/BUILD/linux-2.6.30-20090707/include/linux/writeback.h:87
> >> Jul  8 13:16:02 h05lp03 kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 24377, name: bash
> >> Jul  8 13:16:02 h05lp03 kernel: 3 locks held by bash/24377:
> >> Jul  8 13:16:02 h05lp03 kernel: #0:  (&buffer->mutex){+.+.+.}, at: [<0000000000276e74>] sysfs_write_file+0x4c/0x1ac
> >> Jul  8 13:16:02 h05lp03 kernel: #1:  (pm_mutex#2){+.+.+.}, at: [<000000000018f128>] hibernate+0x34/0x200
> >> Jul  8 13:16:02 h05lp03 kernel: #2:  (swap_lock){+.+.-.}, at: [<00000000001f371c>] swap_type_of+0x44/0x158
> >> Jul  8 13:16:02 h05lp03 kernel: CPU: 8 Not tainted 2.6.30-39.x.20090707-s390xdefault #1
> >> Jul  8 13:16:02 h05lp03 kernel: Process bash (pid: 24377, task: 000000012ce84240, ksp: 00000000c262bb00)
> >> Jul  8 13:16:02 h05lp03 kernel: 0000000000000000 00000000c262ba88 0000000000000002 0000000000000000
> >> Jul  8 13:16:02 h05lp03 kernel:       00000000c262bb28 00000000c262baa0 00000000c262baa0 00000000005448c4
> >> Jul  8 13:16:02 h05lp03 kernel:       0000000000000000 000000012ce84718 000000013d5bf1a8 0000000000000000
> >> Jul  8 13:16:02 h05lp03 kernel:       000000000000000d 0000000000000000 00000000c262baf8 000000000000000e
> >> Jul  8 13:16:02 h05lp03 kernel:       0000000000553da8 0000000000105600 00000000c262ba88 00000000c262bad0
> >> Jul  8 13:16:02 h05lp03 kernel: Call Trace:
> >> Jul  8 13:16:02 h05lp03 kernel: ([<00000000001054fc>] show_trace+0xf0/0x148)
> >> Jul  8 13:16:02 h05lp03 kernel: [<00000000001391ba>] __might_sleep+0x172/0x188
> >> Jul  8 13:16:02 h05lp03 kernel: [<000000000021f738>] ifind+0x88/0xe4
> >> Jul  8 13:16:02 h05lp03 kernel: [<0000000000220b0e>] iget5_locked+0x66/0x1d8
> >> Jul  8 13:16:02 h05lp03 kernel: [<000000000023b676>] bdget+0x5e/0x150
> >> Jul  8 13:16:02 h05lp03 kernel: [<00000000001f37b2>] swap_type_of+0xda/0x158
> >> Jul  8 13:16:02 h05lp03 kernel: [<0000000000192342>] swsusp_write+0x4e/0x458
> >> Jul  8 13:16:02 h05lp03 kernel: [<000000000018f254>] hibernate+0x160/0x200
> >> Jul  8 13:16:02 h05lp03 kernel: [<000000000018d8da>] state_store+0x82/0xa8
> >> Jul  8 13:16:02 h05lp03 kernel: [<0000000000276f20>] sysfs_write_file+0xf8/0x1ac
> >> Jul  8 13:16:02 h05lp03 kernel: [<000000000020663a>] vfs_write+0xae/0x15c
> >> Jul  8 13:16:02 h05lp03 kernel: [<00000000002067e0>] SyS_write+0x54/0xac
> >> Jul  8 13:16:02 h05lp03 kernel: [<0000000000117a96>] sysc_noemu+0x10/0x16
> >> Jul  8 13:16:02 h05lp03 kernel: [<00000047083e36b4>] 0x47083e36b4
> >>
> >> Looks like this was introduced with git commit a1bb7d61 "PM/hibernate: fix "swap
> >> breaks after hibernation failures"".
> >> Calling bdget while holding a spinlock doesn't seem to be a good idea...
> >>     
> >
> > Agreed, sorry for missing that.
> >
> > Alan, can you please prepare a fix?
> 
> I'm not sure how to reproduce.  I tried pm-hibernate with
> CONFIG_DEBUG_SPINLOCK_SLEEP, but nothing showed up in dmesg.
> 
> Here's a quick & dirty patch. Please test (or explain how I can test it
> myself, whichever is easier :-). swap_unplug_sem is used to avoid
> holding swap_lock when calling the block device unplug function.  I
> think it can also be used for this bdget call.

Thanks for the patch. Unfortunately, Arnd was unable to reproduce the original
behaviour, but your patch makes sense anyway.
I also tested it and nothing broke. Should this go upstream?


> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index d1ade1a..9176464 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -744,6 +744,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p)
>  	if (device)
>  		bdev = bdget(device);
> 
> +	down_read(&swap_unplug_sem);
>  	spin_lock(&swap_lock);
>  	for (i = 0; i < nr_swapfiles; i++) {
>  		struct swap_info_struct *sis = swap_info + i;
> @@ -752,10 +753,11 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p)
>  			continue;
> 
>  		if (!bdev) {
> +			spin_unlock(&swap_lock);
>  			if (bdev_p)
>  				*bdev_p = bdget(sis->bdev->bd_dev);
> +			up_read(&swap_unplug_sem);
> 
> -			spin_unlock(&swap_lock);
>  			return i;
>  		}
>  		if (bdev == sis->bdev) {
> @@ -764,16 +766,18 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p)
>  			se = list_entry(sis->extent_list.next,
>  					struct swap_extent, list);
>  			if (se->start_block == offset) {
> +				spin_unlock(&swap_lock);
>  				if (bdev_p)
>  					*bdev_p = bdget(sis->bdev->bd_dev);
> +				up_read(&swap_unplug_sem);
> 
> -				spin_unlock(&swap_lock);
>  				bdput(bdev);
>  				return i;
>  			}
>  		}
>  	}
>  	spin_unlock(&swap_lock);
> +	up_read(&swap_unplug_sem);
>  	if (bdev)
>  		bdput(bdev);
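
For readers following the thread, the ordering problem discussed above can be
illustrated outside the kernel. What follows is only a toy user-space model,
not kernel code and not part of the patch: every `model_*` name, the
`atomic_depth` counter, and the `lookup_before`/`lookup_after` functions are
made up for illustration. It assumes that the essential rule is "a function
that may sleep must not run while a spinlock is held", which is what the
`__might_sleep` entry in the trace is complaining about.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model only: these counters stand in for kernel lock state. */
static int atomic_depth;  /* > 0 while a modelled spinlock is held */
static int violations;    /* sleeping calls made in atomic context */

static void model_spin_lock(void)
{
	atomic_depth++;
}

static void model_spin_unlock(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/* Loosely mimics __might_sleep(): complain if we are in atomic context. */
static void model_might_sleep(const char *who)
{
	if (atomic_depth > 0) {
		violations++;
		fprintf(stderr, "BUG: %s may sleep, called in atomic context\n", who);
	}
}

/* Hypothetical stand-ins for down_read()/up_read() on a sleeping lock. */
static void model_down_read(void)
{
	model_might_sleep("model_down_read");
}

static void model_up_read(void)
{
}

/* Hypothetical stand-in for bdget(), which may sleep. */
static int model_bdget(int dev)
{
	model_might_sleep("model_bdget");
	return dev;
}

/* Ordering before the patch: the lookup happens under the spinlock. */
static int lookup_before(int dev)
{
	int ref;

	model_spin_lock();
	ref = model_bdget(dev);
	model_spin_unlock();
	return ref;
}

/* Ordering after the patch: the spinlock is dropped first, and the
 * sleeping lock is held across the whole operation instead. */
static int lookup_after(int dev)
{
	int ref;

	model_down_read();
	model_spin_lock();
	/* ... scan the table under the spinlock ... */
	model_spin_unlock();
	ref = model_bdget(dev);
	model_up_read();
	return ref;
}

/* Run one variant from a clean state and report how often it misbehaved. */
static int count_violations(int (*fn)(int))
{
	atomic_depth = 0;
	violations = 0;
	fn(42);
	return violations;
}
```

Calling `count_violations(lookup_before)` reports one violation in this model,
while `count_violations(lookup_after)` reports none. The model says nothing
about whether holding the semaphore is sufficient protection in the real
`swap_type_of()`; it only shows why moving `spin_unlock()` ahead of the
`bdget()` call silences the warning.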
