lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200907211541.38694.rjw@sisk.pl>
Date:	Tue, 21 Jul 2009 15:41:37 +0200
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Alan Jenkins <alan-jenkins@...fmail.co.uk>,
	Jens Axboe <jens.axboe@...cle.com>
Cc:	Heiko Carstens <heiko.carstens@...ibm.com>,
	linux-kernel@...r.kernel.org,
	"Hans-Joachim Picht" <hans@...ux.vnet.ibm.com>,
	pm list <linux-pm@...ts.linux-foundation.org>,
	Arnd Schneider <arnd@...ibm.com>, linux-fsdevel@...r.kernel.org
Subject: Re: PM/hibernate swapfile regression

On Tuesday 21 July 2009, Alan Jenkins wrote:
> Heiko Carstens wrote:
> > On Fri, Jul 17, 2009 at 02:08:46PM +0100, Alan Jenkins wrote:
> >   
> >> Rafael J. Wysocki wrote:
> >>     
> >>> On Tuesday 14 July 2009, Heiko Carstens wrote:
> >>>   
> >>>       
> >>>> We've seen this bug [...]
> >>>>
> >>>> Looks like this was introduced with git commit a1bb7d61 "PM/hibernate: fix "swap
> >>>> breaks after hibernation failures"".
> >>>> Calling bdget while holding a spinlock doesn't seem to be a good idea...
> >>>>     
> >>>>         
> >>> Agreed, sorry for missing that.
> >>>
> >>> Alan, can you please prepare a fix?
> >>>       
> >> Here's a quick & dirty patch [...]
> >>     
> >
> > Thanks for the patch. Unfortunately Arnd was unable to reproduce the original
> > behaviour. But your patch makes sense anyway.
> > I also tested it and nothing broke. So should this go upstream?  
> 
> I do want to fix it, but I think there's a better way.
> 
> It doesn't really need the sleeping bdget().  All we want is
> atomic_inc(&bdev->bd_inode->i_count).  I think we should call it bdcopy().

Makes sense to me, but IMO that requres an ACK from Jens.

Jens, is the patch below fine with you?

Rafael


> ---------->
> From 643014ec079610a8b01dfd78c6949c1e8727195b Mon Sep 17 00:00:00 2001
> From: Alan Jenkins <alan-jenkins@...fmail.co.uk>
> Date: Tue, 21 Jul 2009 10:17:30 +0100
> Subject: [PATCH] PM/hibernate: replace bdget call with bdcopy (simple atomic_inc of i_count)
> 
> Create bdcopy().  This function copies an existing reference to a
> block_device.  It is safe to call from any context.
> 
> Hibernation code wishes to copy a reference to the active swap device.
> Right now it calls bdget() under a spinlock, but this is wrong because
> bdget() can sleep.  It doesn't need a full bdget() because we already
> hold a reference to active swap devices (and the spinlock protects
> against swapoff).
> 
> Signed-off-by: Alan Jenkins <alan-jenkins@...fmail.co.uk>
> CC: linux-fsdevel@...r.kernel.org
> ---
>  fs/block_dev.c     |    8 ++++++++
>  include/linux/fs.h |    1 +
>  mm/swapfile.c      |    4 ++--
>  3 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 3a6d4fb..0b04974 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -564,6 +564,14 @@ struct block_device *bdget(dev_t dev)
>  
>  EXPORT_SYMBOL(bdget);
>  
> +struct block_device *bdcopy(struct block_device *bdev)
> +{
> +	atomic_inc(&bdev->bd_inode->i_count);
> +	return bdev;
> +}
> +
> +EXPORT_SYMBOL(bdcopy);
> +
>  long nr_blockdev_pages(void)
>  {
>  	struct block_device *bdev;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 0872372..eeb1091 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1946,6 +1946,7 @@ extern void putname(const char *name);
>  extern int register_blkdev(unsigned int, const char *);
>  extern void unregister_blkdev(unsigned int, const char *);
>  extern struct block_device *bdget(dev_t);
> +extern struct block_device *bdcopy(struct block_device *bdev);
>  extern void bd_set_size(struct block_device *, loff_t size);
>  extern void bd_forget(struct inode *inode);
>  extern void bdput(struct block_device *);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index d1ade1a..272ea8e 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -753,7 +753,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p)
>  
>  		if (!bdev) {
>  			if (bdev_p)
> -				*bdev_p = bdget(sis->bdev->bd_dev);
> +				*bdev_p = bdcopy(sis->bdev);
>  
>  			spin_unlock(&swap_lock);
>  			return i;
> @@ -765,7 +765,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p)
>  					struct swap_extent, list);
>  			if (se->start_block == offset) {
>  				if (bdev_p)
> -					*bdev_p = bdget(sis->bdev->bd_dev);
> +					*bdev_p = bdcopy(sis->bdev);
>  
>  				spin_unlock(&swap_lock);
>  				bdput(bdev);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ