Message-ID: <20100415084757.GA3561@quack.suse.cz>
Date:	Thu, 15 Apr 2010 10:47:57 +0200
From:	Jan Kara <jack@...e.cz>
To:	Anton Blanchard <anton@...ba.org>
Cc:	Jan Kara <jack@...e.cz>, Christoph Hellwig <hch@....de>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Jens Axboe <jens.axboe@...cle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Fix regression in O_DIRECT|O_SYNC writes to block
 devices

On Thu 15-04-10 14:40:39, Anton Blanchard wrote:
> 
> We are seeing a large regression in database performance on recent kernels.
> The database opens a block device with O_DIRECT|O_SYNC and a number of threads
> write to different regions of the file at the same time.
> 
> A simple test case is below. I haven't defined DEVICE to anything since getting
> it wrong will destroy your data :) On a 3 disk LVM with a 64k chunk size we
> see about 17MB/sec and only a few threads in IO wait:
> 
> procs  -----io---- -system-- -----cpu------
>  r  b     bi    bo   in   cs us sy id wa st
>  0  3      0 16170  656 2259  0  0 86 14  0
>  0  2      0 16704  695 2408  0  0 92  8  0
>  0  2      0 17308  744 2653  0  0 86 14  0
>  0  2      0 17933  759 2777  0  0 89 10  0
> 
> Most threads are blocking in vfs_fsync_range, which has:
> 
>         mutex_lock(&mapping->host->i_mutex);
>         err = fop->fsync(file, dentry, datasync);
>         if (!ret)
>                 ret = err;
>         mutex_unlock(&mapping->host->i_mutex);
  ...
  Just a few style nitpicks:

> Index: linux-2.6/fs/block_dev.c
> ===================================================================
> --- linux-2.6.orig/fs/block_dev.c	2010-04-14 12:55:50.000000000 +1000
> +++ linux-2.6/fs/block_dev.c	2010-04-14 13:17:45.000000000 +1000
> @@ -406,16 +406,24 @@ static loff_t block_llseek(struct file *
>   
>  int blkdev_fsync(struct file *filp, struct dentry *dentry, int datasync)
>  {
> -	struct block_device *bdev = I_BDEV(filp->f_mapping->host);
> +	struct inode *bd_inode = filp->f_mapping->host;
> +	struct block_device *bdev = I_BDEV(bd_inode);
>  	int error;
>  
  Could you please add a comment here? Like "There is no need to
protect syncing of the block device by i_mutex and it unnecessarily
serializes workloads with several O_SYNC writers to the block device"

> +	mutex_unlock(&bd_inode->i_mutex);
> +
>  	error = sync_blockdev(bdev);
> -	if (error)
> +	if (error) {
> +		mutex_lock(&bd_inode->i_mutex);
>  		return error;
  Usually, "goto out" is preferred instead of the above.

> +	}
>  	
>  	error = blkdev_issue_flush(bdev, NULL);
>  	if (error == -EOPNOTSUPP)
>  		error = 0;
> +
And define out: here.

> +	mutex_lock(&bd_inode->i_mutex);
> +
>  	return error;
>  }
>  EXPORT_SYMBOL(blkdev_fsync);
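
  To illustrate the "goto out" suggestion, here is a rough userspace sketch
of the control flow I have in mind. It is not kernel code: struct fake_bdev,
fake_mutex_lock/unlock, fake_sync_blockdev and fake_issue_flush are
hypothetical stand-ins for bd_inode->i_mutex, sync_blockdev() and
blkdev_issue_flush(), and the "mutex" is just a flag so the flow can be
checked in isolation. The point is only that the lock is retaken in one
place on every exit path.

```c
#include <errno.h>

/* Hypothetical stand-in for the block device and its inode's i_mutex. */
struct fake_bdev {
	int locked;       /* 1 while the pretend i_mutex is held */
	int sync_result;  /* value the stubbed sync step returns */
	int flush_result; /* value the stubbed flush step returns */
	int flush_calls;  /* how many times the flush stub ran */
};

static void fake_mutex_lock(struct fake_bdev *b)   { b->locked = 1; }
static void fake_mutex_unlock(struct fake_bdev *b) { b->locked = 0; }

/* Stub for sync_blockdev(): just reports the configured result. */
static int fake_sync_blockdev(struct fake_bdev *b)
{
	return b->sync_result;
}

/* Stub for blkdev_issue_flush(): counts calls, reports configured result. */
static int fake_issue_flush(struct fake_bdev *b)
{
	b->flush_calls++;
	return b->flush_result;
}

/* The caller holds the lock on entry and expects it held again on return. */
int fake_blkdev_fsync(struct fake_bdev *b)
{
	int error;

	/*
	 * There is no need to protect syncing of the block device by
	 * i_mutex and it unnecessarily serializes workloads with several
	 * O_SYNC writers to the block device.
	 */
	fake_mutex_unlock(b);

	error = fake_sync_blockdev(b);
	if (error)
		goto out;

	error = fake_issue_flush(b);
	if (error == -EOPNOTSUPP)
		error = 0;
out:
	/* Single exit path: retake the lock the caller expects to hold. */
	fake_mutex_lock(b);
	return error;
}
```

  With this shape, adding another step between the sync and the flush later
on does not require remembering to retake the mutex in each error branch.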

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR