lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 22 May 2008 01:03:55 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Hisashi Hifumi <hifumi.hisashi@....ntt.co.jp>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] VFS: Pagecache usage optimization on pagesize
 !=blocksize environment

On Thu, 22 May 2008 16:31:15 +0900 Hisashi Hifumi <hifumi.hisashi@....ntt.co.jp> wrote:

> I modified my patch based on your comment.

Looks pretty close to me.

> I added is_partially_uptodate method to address_space_operations, and
> I registered block_is_partially_uptodate in fs/buffer.c to ext2,ext3,ext4's aops.
> Thanks.
> 
> Signed-off-by :Hisashi Hifumi <hifumi.hisashi@....ntt.co.jp>
> 
> diff -Nrup linux-2.6.26-rc3.org/fs/buffer.c linux-2.6.26-rc3/fs/buffer.c
> --- linux-2.6.26-rc3.org/fs/buffer.c	2008-05-19 11:35:10.000000000 +0900
> +++ linux-2.6.26-rc3/fs/buffer.c	2008-05-22 13:23:29.000000000 +0900
> @@ -2084,6 +2084,51 @@ int generic_write_end(struct file *file,
>  EXPORT_SYMBOL(generic_write_end);
>  
>  /*
> + * block_is_partially_uptodate checks whether buffers within a page are
> + * uptodate or not.
> + *
> + * Returns true if all buffers which correspond to a file portion
> + * we want to read are uptodate.
> + */
> +int block_is_partially_uptodate(struct page *page, read_descriptor_t *desc,
> +					unsigned long from)
> +{
> +	struct inode *inode = page->mapping->host;
> +	unsigned long block_start, block_end, blocksize;
> +	unsigned long to;

These functions can get quite confusing, and the appropriate use of
types can help.

For offsets within a page I'd suggest `unsigned'.  I expect that the
32-bit quantities are more efficient on at least some 64-bit machines.

For page indices use pgoff_t.

For byte-offset-within-a-file use loff_t.

For byte-range-within-a-file use size_t.

Be careful about overflows and truncation when shifting, comparing,
assigning, etc.  Be sure that the code is still correct outside the
4Gbyte mark on 32-bit and outside the 4Gpage mark on 64-bit.

It doesn't hurt at all to put each variable definition on its own line
with a little comment off to the right.  It helps to mention what the
variable's units are too - bytes/pages/sectors/etc.

> +	struct buffer_head *bh, *head;
> +	int ret = 1;
> +
> +	if (!page_has_buffers(page))
> +		return 0;
> +
> +	blocksize = 1 << inode->i_blkbits;
> +	to = from + desc->count;
> +	if (to > PAGE_CACHE_SIZE)
> +		to = PAGE_CACHE_SIZE;
> +	if (from < blocksize && to > PAGE_CACHE_SIZE - blocksize)
> +		return 0;
> +
> +	head = page_buffers(page);
> +
> +	for (bh = head, block_start = 0; bh != head || !block_start;
> +	     block_start = block_end, bh = bh->b_this_page) {
> +		block_end = block_start + blocksize;
> +		if (block_end <= from || block_start >= to)
> +			continue;

Oh dear, you've copied one of my least favourite parts of the VFS :(
Just look at it!

Do you think it can be simplified or clarified?

> +		else {
> +			if (!buffer_uptodate(bh)) {
> +				ret = 0;
> +				break;
> +			}
> +			if (block_end >= to)
> +				break;
> +		}
> +	}
> +	return ret;
> +}

We'll need the EXPORT_SYMBOL() here.

> --- linux-2.6.26-rc3.org/fs/ext2/inode.c	2008-05-19 11:35:10.000000000 +0900
> +++ linux-2.6.26-rc3/fs/ext2/inode.c	2008-05-22 12:39:46.000000000 +0900
> @@ -791,6 +791,7 @@ const struct address_space_operations ex
>  	.direct_IO		= ext2_direct_IO,
>  	.writepages		= ext2_writepages,
>  	.migratepage		= buffer_migrate_page,
> +	.is_partially_uptodate	= block_is_partially_uptodate,

Sometime we should update Documentation/Locking to reflect the new
address_space_operation.


One other thing we should think about here is the `nobh' mode which the
extX filesystems support (although I have a feeling that Nick might
have broken this ;)) We also have data=ordered, data=writeback and
data=journal to think about.  This optimisation might not be
appropriate at all to data=journal mode, but I haven't looked into
that.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ