[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 12 Dec 2011 11:17:54 +0800
From: Yongqiang Yang <xiaoqiangnk@...il.com>
To: Andrea Arcangeli <aarcange@...hat.com>
Cc: linux-ext4@...r.kernel.org, Theodore Tso <tytso@....edu>,
Jan Kara <jack@...e.cz>
Subject: Re: [PATCH 2/2] ext4: ext4_discard_partial_page_buffers_no_lock()
wrong parameters
Hi Andrea,
The code you are testing are removed in recent patches, the patches
have not been merged.
Please try following patches:
[PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize
[PATCH 2/2] ext4: let ext4_discard_partial_buffers handle pages
without buffers correctly
and
[PATCH 1/2] ext4: remove a wrong BUG_ON in ext4_ext_convert_to_initialized
[PATCH 2/2] ext4: let ext4_bio_write_page handle EOF correctly
Yongqiang.
On Mon, Dec 12, 2011 at 10:27 AM, Andrea Arcangeli <aarcange@...hat.com> wrote:
> If "copied" is zero (it can happen if the pte is unmapped before the
> atomic copy_user that does the data copy runs) the "from" passed to
> ext4_discard_partial_page_buffers_no_lock() points to pos-1, which
> would correspond to a logical page index before the page->index
> leading to ext4_discard_partial_page_buffers_no_lock() returning
> -EINVAL (because index != page->index). In such a case write() returns
> -EINVAL and userland gets a failure and filemap.c doesn't retry the
> copy_user anymore.
>
> I'm not certain of why exactly
> ext4_discard_partial_page_buffers_no_lock() is run here, so it's hard
> to tell if this is the correct fix. But it that functions clears data
> starting from the "from" parameter, so regardless of the -EINVAL
> retval, the right "from" to start clearing data should be pos+copied,
> not pos+copied-1. If this assumption is correct, it could mean that
> this bug in addition to the -EINVAL error, could also zero out 1 byte
> by mistake. I'm not sure what the implications for that are (not sure
> if data corruption is possible in some circumstances because of
> that). I guess normally this functions runs on unmapped buffers and
> the EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED makes it a noop on those.
>
> After fixing the hang in ext4_da_should_update_i_disksize, this write
> -EINVAL error becomes trivially reproducible with experimental knumad
> autonuma code running at heavy frequency (not the normal case). But
> like for the ext4_da_should_update_i_disksize hang, it should not be
> impossible to reproduce it with legacy swapping behavior.
>
> After dropping all caches a md5sum is successful and I can't find
> errors anymore with this patch, the -EINVAL stops, but it's not
> conclusive (and I haven't run e2fsck -f yet but I doubt this affects
> metadata coherency, seems more like a delayalloc data issue).
>
> Signed-off-by: Andrea Arcangeli <aarcange@...hat.com>
> ---
> fs/ext4/inode.c | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 63f9541..528c4c5 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2534,11 +2534,11 @@ static int ext4_da_write_end(struct file *file,
> page, fsdata);
>
> page_len = PAGE_CACHE_SIZE -
> - ((pos + copied - 1) & (PAGE_CACHE_SIZE - 1));
> + ((pos + copied) & (PAGE_CACHE_SIZE - 1));
>
> - if (page_len > 0) {
> + if (page_len < PAGE_CACHE_SIZE) {
> ret = ext4_discard_partial_page_buffers_no_lock(handle,
> - inode, page, pos + copied - 1, page_len,
> + inode, page, pos + copied, page_len,
> EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED);
> }
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Best Wishes
Yongqiang Yang
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists