Message-ID: <tencent_7081EC6CCB41F8B0966FCEB01B7AED66C409@qq.com>
Date: Fri, 17 Nov 2023 11:28:30 +0800
From: ChenXiaoSong <chenxiaosongemail@...mail.com>
To: Trond Myklebust <trondmy@...merspace.com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
Cc: "linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>,
"chenxiaosong@...inos.cn" <chenxiaosong@...inos.cn>,
"stable@...r.kernel.org" <stable@...r.kernel.org>,
"huangjinhui@...inos.cn" <huangjinhui@...inos.cn>,
"liuzhengyuan@...inos.cn" <liuzhengyuan@...inos.cn>,
"liuyun01@...inos.cn" <liuyun01@...inos.cn>,
"huhai@...inos.cn" <huhai@...inos.cn>,
"sashal@...nel.org" <sashal@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Anna.Schumaker@...app.com" <Anna.Schumaker@...app.com>
Subject: Re: Question about LTS 4.19 patch "89047634f5ce NFS: Don't interrupt
file writeout due to fatal errors"
On 2023/10/30 22:56, Trond Myklebust wrote:
> A refactoring is by definition a change that does not affect code
> behaviour. It is obvious that this was never intended to be such a
> patch.
>
> The reason that the bug is occurring in 4.19.x, and not in the latest
> kernels, is because the former is missing another bugfix (one which
> actually is missing a "Fixes:" tag).
>
> Can you therefore please check if applying commit 22876f540bdf ("NFS:
> Don't call generic_error_remove_page() while holding locks") fixes the
> issue.
>
> Note that the latter patch is needed in any case in order to fix a read
> deadlock (as indicated on the label).
>
> Thanks,
> Trond
>
After applying commit 22876f540bdf ("NFS: Don't call
generic_error_remove_page() while holding locks"), I hit an infinite
loop in nfs_wb_page():
write
 ...
  nfs_updatepage
   nfs_writepage_setup
    nfs_setup_write_request
     nfs_try_to_update_request
      nfs_wb_page
       if (clear_page_dirty_for_io(page)) // true
        nfs_writepage_locked // return 0
         nfs_do_writepage // return 0
          nfs_page_async_flush // return 0
           nfs_error_is_fatal_on_server
           nfs_write_error_remove_page
            SetPageError // instead of generic_error_remove_page
       // loop begin
       if (clear_page_dirty_for_io(page)) // false
       if (!PagePrivate(page)) // false
       ret = nfs_commit_inode = 0
       // loop again, never quit
Before applying commit 22876f540bdf ("NFS: Don't call
generic_error_remove_page() while holding locks"),
generic_error_remove_page() clears PG_private, so the infinite loop
cannot happen:
generic_error_remove_page
 truncate_inode_page
  truncate_cleanup_page
   do_invalidatepage
    nfs_invalidate_page
     nfs_wb_page_cancel
      nfs_inode_remove_request
       ClearPagePrivate(head->wb_page)
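The difference between the two traces can be modeled in user space. The sketch below is not kernel code: struct mock_page and the mock_* functions are hypothetical stand-ins that keep only the flag handling described above, and the max_iter bound is added so that the "never quit" case can be observed instead of hanging.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the flags from the traces. */
struct mock_page {
	bool dirty;       /* models PG_dirty */
	bool has_private; /* models PG_private */
	bool error;       /* models PG_error */
};

/* Models clear_page_dirty_for_io(): clear the flag, return its old value. */
static bool mock_clear_dirty(struct mock_page *p)
{
	bool was_dirty = p->dirty;

	p->dirty = false;
	return was_dirty;
}

/*
 * Models the fatal-error path under nfs_writepage_locked().
 * clear_private == true:  the error path ends up clearing PG_private
 *                         (the generic_error_remove_page() trace).
 * clear_private == false: the error path only sets PG_error
 *                         (the SetPageError trace).
 * Returns 0 in both cases, as in the traces.
 */
static int mock_writepage_locked(struct mock_page *p, bool clear_private)
{
	if (clear_private)
		p->has_private = false;
	else
		p->error = true;
	return 0;
}

/*
 * Models the loop in nfs_wb_page(). Returns the iteration on which the
 * loop exits, or -1 if it is still spinning after max_iter iterations.
 */
static int mock_wb_page(struct mock_page *p, bool clear_private, int max_iter)
{
	for (int i = 1; i <= max_iter; i++) {
		if (mock_clear_dirty(p)) {
			mock_writepage_locked(p, clear_private);
			continue;
		}
		if (!p->has_private)
			return i;
		/* Models nfs_commit_inode() returning 0: no flag changes. */
	}
	return -1;
}
```

Starting from a page with dirty and has_private both set, mock_wb_page(&page, true, 1000) exits on the second iteration, while mock_wb_page(&page, false, 1000) returns -1 because nothing ever clears has_private.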
If this patch is applied, are other patches required as well? Also, I
cannot reproduce the read deadlock that the patch is meant to fix. Are
there specific conditions required to reproduce it?