[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200402195551.GD9751@quack2.suse.cz>
Date: Thu, 2 Apr 2020 21:55:51 +0200
From: Jan Kara <jack@...e.cz>
To: NeilBrown <neilb@...e.de>
Cc: Trond Myklebust <trondmy@...merspace.com>,
"Anna.Schumaker@...app.com" <Anna.Schumaker@...app.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Jan Kara <jack@...e.cz>, linux-mm@...ck.org,
linux-nfs@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] Deprecate NR_UNSTABLE_NFS, use NR_WRITEBACK
On Thu 02-04-20 10:54:07, NeilBrown wrote:
>
> After an NFS page has been written it is considered "unstable" until a
> COMMIT request succeeds. If the COMMIT fails, the page will be
> re-written.
>
> These "unstable" pages are currently accounted as "reclaimable",
> either in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a
> 'reclaimable' count. This might have made sense when sending the COMMIT
> required a separate action by the VFS/MM (e.g. releasepage() used to
> send a COMMIT). However now that all writes generated by ->writepages()
> will automatically be followed by a COMMIT, it makes more sense to
> treat them as writeback pages.
>
> So this page deprecates NR_UNSTABLE_NFS and accounts unstable pages in
> NR_WRITEBACK and WB_WRITEBACK.
>
> A particular effect of this change is that when
> wb_check_background_flush() calls wb_over_bg_threshold(), the latter
> will report 'true' a lot less often as the 'unstable' pages are no
> longer considered 'dirty' (and there is nothing that writeback can do
> about them anyway).
>
> Currently wb_check_background_flush() will trigger writeback to NFS even
> when there are relatively few dirty pages (if there are lots of unstable
> pages), this can result in small writes going to the server (10s of
> Kilobytes rather than a Megabyte) which hurts throughput.
> With this that, there are fewer writes which are each larger on average.
>
> Signed-off-by: NeilBrown <neilb@...e.de>
The patch looks good to me. I agree with Christoph that it would be best to
remove NR_UNSTABLE_NFS completely. We just have to be careful, not change
format of any entries in /proc/ where it currently gets reported...
Honza
> ---
> fs/fs-writeback.c | 1 -
> fs/nfs/internal.h | 7 +++++--
> fs/nfs/write.c | 4 ++--
> include/linux/mmzone.h | 2 +-
> mm/memcontrol.c | 1 -
> mm/page-writeback.c | 7 ++-----
> 6 files changed, 10 insertions(+), 12 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 76ac9c7d32ec..c5bdf46e3b4b 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1070,7 +1070,6 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
> static unsigned long get_nr_dirty_pages(void)
> {
> return global_node_page_state(NR_FILE_DIRTY) +
> - global_node_page_state(NR_UNSTABLE_NFS) +
> get_nr_dirty_inodes();
> }
>
> diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> index f80c47d5ff27..ba1ff5adeccd 100644
> --- a/fs/nfs/internal.h
> +++ b/fs/nfs/internal.h
> @@ -660,8 +660,11 @@ void nfs_mark_page_unstable(struct page *page, struct nfs_commit_info *cinfo)
> if (!cinfo->dreq) {
> struct inode *inode = page_file_mapping(page)->host;
>
> - inc_node_page_state(page, NR_UNSTABLE_NFS);
> - inc_wb_stat(&inode_to_bdi(inode)->wb, WB_RECLAIMABLE);
> + /* This page is really still in write-back - just that the
> + * writeback is happening on the server now.
> + */
> + inc_node_page_state(page, NR_WRITEBACK);
> + inc_wb_stat(&inode_to_bdi(inode)->wb, WB_WRITEBACK);
> __mark_inode_dirty(inode, I_DIRTY_DATASYNC);
> }
> }
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index c478b772cc49..2e15a56620b3 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -958,9 +958,9 @@ nfs_mark_request_commit(struct nfs_page *req, struct pnfs_layout_segment *lseg,
> static void
> nfs_clear_page_commit(struct page *page)
> {
> - dec_node_page_state(page, NR_UNSTABLE_NFS);
> + dec_node_page_state(page, NR_WRITEBACK);
> dec_wb_stat(&inode_to_bdi(page_file_mapping(page)->host)->wb,
> - WB_RECLAIMABLE);
> + WB_WRITEBACK);
> }
>
> /* Called holding the request lock on @req */
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 462f6873905a..227fcb8cd0e6 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -237,7 +237,7 @@ enum node_stat_item {
> NR_FILE_THPS,
> NR_FILE_PMDMAPPED,
> NR_ANON_THPS,
> - NR_UNSTABLE_NFS, /* NFS unstable pages */
> + NR_UNSTABLE_NFS, /* NFS unstable pages - DEPRECATED DO NOT USE */
> NR_VMSCAN_WRITE,
> NR_VMSCAN_IMMEDIATE, /* Prioritise for reclaim when writeback ends */
> NR_DIRTIED, /* page dirtyings since bootup */
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7ddf91c4295f..fad8e8a23235 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4317,7 +4317,6 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
>
> *pdirty = memcg_exact_page_state(memcg, NR_FILE_DIRTY);
>
> - /* this should eventually include NR_UNSTABLE_NFS */
> *pwriteback = memcg_exact_page_state(memcg, NR_WRITEBACK);
> *pfilepages = memcg_exact_page_state(memcg, NR_INACTIVE_FILE) +
> memcg_exact_page_state(memcg, NR_ACTIVE_FILE);
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 2afb09fa2fe0..d1f03c799d11 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -504,7 +504,6 @@ bool node_dirty_ok(struct pglist_data *pgdat)
> unsigned long nr_pages = 0;
>
> nr_pages += node_page_state(pgdat, NR_FILE_DIRTY);
> - nr_pages += node_page_state(pgdat, NR_UNSTABLE_NFS);
> nr_pages += node_page_state(pgdat, NR_WRITEBACK);
>
> return nr_pages <= limit;
> @@ -1595,8 +1594,7 @@ static void balance_dirty_pages(struct bdi_writeback *wb,
> * written to the server's write cache, but has not yet
> * been flushed to permanent storage.
> */
> - nr_reclaimable = global_node_page_state(NR_FILE_DIRTY) +
> - global_node_page_state(NR_UNSTABLE_NFS);
> + nr_reclaimable = global_node_page_state(NR_FILE_DIRTY);
> gdtc->avail = global_dirtyable_memory();
> gdtc->dirty = nr_reclaimable + global_node_page_state(NR_WRITEBACK);
>
> @@ -1940,8 +1938,7 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb)
> * as we're trying to decide whether to put more under writeback.
> */
> gdtc->avail = global_dirtyable_memory();
> - gdtc->dirty = global_node_page_state(NR_FILE_DIRTY) +
> - global_node_page_state(NR_UNSTABLE_NFS);
> + gdtc->dirty = global_node_page_state(NR_FILE_DIRTY);
> domain_dirty_limits(gdtc);
>
> if (gdtc->dirty > gdtc->bg_thresh)
> --
> 2.26.0
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists