Message-ID: <CAFX2Jfn8jER-aV_ttiAe1tkh8f+m=5-whEBTWbHO1uVwf=B4bw@mail.gmail.com>
Date: Fri, 17 Dec 2021 16:29:04 -0500
From: Anna Schumaker <anna.schumaker@...app.com>
To: NeilBrown <neilb@...e.de>
Cc: Trond Myklebust <trond.myklebust@...merspace.com>,
Chuck Lever <chuck.lever@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...e.de>,
Christoph Hellwig <hch@...radead.org>,
David Howells <dhowells@...hat.com>,
Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
linux-mm@...ck.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 00/18 V2] Repair SWAP-over-NFS
Hi Neil,
On Thu, Dec 16, 2021 at 7:07 PM NeilBrown <neilb@...e.de> wrote:
>
> swap-over-NFS currently has a variety of problems.
>
> Swap writes call generic_write_checks(), which always fails on a swap
> file, so swap-out fails completely.
> Even without this, various deadlocks are possible - largely due to
> improvements in NFS memory allocation (using NOFS instead of ATOMIC)
> which weren't tested against swap-out.
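To make the first failure mode concrete, here is a minimal userspace sketch in C; it is not kernel code. It assumes the generic write check rejects any swap file with -ETXTBSY. `struct model_inode` and the `model_*` functions are hypothetical names invented for this illustration. The second function shows one way the problem could be avoided: a dedicated swap path that never runs the generic check.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the inode state that the check inspects. */
struct model_inode {
	bool is_swapfile;	/* models "this file is an active swap file" */
};

/* Models a generic write check that refuses to write to swap files.
 * Assumption: the error returned for a swap file is -ETXTBSY. */
static int model_write_checks(const struct model_inode *inode)
{
	if (inode->is_swapfile)
		return -ETXTBSY;
	return 0;
}

/* Swap-out routed through the ordinary write path: the check runs first,
 * so a swap file is rejected before any I/O would be issued. */
static int model_write_via_generic_path(const struct model_inode *inode)
{
	int err = model_write_checks(inode);

	if (err)
		return err;
	return 0;	/* the write would be issued here */
}

/* Swap-out routed through a dedicated swap path that skips the check. */
static int model_write_via_swap_path(const struct model_inode *inode)
{
	(void)inode;	/* no generic check on this path */
	return 0;	/* the write would be issued here */
}
```

With a `model_inode` whose `is_swapfile` is true, the generic path returns -ETXTBSY every time, while the dedicated path returns 0.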
>
> NFS is the only filesystem that has supported fs-based swap IO, and it
> hasn't worked for several releases, so now is a convenient time to clean
> up the swap-via-filesystem interfaces - we cannot break anything!
>
> So the first few patches here clean up and improve various parts of the
> swap-via-filesystem code. ->swap_activate() is given a cleaner
> interface, a new ->swap_rw is introduced instead of burdening
> ->direct_IO, etc.
>
> Current swap-to-filesystem code only ever submits single-page reads and
> writes. These patches change that to allow multi-page IO when adjacent
> requests are submitted. Writes are also changed to be async rather than
> sync. This substantially speeds up write throughput for swap-over-NFS.
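The idea of turning adjacent single-page requests into multi-page IO can be sketched in plain userspace C. This is only an illustration of the coalescing step, not the code from the patches. `struct page_run` and `coalesce_adjacent()` are hypothetical names, and the sketch assumes a request is merged only when it directly follows the previous one in submission order.

```c
#include <stddef.h>

/* One contiguous run of pages: [start, start + npages). */
struct page_run {
	unsigned long start;	/* page offset of the first page in the run */
	unsigned long npages;	/* number of pages in the run */
};

/*
 * Walk single-page requests in submission order and merge each one into
 * the current run when it immediately follows that run's last page.
 * Otherwise start a new run. 'out' must have room for 'n' entries.
 * Returns the number of runs written to 'out'.
 */
static size_t coalesce_adjacent(const unsigned long *offsets, size_t n,
				struct page_run *out)
{
	size_t nruns = 0;

	for (size_t i = 0; i < n; i++) {
		if (nruns > 0 &&
		    out[nruns - 1].start + out[nruns - 1].npages == offsets[i]) {
			/* Adjacent to the current run: extend it by one page. */
			out[nruns - 1].npages++;
		} else {
			/* Not adjacent: this page begins a new run. */
			out[nruns].start = offsets[i];
			out[nruns].npages = 1;
			nruns++;
		}
	}
	return nruns;
}
```

For the offsets 10, 11, 12, 20, 21, 5 this yields three runs: three pages starting at 10, two pages starting at 20, and a single page at 5. Each run could then be submitted as one request instead of one request per page.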
>
> Some of the NFS patches can land independently of the MM patches. A few
> require the MM patches to land first.
Thanks for fixing swap-over-NFS! It passes all the swap-related
xfstests except generic/357 on NFSv4.2. That test checks that we get
-EINVAL on a reflinked swapfile, and I'm not sure there is a way to
check for that on the client side. If you have any ideas, it would be
nice to get that test passing while you're at it!
Anna
>
> Thanks,
> NeilBrown
>
>
> ---
>
> NeilBrown (18):
> Structural cleanup for filesystem-based swap
> MM: create new mm/swap.h header file.
> MM: use ->swap_rw for reads from SWP_FS_OPS swap-space
> MM: perform async writes to SWP_FS_OPS swap-space
> MM: reclaim mustn't enter FS for SWP_FS_OPS swap-space
> MM: submit multipage reads for SWP_FS_OPS swap-space
> MM: submit multipage write for SWP_FS_OPS swap-space
> MM: Add AS_CAN_DIO mapping flag
> NFS: rename nfs_direct_IO and use as ->swap_rw
> NFS: swap IO handling is slightly different for O_DIRECT IO
> SUNRPC/call_alloc: async tasks mustn't block waiting for memory
> SUNRPC/auth: async tasks mustn't block waiting for memory
> SUNRPC/xprt: async tasks mustn't block waiting for memory
> SUNRPC: remove scheduling boost for "SWAPPER" tasks.
> NFS: discard NFS_RPC_SWAPFLAGS and RPC_TASK_ROOTCREDS
> SUNRPC: improve 'swap' handling: scheduling and PF_MEMALLOC
> NFSv4: keep state manager thread active if swap is enabled
> NFS: swap-out must always use STABLE writes.
>
>
> drivers/block/loop.c            |   4 +-
> fs/fcntl.c                      |   5 +-
> fs/inode.c                      |   3 +
> fs/nfs/direct.c                 |  56 ++++++----
> fs/nfs/file.c                   |  25 +++--
> fs/nfs/inode.c                  |   1 +
> fs/nfs/nfs4_fs.h                |   1 +
> fs/nfs/nfs4proc.c               |  20 ++++
> fs/nfs/nfs4state.c              |  39 ++++++-
> fs/nfs/read.c                   |   4 -
> fs/nfs/write.c                  |   2 +
> fs/open.c                       |   2 +-
> fs/overlayfs/file.c             |  10 +-
> include/linux/fs.h              |   2 +-
> include/linux/nfs_fs.h          |  11 +-
> include/linux/nfs_xdr.h         |   2 +
> include/linux/pagemap.h         |   3 +-
> include/linux/sunrpc/auth.h     |   1 +
> include/linux/sunrpc/sched.h    |   1 -
> include/linux/swap.h            | 121 --------------------
> include/linux/writeback.h       |   7 ++
> include/trace/events/sunrpc.h   |   1 -
> mm/madvise.c                    |   9 +-
> mm/memory.c                     |   3 +-
> mm/mincore.c                    |   1 +
> mm/page_alloc.c                 |   1 +
> mm/page_io.c                    | 189 ++++++++++++++++++++++++++------
> mm/shmem.c                      |   1 +
> mm/swap.h                       | 140 +++++++++++++++++++++++
> mm/swap_state.c                 |  32 ++++--
> mm/swapfile.c                   |   6 +
> mm/util.c                       |   1 +
> mm/vmscan.c                     |  31 +++++-
> net/sunrpc/auth.c               |   8 +-
> net/sunrpc/auth_gss/auth_gss.c  |   6 +-
> net/sunrpc/auth_unix.c          |  10 +-
> net/sunrpc/clnt.c               |   7 +-
> net/sunrpc/sched.c              |  29 +++--
> net/sunrpc/xprt.c               |  19 ++--
> net/sunrpc/xprtrdma/transport.c |  10 +-
> net/sunrpc/xprtsock.c           |   8 ++
> 41 files changed, 558 insertions(+), 274 deletions(-)
> create mode 100644 mm/swap.h
>
> --
> Signature
>