Date:   Tue, 19 Oct 2021 14:08:51 -0400
From:   Jeff Layton <jlayton@...nel.org>
To:     David Howells <dhowells@...hat.com>, linux-cachefs@...hat.com
Cc:     ceph-devel@...r.kernel.org, linux-afs@...ts.infradead.org,
        Anna Schumaker <anna.schumaker@...app.com>,
        linux-nfs@...r.kernel.org,
        Kent Overstreet <kent.overstreet@...il.com>,
        linux-mm@...ck.org, Matthew Wilcox <willy@...radead.org>,
        linux-fsdevel@...r.kernel.org,
        Dave Wysochanski <dwysocha@...hat.com>,
        Marc Dionne <marc.dionne@...istor.com>,
        Trond Myklebust <trond.myklebust@...merspace.com>,
        Shyam Prasad N <nspmangalore@...il.com>,
        Eric Van Hensbergen <ericvh@...il.com>,
        v9fs-developer@...ts.sourceforge.net, linux-cifs@...r.kernel.org,
        Latchesar Ionkov <lucho@...kov.net>,
        Steve French <sfrench@...ba.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Dominique Martinet <asmadeus@...ewreck.org>,
        Ilya Dryomov <idryomov@...il.com>,
        Trond Myklebust <trondmy@...merspace.com>,
        Omar Sandoval <osandov@...ndov.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/67] fscache: Rewrite index API and management system

On Mon, 2021-10-18 at 15:50 +0100, David Howells wrote:
> Here's a set of patches that rewrites and simplifies the fscache index API
> to remove the complex operation scheduling and object state machine in
> favour of something much smaller and simpler.  It is built on top of the
> set of patches that removes the old API[1].
> 
> The operation scheduling API was intended to handle sequencing of cache
> operations, which were all required (where possible) to run asynchronously
> in parallel with the operations being done by the network filesystem, while
> allowing the cache to be brought online and offline and interrupt service
> with invalidation.
> 
> However, with the advent of the tmpfile capability in the VFS, an opportunity
> arises to do invalidation much more easily, without having to wait for I/O
> that's actually in progress: Cachefiles can simply cut over its file
> pointer for the backing object attached to a cookie and abandon the
> in-progress I/O, dismissing it upon completion.
> 
> Future work there would involve using Omar Sandoval's vfs_link() with
> AT_LINK_REPLACE[2] to allow an extant file to be displaced by a new hard
> link from a tmpfile as currently I have to unlink the old file first.
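
A rough userspace analogy of the cut-over described above, assuming a Linux filesystem that supports O_TMPFILE and a mounted /proc. This is only a sketch: cachefiles does this with in-kernel VFS calls, and replace_with_tmpfile() is a hypothetical name, not something from the patch set. It shows the current ordering (unlink the old name first, then link the tmpfile in), which is the step AT_LINK_REPLACE would make unnecessary.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for O_TMPFILE */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace dir/name with a fresh, empty file.
 * Returns an fd for the new file, or -1 on error.  Anyone still holding
 * an fd on the old file keeps access to the old data until they close it.
 */
int replace_with_tmpfile(const char *dir, const char *name)
{
	char proc_path[64];
	int dfd, fd;

	assert(dir && name);

	dfd = open(dir, O_RDONLY | O_DIRECTORY);
	if (dfd < 0)
		return -1;

	/* Create an unnamed file inside dir. */
	fd = open(dir, O_TMPFILE | O_RDWR, 0600);
	if (fd < 0)
		goto err_dir;

	/* Without AT_LINK_REPLACE the old name has to go first. */
	if (unlinkat(dfd, name, 0) < 0 && errno != ENOENT)
		goto err_file;

	/* Give the tmpfile the old name (the idiom from open(2)). */
	snprintf(proc_path, sizeof(proc_path), "/proc/self/fd/%d", fd);
	if (linkat(AT_FDCWD, proc_path, dfd, name, AT_SYMLINK_FOLLOW) < 0)
		goto err_file;	/* note: the old name is already gone here */

	close(dfd);
	return fd;

err_file:
	close(fd);
err_dir:
	close(dfd);
	return -1;
}
```

Between the unlinkat() and the linkat() there is a window in which the name does not exist at all, and a failure in linkat() leaves it missing.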
> 
> These patches can also simplify the object state handling as I/O operations
> to the cache don't all have to be brought to a stop in order to invalidate
> a file.  To that end, and with an eye to writing a new backing cache
> model in the future, I've taken the opportunity to simplify the indexing
> structure.
> 
> I've separated the index cookie concept from the file cookie concept by
> type now.  The former is now called a "volume cookie" (struct
> fscache_volume) and is a container of file cookies.  There are then
> just the two levels.  All the index cookieage is collapsed into a single
> volume cookie, and this has a single printable string as a key.  For
> instance, an AFS volume would have a key of something like
> "afs,example.com,1000555", combining the filesystem name, cell name and
> volume ID.  This is freeform, but must not have '/' chars in it.
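
To make the key format above concrete, here is a minimal userspace sketch of how such a key could be composed and checked. make_volume_key() is a hypothetical helper for illustration, not part of the fscache API; the only rules it encodes are the ones stated above (a single printable string, no '/' characters).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<fs>,<cell>,<volume id>" into buf.
 * Returns the key length, or -1 if the key would be truncated or
 * would contain a '/'.
 */
int make_volume_key(char *buf, size_t len, const char *fs,
		    const char *cell, unsigned long long vid)
{
	int n;

	assert(buf && fs && cell);

	n = snprintf(buf, len, "%s,%s,%llu", fs, cell, vid);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* did not fit */
	if (strchr(buf, '/'))
		return -1;	/* '/' is not allowed in a volume key */
	return n;
}
```

Calling make_volume_key(key, sizeof(key), "afs", "example.com", 1000555) fills key with "afs,example.com,1000555".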
> 

Given the indexing changes, what sort of behavior should we expect when
upgrading from old-style to new-style indexes? Do they just not match,
and we end up downloading new copies of all the data and the old stale
stuff eventually gets culled?

Ditto for downgrades -- can we expect sane behavior if someone runs an
old kernel on top of an existing fscache that was populated by a new
kernel?

> I've also eliminated all pointers back from fscache into the network
> filesystem.  This required the duplication of a little bit of data in the
> cookie (cookie key, coherency data and file size), but it's not actually
> that much.  This gets rid of problems with making sure we keep netfs data
> structures around so that the cache can access them.
> 
> I have changed afs throughout the patch series, but I also have patches for
> 9p, nfs and cifs.  Jeff Layton is handling ceph support.
> 
> 
> BITS THAT MAY BE CONTROVERSIAL
> ==============================
> 
> There are some bits I've added that may be controversial:
> 
>  (1) I've provided a flag, S_KERNEL_FILE, that cachefiles uses to check if
>      a file is already being used by some other kernel service (e.g. a
>      duplicate cachefiles cache in the same directory) and reject it if it
>      is.  This isn't entirely necessary, but it helps prevent accidental
>      data corruption.
> 
>      I don't want to use S_SWAPFILE as that has other effects, but quite
>      possibly swapon() should set S_KERNEL_FILE too.
> 
>      Note that it doesn't prevent userspace from interfering, though
>      perhaps it should.
> 
>  (2) Cachefiles wants to keep the backing file for a cookie open whilst we
>      might need to write to it from network filesystem writeback.  The
>      problem is that the network filesystem unuses its cookie when its file
>      is closed, and so we have nothing pinning the cachefiles file open and
>      it will get closed automatically after a short time to avoid
>      EMFILE/ENFILE problems.
> 
>      Reopening the cache file, however, is a problem if this is being done
>      due to writeback triggered by exit().  Some filesystems will oops if
>      we try to open a file in that context because they want to access
>      current->fs or suchlike.
> 
>      To get around this, I added the following:
> 
>      (A) An inode flag, I_PINNING_FSCACHE_WB, to be set on a network
>      	 filesystem inode to indicate that we have a usage count on the
>      	 cookie caching that inode.
> 
>      (B) A flag in struct writeback_control, unpinned_fscache_wb, that is
>      	 set when __writeback_single_inode() clears the last dirty page
>      	 from i_pages - at which point it clears I_PINNING_FSCACHE_WB and
>      	 sets this flag.
> 
> 	 This has to be done here so that clearing I_PINNING_FSCACHE_WB can
> 	 be done atomically with the check of PAGECACHE_TAG_DIRTY that
> 	 clears I_DIRTY_PAGES.
> 
>      (C) A function, fscache_set_page_dirty(), which, if
>      	 I_PINNING_FSCACHE_WB is not already set, sets it and calls
>      	 fscache_use_cookie() to pin the cache resources.
> 
>      (D) A function, fscache_unpin_writeback(), to be called by
>      	 ->write_inode() to unuse the cookie.
> 
>      (E) A function, fscache_clear_inode_writeback(), to be called when the
>      	 inode is evicted, before clear_inode() is called.  This cleans up
>      	 any lingering I_PINNING_FSCACHE_WB.
> 
>      The network filesystem can then use these tools to make sure that
>      fscache_write_to_cache() can write locally modified data to the cache
>      as well as to the server.
> 
>      For the future, I'm working on write helpers for netfs lib that should
>      allow this facility to be removed by keeping track of the dirty
>      regions separately - but that's incomplete at the moment and is also
>      going to be affected by folios, one way or another, since it deals
>      with pages.
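
The interplay of (A) to (E) above can be modelled in a few lines of userspace C. This is a toy model under stated assumptions, not the kernel code: the toy_* types and functions are invented for illustration, locking and atomicity are ignored, and writeback is assumed to clean every dirty page in one pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Invented stand-ins for the cookie, the inode and writeback_control. */
struct toy_cookie { int n_active; };
struct toy_inode {
	bool pinning_fscache_wb;	/* models I_PINNING_FSCACHE_WB (A) */
	int dirty_pages;
	struct toy_cookie *cookie;
};
struct toy_wbc { bool unpinned_fscache_wb; };	/* models the wbc flag (B) */

/* (C): the first dirtying sets the flag and takes a use on the cookie. */
void toy_set_page_dirty(struct toy_inode *inode)
{
	inode->dirty_pages++;
	if (!inode->pinning_fscache_wb) {
		inode->pinning_fscache_wb = true;
		inode->cookie->n_active++;	/* stands in for fscache_use_cookie() */
	}
}

/* (B): once the last dirty page is gone, clear the flag and note it in wbc. */
void toy_writeback_single_inode(struct toy_inode *inode, struct toy_wbc *wbc)
{
	inode->dirty_pages = 0;		/* assumption: everything gets written */
	if (inode->pinning_fscache_wb) {
		inode->pinning_fscache_wb = false;
		wbc->unpinned_fscache_wb = true;
	}
}

/* (D): called from ->write_inode(); drops the use if (B) handed it over. */
void toy_unpin_writeback(struct toy_wbc *wbc, struct toy_cookie *cookie)
{
	if (wbc->unpinned_fscache_wb) {
		assert(cookie->n_active > 0);
		cookie->n_active--;
	}
}

/* (E): on eviction, clean up a flag that writeback never cleared. */
void toy_clear_inode_writeback(struct toy_inode *inode)
{
	if (inode->pinning_fscache_wb) {
		inode->pinning_fscache_wb = false;
		assert(inode->cookie->n_active > 0);
		inode->cookie->n_active--;
	}
}
```

In this model the cookie is used exactly once however many pages are dirtied, and that single use is dropped either by the ->write_inode() path or, failing that, at eviction.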
> 
> 
> These patches can also be found at:
> 
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-rewrite-indexing
> 
> David
> 
> Link: https://lore.kernel.org/r/163363935000.1980952.15279841414072653108.stgit@warthog.procyon.org.uk [1]
> Link: https://lore.kernel.org/r/cover.1580251857.git.osandov@fb.com/ [2]
> 
> ---
> Dave Wysochanski (3):
>       NFS: Convert fscache_acquire_cookie and fscache_relinquish_cookie
>       NFS: Convert fscache_enable_cookie and fscache_disable_cookie
>       NFS: Convert fscache invalidation and update aux_data and i_size
> 
> David Howells (63):
>       mm: Stop filemap_read() from grabbing a superfluous page
>       vfs: Provide S_KERNEL_FILE inode flag
>       vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
>       afs: Handle len being extending over page end in write_begin/write_end
>       afs: Fix afs_write_end() to handle len > page size
>       nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite
>       fscache: Remove the netfs data from the cookie
>       fscache: Remove struct fscache_cookie_def
>       fscache: Remove store_limit* from struct fscache_object
>       fscache: Remove fscache_check_consistency()
>       fscache: Remove fscache_attr_changed()
>       fscache: Remove obsolete stats
>       fscache: Remove old I/O tracepoints
>       fscache: Temporarily disable fscache_invalidate()
>       fscache: Disable fscache_begin_operation()
>       fscache: Remove the I/O operation manager
>       fscache: Rename fscache_cookie_{get,put,see}()
>       cachefiles: Remove tree of active files and use S_CACHE_FILE inode flag
>       cachefiles: Don't set an xattr on the root of the cache
>       cachefiles: Remove some redundant checks on unsigned values
>       cachefiles: Prevent inode from going away when burying a dentry
>       cachefiles: Simplify the pathwalk and save the filename for an object
>       cachefiles: trace: Improve the lookup tracepoint
>       cachefiles: Remove separate backer dentry from cachefiles_object
>       cachefiles: Fold fscache_object into cachefiles_object
>       cachefiles: Change to storing file* rather than dentry*
>       cachefiles: trace: Log coherency checks
>       cachefiles: Trace truncations
>       cachefiles: Trace read and write operations
>       cachefiles: Round the cachefile size up to DIO block size
>       cachefiles: Don't use XATTR_ flags with vfs_setxattr()
>       fscache: Replace the object management state machine
>       cachefiles: Trace decisions in cachefiles_prepare_read()
>       cachefiles: Make cachefiles_write_prepare() check for space
>       fscache: Automatically close a file that's been unused for a while
>       fscache: Add stats for the cookie commit LRU
>       fscache: Move fscache_update_cookie() complete inline
>       fscache: Remove more obsolete stats
>       fscache: Note the object size during invalidation
>       vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
>       afs: Render cache cookie key as big endian
>       cachefiles: Use tmpfile/link
>       fscache: Rewrite invalidation
>       cachefiles: Simplify the file lookup/creation/check code
>       fscache: Provide resize operation
>       cachefiles: Put more information in the xattr attached to the cache file
>       fscache: Implement "will_modify" parameter on fscache_use_cookie()
>       fscache: Add support for writing to the cache
>       fscache: Make fscache_clear_page_bits() conditional on cookie
>       fscache: Make fscache_write_to_cache() conditional on cookie
>       afs: Copy local writes to the cache when writing to the server
>       afs: Invoke fscache_resize_cookie() when handling ATTR_SIZE for setattr
>       afs: Add O_DIRECT read support
>       afs: Skip truncation on the server of data we haven't written yet
>       afs: Make afs_write_begin() return the THP subpage
>       cachefiles, afs: Drive FSCACHE_COOKIE_NO_DATA_TO_READ
>       nfs: Convert to new fscache volume/cookie API
>       9p: Use fscache indexing rewrite and reenable caching
>       9p: Copy local writes to the cache when writing to the server
>       netfs: Display the netfs inode number in the netfs_read tracepoint
>       cachefiles: Add tracepoints to log errors from ops on the backing fs
>       cachefiles: Add error injection support
>       cifs: Support fscache indexing rewrite (untested)
> 
> Jeff Layton (1):
>       fscache: disable cookie when doing an invalidation for DIO write
> 
> 
>  fs/9p/cache.c                     |  184 +----
>  fs/9p/cache.h                     |   23 +-
>  fs/9p/v9fs.c                      |   14 +-
>  fs/9p/v9fs.h                      |   13 +-
>  fs/9p/vfs_addr.c                  |   55 +-
>  fs/9p/vfs_dir.c                   |   13 +-
>  fs/9p/vfs_file.c                  |    7 +-
>  fs/9p/vfs_inode.c                 |   24 +-
>  fs/9p/vfs_inode_dotl.c            |    3 +-
>  fs/9p/vfs_super.c                 |    3 +
>  fs/afs/Makefile                   |    3 -
>  fs/afs/cache.c                    |   68 --
>  fs/afs/cell.c                     |   12 -
>  fs/afs/file.c                     |   83 +-
>  fs/afs/fsclient.c                 |   18 +-
>  fs/afs/inode.c                    |  101 ++-
>  fs/afs/internal.h                 |   36 +-
>  fs/afs/main.c                     |   14 -
>  fs/afs/super.c                    |    1 +
>  fs/afs/volume.c                   |   15 +-
>  fs/afs/write.c                    |  170 +++-
>  fs/afs/yfsclient.c                |   12 +-
>  fs/cachefiles/Kconfig             |    8 +
>  fs/cachefiles/Makefile            |    3 +
>  fs/cachefiles/bind.c              |  186 +++--
>  fs/cachefiles/daemon.c            |   20 +-
>  fs/cachefiles/error_inject.c      |   46 ++
>  fs/cachefiles/interface.c         |  660 +++++++--------
>  fs/cachefiles/internal.h          |  191 +++--
>  fs/cachefiles/io.c                |  310 +++++--
>  fs/cachefiles/key.c               |  203 +++--
>  fs/cachefiles/main.c              |   20 +-
>  fs/cachefiles/namei.c             |  978 ++++++++++------------
>  fs/cachefiles/volume.c            |  128 +++
>  fs/cachefiles/xattr.c             |  367 +++------
>  fs/ceph/Kconfig                   |    2 +-
>  fs/cifs/Makefile                  |    2 +-
>  fs/cifs/cache.c                   |  105 ---
>  fs/cifs/cifsfs.c                  |   11 +-
>  fs/cifs/cifsglob.h                |    5 +-
>  fs/cifs/connect.c                 |    3 -
>  fs/cifs/file.c                    |   37 +-
>  fs/cifs/fscache.c                 |  201 ++---
>  fs/cifs/fscache.h                 |   51 +-
>  fs/cifs/inode.c                   |   18 +-
>  fs/fs-writeback.c                 |    8 +
>  fs/fscache/Kconfig                |    4 +
>  fs/fscache/Makefile               |    6 +-
>  fs/fscache/cache.c                |  541 ++++++-------
>  fs/fscache/cookie.c               | 1262 ++++++++++++++---------------
>  fs/fscache/fsdef.c                |   98 ---
>  fs/fscache/internal.h             |  213 +----
>  fs/fscache/io.c                   |  405 ++++++---
>  fs/fscache/main.c                 |  134 +--
>  fs/fscache/netfs.c                |   74 --
>  fs/fscache/object.c               | 1123 -------------------------
>  fs/fscache/operation.c            |  633 ---------------
>  fs/fscache/page.c                 |   84 --
>  fs/fscache/proc.c                 |   43 +-
>  fs/fscache/stats.c                |  202 ++---
>  fs/fscache/volume.c               |  449 ++++++++++
>  fs/netfs/read_helper.c            |    2 +-
>  fs/nfs/Makefile                   |    2 +-
>  fs/nfs/client.c                   |    4 -
>  fs/nfs/direct.c                   |    2 +
>  fs/nfs/file.c                     |    7 +-
>  fs/nfs/fscache-index.c            |  114 ---
>  fs/nfs/fscache.c                  |  264 ++----
>  fs/nfs/fscache.h                  |   89 +-
>  fs/nfs/inode.c                    |   11 +-
>  fs/nfs/super.c                    |    7 +-
>  fs/nfs/write.c                    |    1 +
>  include/linux/fs.h                |    4 +
>  include/linux/fscache-cache.h     |  463 +++--------
>  include/linux/fscache.h           |  626 +++++++-------
>  include/linux/netfs.h             |    4 +-
>  include/linux/nfs_fs_sb.h         |    9 +-
>  include/linux/writeback.h         |    1 +
>  include/trace/events/cachefiles.h |  483 ++++++++---
>  include/trace/events/fscache.h    |  631 +++++++--------
>  include/trace/events/netfs.h      |    5 +-
>  81 files changed, 5140 insertions(+), 7295 deletions(-)
>  delete mode 100644 fs/afs/cache.c
>  create mode 100644 fs/cachefiles/error_inject.c
>  create mode 100644 fs/cachefiles/volume.c
>  delete mode 100644 fs/cifs/cache.c
>  delete mode 100644 fs/fscache/fsdef.c
>  delete mode 100644 fs/fscache/netfs.c
>  delete mode 100644 fs/fscache/object.c
>  delete mode 100644 fs/fscache/operation.c
>  create mode 100644 fs/fscache/volume.c
>  delete mode 100644 fs/nfs/fscache-index.c
> 
> 

-- 
Jeff Layton <jlayton@...nel.org>
