Message-ID: <163492947519.1038219.10011806240425457489.stgit@warthog.procyon.org.uk>
Date: Fri, 22 Oct 2021 20:04:35 +0100
From: David Howells <dhowells@...hat.com>
To: linux-cachefs@...hat.com
Cc: dhowells@...hat.com, Trond Myklebust <trondmy@...merspace.com>,
Anna Schumaker <anna.schumaker@...app.com>,
Steve French <sfrench@...ba.org>,
Dominique Martinet <asmadeus@...ewreck.org>,
Jeff Layton <jlayton@...nel.com>,
Matthew Wilcox <willy@...radead.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Omar Sandoval <osandov@...ndov.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-afs@...ts.infradead.org, linux-nfs@...r.kernel.org,
linux-cifs@...r.kernel.org, ceph-devel@...r.kernel.org,
v9fs-developer@...ts.sourceforge.net,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [PATCH v2 24/53] fscache: Implement I/O interface
Provide three functions to perform I/O to the cache and a helper function
for use with them.

The first function allows an arbitrary asynchronous direct-IO read to be
made against a cache object, though the read should be aligned and sized
appropriately for the backing device:

	int fscache_read(struct netfs_cache_resources *cres,
			 loff_t start_pos,
			 struct iov_iter *iter,
			 enum netfs_read_from_hole read_hole,
			 netfs_io_terminated_t term_func,
			 void *term_func_priv);

The cache resources must have been previously initialised by
fscache_begin_read_operation(). A read operation is sent to the backing
filesystem, starting at start_pos within the file. The size of the read is
specified by the iterator, as is the location of the output buffer.

If there is a hole in the data, it can be ignored and the data padded with
zeros (NETFS_READ_HOLE_IGNORE), the area of the read can be excised
(NETFS_READ_HOLE_CLEAR) or -ENODATA can be returned (NETFS_READ_HOLE_FAIL).

If term_func is given, the operation may be performed asynchronously; in
that case, (*term_func)() will be called upon completion, successful or
otherwise, of the I/O operation and passed term_func_priv. If the operation
is asynchronous and doesn't complete before fscache_read() returns,
-EIOCBQUEUED will be returned.
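As a purely illustrative sketch (not part of this patch; my_read_done() and
my_read_one_page() are hypothetical), a caller that has already obtained the
cache resources might read one page's worth of data into a bvec-described
buffer something like this:

	static void my_read_done(void *priv, ssize_t transferred_or_error,
				 bool was_async)
	{
		/* transferred_or_error is a byte count or a negative errno */
	}

	static int my_read_one_page(struct netfs_cache_resources *cres,
				    struct page *page, loff_t pos)
	{
		struct bio_vec bv = { .bv_page = page, .bv_offset = 0,
				      .bv_len = PAGE_SIZE };
		struct iov_iter iter;

		iov_iter_bvec(&iter, READ, &bv, 1, PAGE_SIZE);
		/* May return -EIOCBQUEUED; my_read_done() reports completion */
		return fscache_read(cres, pos, &iter, NETFS_READ_HOLE_IGNORE,
				    my_read_done, NULL);
	}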
The second function allows an arbitrary asynchronous direct-IO write to be
made against a cache object, though the write should be aligned and sized
appropriately for the backing device:

	int fscache_write(struct netfs_cache_resources *cres,
			  loff_t start_pos,
			  struct iov_iter *iter,
			  netfs_io_terminated_t term_func,
			  void *term_func_priv);

This works in a very similar way to fscache_read(), except that there's no
need to deal with holes (they're simply overwritten).
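Again as a hypothetical sketch (my_write_done() stands in for a
netfs_io_terminated_t callback and is not something added by this patch),
writing a single page that the caller already holds might look like:

	static int my_write_one_page(struct netfs_cache_resources *cres,
				     struct page *page, loff_t pos)
	{
		struct bio_vec bv = { .bv_page = page, .bv_offset = 0,
				      .bv_len = PAGE_SIZE };
		struct iov_iter iter;

		iov_iter_bvec(&iter, WRITE, &bv, 1, PAGE_SIZE);
		return fscache_write(cres, pos, &iter, my_write_done, NULL);
	}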
The third function is a bit higher-level than the other two and allows a
write to be made from the pagecache of an inode:

	void fscache_write_to_cache(struct fscache_cookie *cookie,
				    struct address_space *mapping,
				    loff_t start,
				    size_t len,
				    loff_t i_size,
				    netfs_io_terminated_t term_func,
				    void *term_func_priv);

If cookie is NULL, this function does nothing except call (*term_func)() if
given. It assumes that, in such a case, PG_fscache will not have been set
on the pages.

Otherwise, this function requires the pages that are to be written from to
have PG_fscache set on them before it is called. start and len define the
region of the file to be modified and i_size indicates the new file size.
The source data is expected to be on pages attached to mapping.

term_func and term_func_priv work as for fscache_write(). The PG_fscache
bits will be cleared at the end of the operation, before term_func is
called or the function otherwise returns.
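To illustrate (hypothetical code, not part of this patch), a netfs writeback
path that has just sent a range of dirty pages to the server might mark those
pages and then copy the same range into the cache:

	/* Each page covering [start, start + len) must already have
	 * PG_fscache set (e.g. via set_page_fscache()); the bits are
	 * cleared again once the cache write completes.
	 */
	fscache_write_to_cache(cookie, inode->i_mapping, start, len,
			       i_size_read(inode), my_write_done, my_priv);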
There is an additional helper function to clear the PG_fscache bits from a
range of pages:

	void fscache_clear_page_bits(struct fscache_cookie *cookie,
				     struct address_space *mapping,
				     loff_t start, size_t len);

The pages to be modified are expected to be located on mapping in the range
defined by start and len. If cookie is NULL, it does nothing.
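A hypothetical error path (not part of this patch) might use it when pages
were marked in anticipation of a cache write that is then abandoned:

	/* Undo the PG_fscache marking on [start, start + len) since the
	 * write to the cache is not going to happen after all.
	 */
	fscache_clear_page_bits(cookie, inode->i_mapping, start, len);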
Signed-off-by: David Howells <dhowells@...hat.com>
cc: linux-cachefs@...hat.com
---
 fs/fscache/io.c                | 100 ++++++++++++++++++++++++++++++++
 include/linux/fscache.h        | 124 ++++++++++++++++++++++++++++++++++++++++
 include/trace/events/fscache.h |   2 +
 3 files changed, 226 insertions(+)
diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index 9c747069b923..0c74dbb91fea 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -150,3 +150,103 @@ int __fscache_begin_read_operation(struct netfs_cache_resources *cres,
fscache_access_io_read);
}
EXPORT_SYMBOL(__fscache_begin_read_operation);
+
+struct fscache_write_request {
+ struct netfs_cache_resources cache_resources;
+ struct address_space *mapping;
+ loff_t start;
+ size_t len;
+ netfs_io_terminated_t term_func;
+ void *term_func_priv;
+};
+
+void __fscache_clear_page_bits(struct address_space *mapping,
+ loff_t start, size_t len)
+{
+ pgoff_t first = start / PAGE_SIZE;
+ pgoff_t last = (start + len - 1) / PAGE_SIZE;
+ struct page *page;
+
+ if (len) {
+ XA_STATE(xas, &mapping->i_pages, first);
+
+ rcu_read_lock();
+ xas_for_each(&xas, page, last) {
+ end_page_fscache(page);
+ }
+ rcu_read_unlock();
+ }
+}
+EXPORT_SYMBOL(__fscache_clear_page_bits);
+
+/*
+ * Deal with the completion of writing the data to the cache.
+ */
+static void fscache_wreq_done(void *priv, ssize_t transferred_or_error,
+ bool was_async)
+{
+ struct fscache_write_request *wreq = priv;
+
+ fscache_clear_page_bits(fscache_cres_cookie(&wreq->cache_resources),
+ wreq->mapping, wreq->start, wreq->len);
+
+ if (wreq->term_func)
+ wreq->term_func(wreq->term_func_priv, transferred_or_error,
+ was_async);
+ fscache_end_operation(&wreq->cache_resources);
+ kfree(wreq);
+}
+
+void __fscache_write_to_cache(struct fscache_cookie *cookie,
+ struct address_space *mapping,
+ loff_t start, size_t len, loff_t i_size,
+ netfs_io_terminated_t term_func,
+ void *term_func_priv)
+{
+ struct fscache_write_request *wreq;
+ struct netfs_cache_resources *cres;
+ struct iov_iter iter;
+ int ret = -ENOBUFS;
+
+ if (!fscache_cookie_valid(cookie) || len == 0)
+ goto abandon;
+
+ _enter("%llx,%zx", start, len);
+
+ wreq = kzalloc(sizeof(struct fscache_write_request), GFP_NOFS);
+ if (!wreq)
+ goto abandon;
+ wreq->mapping = mapping;
+ wreq->start = start;
+ wreq->len = len;
+ wreq->term_func = term_func;
+ wreq->term_func_priv = term_func_priv;
+
+ cres = &wreq->cache_resources;
+ if (fscache_begin_operation(cres, cookie, FSCACHE_WANT_WRITE,
+ fscache_access_io_write) < 0)
+ goto abandon_free;
+
+ ret = cres->ops->prepare_write(cres, &start, &len, i_size, false);
+ if (ret < 0)
+ goto abandon_end;
+
+ /* TODO: Consider clearing page bits now for space the write isn't
+ * covering. This is more complicated than it appears when THPs are
+ * taken into account.
+ */
+
+ iov_iter_xarray(&iter, WRITE, &mapping->i_pages, start, len);
+ fscache_write(cres, start, &iter, fscache_wreq_done, wreq);
+ return;
+
+abandon_end:
+ return fscache_wreq_done(wreq, ret, false);
+abandon_free:
+ kfree(wreq);
+abandon:
+ fscache_clear_page_bits(cookie, mapping, start, len);
+ if (term_func)
+ term_func(term_func_priv, ret, false);
+}
+EXPORT_SYMBOL(__fscache_write_to_cache);
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index f24604f0f818..b3b625d0834c 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -172,6 +172,10 @@ extern void __fscache_invalidate(struct fscache_cookie *, const void *, loff_t,
extern int __fscache_begin_read_operation(struct netfs_cache_resources *, struct fscache_cookie *);
#endif
+extern void __fscache_write_to_cache(struct fscache_cookie *, struct address_space *,
+ loff_t, size_t, loff_t, netfs_io_terminated_t, void *);
+extern void __fscache_clear_page_bits(struct address_space *, loff_t, size_t);
+
/**
* fscache_acquire_volume - Register a volume as desiring caching services
* @volume_key: An identification string for the volume
@@ -429,6 +433,126 @@ int fscache_begin_read_operation(struct netfs_cache_resources *cres,
return -ENOBUFS;
}
+/**
+ * fscache_read - Start a read from the cache.
+ * @cres: The cache resources to use
+ * @start_pos: The beginning file offset in the cache file
+ * @iter: The buffer to fill - and also the length
+ * @read_hole: How to handle a hole in the data.
+ * @term_func: The function to call upon completion
+ * @term_func_priv: The private data for @term_func
+ *
+ * Start a read from the cache. @cres indicates the cache object to read from
+ * and must be obtained by a call to fscache_begin_operation() beforehand.
+ *
+ * The data is read into the iterator, @iter, and that also indicates the size
+ * of the operation. @start_pos is the start position in the file, though if
+ * @read_hole is set appropriately, the cache can use SEEK_DATA to find the
+ * next piece of data, writing zeros for the hole into the iterator.
+ *
+ * Upon termination of the operation, @term_func will be called and supplied
+ * with @term_func_priv plus the amount of data written, if successful, or the
+ * error code otherwise.
+ */
+static inline
+int fscache_read(struct netfs_cache_resources *cres,
+ loff_t start_pos,
+ struct iov_iter *iter,
+ enum netfs_read_from_hole read_hole,
+ netfs_io_terminated_t term_func,
+ void *term_func_priv)
+{
+ const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
+ return ops->read(cres, start_pos, iter, read_hole,
+ term_func, term_func_priv);
+}
+
+/**
+ * fscache_write - Start a write to the cache.
+ * @cres: The cache resources to use
+ * @start_pos: The beginning file offset in the cache file
+ * @iter: The data to write - and also the length
+ * @term_func: The function to call upon completion
+ * @term_func_priv: The private data for @term_func
+ *
+ * Start a write to the cache. @cres indicates the cache object to write to and
+ * must be obtained by a call to fscache_begin_operation() beforehand.
+ *
+ * The data to be written is obtained from the iterator, @iter, and that also
+ * indicates the size of the operation. @start_pos is the start position in
+ * the file.
+ *
+ * Upon termination of the operation, @term_func will be called and supplied
+ * with @term_func_priv plus the amount of data written, if successful, or the
+ * error code otherwise.
+ */
+static inline
+int fscache_write(struct netfs_cache_resources *cres,
+ loff_t start_pos,
+ struct iov_iter *iter,
+ netfs_io_terminated_t term_func,
+ void *term_func_priv)
+{
+ const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
+ return ops->write(cres, start_pos, iter, term_func, term_func_priv);
+}
+
+/**
+ * fscache_clear_page_bits - Clear the PG_fscache bits from a set of pages
+ * @cookie: The cookie representing the cache object
+ * @mapping: The netfs inode to use as the source
+ * @start: The start position in @mapping
+ * @len: The amount of data to unlock
+ *
+ * Clear the PG_fscache flag from a sequence of pages and wake up anyone who's
+ * waiting.
+ */
+static inline void fscache_clear_page_bits(struct fscache_cookie *cookie,
+ struct address_space *mapping,
+ loff_t start, size_t len)
+{
+ if (fscache_cookie_valid(cookie))
+ __fscache_clear_page_bits(mapping, start, len);
+}
+
+/**
+ * fscache_write_to_cache - Save a write to the cache and clear PG_fscache
+ * @cookie: The cookie representing the cache object
+ * @mapping: The netfs inode to use as the source
+ * @start: The start position in @mapping
+ * @len: The amount of data to write back
+ * @i_size: The new size of the inode
+ * @term_func: The function to call upon completion
+ * @term_func_priv: The private data for @term_func
+ *
+ * Helper function for a netfs to write dirty data from an inode into the cache
+ * object that's backing it.
+ *
+ * @start and @len describe the range of the data. This does not need to be
+ * page-aligned, but to satisfy DIO requirements, the cache may expand it up to
+ * the page boundaries on either end. All the pages covering the range must be
+ * marked with PG_fscache.
+ *
+ * If given, @term_func will be called upon completion and supplied with
+ * @term_func_priv. Note that the PG_fscache flags will have been cleared by
+ * this point, so the netfs must retain its own pin on the mapping.
+ */
+static inline void fscache_write_to_cache(struct fscache_cookie *cookie,
+ struct address_space *mapping,
+ loff_t start, size_t len, loff_t i_size,
+ netfs_io_terminated_t term_func,
+ void *term_func_priv)
+{
+ if (fscache_cookie_valid(cookie)) {
+ __fscache_write_to_cache(cookie, mapping, start, len, i_size,
+ term_func, term_func_priv);
+ } else {
+ fscache_clear_page_bits(cookie, mapping, start, len);
+ if (term_func)
+ term_func(term_func_priv, -ENOBUFS, false);
+ }
+
+}
#endif /* FSCACHE_USE_NEW_IO_API */
#endif /* _LINUX_FSCACHE_H */
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index 4ccaf4490b1b..63820b807494 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -78,6 +78,7 @@ enum fscache_access_trace {
fscache_access_io_not_live,
fscache_access_io_read,
fscache_access_io_wait,
+ fscache_access_io_write,
fscache_access_lookup_cookie,
fscache_access_lookup_cookie_end,
fscache_access_relinquish_volume,
@@ -146,6 +147,7 @@ enum fscache_access_trace {
EM(fscache_access_io_not_live, "END io_notl") \
EM(fscache_access_io_read, "BEGIN io_read") \
EM(fscache_access_io_wait, "WAIT io ") \
+ EM(fscache_access_io_write, "BEGIN io_writ") \
EM(fscache_access_lookup_cookie, "BEGIN lookup ") \
EM(fscache_access_lookup_cookie_end, "END lookup ") \
EM(fscache_access_relinquish_volume, "BEGIN rlq_vol") \