Message-ID: <aRtjfN7sC6_Bv4bx@casper.infradead.org>
Date: Mon, 17 Nov 2025 18:03:40 +0000
From: Matthew Wilcox <willy@...radead.org>
To: "Darrick J. Wong" <djwong@...nel.org>
Cc: SHAURYA RANE <ssrane_b23@...vjti.ac.in>, akpm@...ux-foundation.org,
shakeel.butt@...ux.dev, eddyz87@...il.com, andrii@...nel.org,
ast@...nel.org, linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-kernel-mentees@...ts.linux.dev,
skhan@...uxfoundation.org, david.hunter.linux@...il.com,
khalid@...nel.org,
syzbot+09b7d050e4806540153d@...kaller.appspotmail.com
Subject: Re: [PATCH] mm/filemap: fix NULL pointer dereference in
do_read_cache_folio()
On Mon, Nov 17, 2025 at 08:41:55AM -0800, Darrick J. Wong wrote:
> I wondered why this whole thing opencodes kernel_read, but then I
> noticed zero fstests for it and decid*******************************
> *****.
I wondered the same thing! And the answer is that it's special BPF
stuff:
	/* if sleeping is allowed, wait for the page, if necessary */
	if (r->may_fault && (IS_ERR(r->folio) || !folio_test_uptodate(r->folio))) {
		filemap_invalidate_lock_shared(r->file->f_mapping);
		r->folio = read_cache_folio(r->file->f_mapping, file_off >> PAGE_SHIFT,
					    NULL, r->file);
		filemap_invalidate_unlock_shared(r->file->f_mapping);
	}
If 'may_fault' is set (a misnomer, since it really means "may sleep"), we
essentially do kernel_read().
Now, maybe the right thing to do here is rip out almost all of
lib/buildid.c and replace it with an iocb with IOCB_NOWAIT set (or not).
I was hesitant to suggest this earlier as it's a bit of a big ask of
someone who was just trying to submit a one-line change. But now that
"it's also shmem" has entered the picture, I'm leaning more towards this
approach anyway.
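For readers less familiar with IOCB_NOWAIT, a rough userspace analogue may
help. preadv2() with RWF_NOWAIT is the syscall-level flag that the kernel
translates into IOCB_NOWAIT on the kiocb, so "try without sleeping, and
only block if the caller allows it" can be sketched as below. This is an
illustration under stated assumptions, not proposed kernel code: the helper
name read_maybe_sleep() and its may_sleep parameter are hypothetical, and
the fallback policy is just one reasonable choice.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel or libc API): read up to `count`
 * bytes at offset `off`. The first attempt uses RWF_NOWAIT, so it
 * returns immediately if the data is not already available. Only when
 * `may_sleep` is true do we retry with an ordinary blocking pread().
 */
ssize_t read_maybe_sleep(int fd, void *buf, size_t count, off_t off,
			 bool may_sleep)
{
	struct iovec iov = { .iov_base = buf, .iov_len = count };
	ssize_t ret;

	/* Non-blocking attempt: fails with EAGAIN instead of waiting. */
	ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
	if (ret >= 0)
		return ret;

	/*
	 * EAGAIN: the read would have had to wait.
	 * EOPNOTSUPP/ENOSYS: the file or kernel does not support the flag.
	 * Anything else is a real error and is passed through.
	 */
	if (errno != EAGAIN && errno != EOPNOTSUPP && errno != ENOSYS)
		return -1;

	if (!may_sleep)
		return -1;	/* errno still describes why we gave up */

	/* The caller may sleep, so fall back to a blocking read. */
	return pread(fd, buf, count, off);
}
```

A caller that must not sleep passes may_sleep = false and treats a -1
return with errno == EAGAIN as "not available right now"; a caller that
may sleep passes true and gets ordinary read semantics.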
Looking at it though, it's a bit weird that we don't have a
kiocb_read(). It feels like __kernel_read() needs to be split in
half, like:
diff --git a/fs/read_write.c b/fs/read_write.c
index 833bae068770..a3bf962836a7 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -503,14 +503,29 @@ static int warn_unsupported(struct file *file, const char *op)
 	return -EINVAL;
 }
-ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
+ssize_t kiocb_read(struct kiocb *iocb, void *buf, size_t count)
 {
+	struct file *file = iocb->ki_filp;
 	struct kvec iov = {
 		.iov_base	= buf,
 		.iov_len	= min_t(size_t, count, MAX_RW_COUNT),
 	};
-	struct kiocb kiocb;
 	struct iov_iter iter;
+	int ret;
+
+	iov_iter_kvec(&iter, ITER_DEST, &iov, 1, iov.iov_len);
+	ret = file->f_op->read_iter(iocb, &iter);
+	if (ret > 0) {
+		fsnotify_access(file);
+		add_rchar(current, ret);
+	}
+	inc_syscr(current);
+	return ret;
+}
+
+ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
+{
+	struct kiocb kiocb;
 	ssize_t ret;
 	if (WARN_ON_ONCE(!(file->f_mode & FMODE_READ)))
@@ -526,15 +541,9 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 	init_sync_kiocb(&kiocb, file);
 	kiocb.ki_pos = pos ? *pos : 0;
-	iov_iter_kvec(&iter, ITER_DEST, &iov, 1, iov.iov_len);
-	ret = file->f_op->read_iter(&kiocb, &iter);
-	if (ret > 0) {
-		if (pos)
-			*pos = kiocb.ki_pos;
-		fsnotify_access(file);
-		add_rchar(current, ret);
-	}
-	inc_syscr(current);
+	ret = kiocb_read(&kiocb, buf, count);
+	if (pos && ret > 0)
+		*pos = kiocb.ki_pos;
 	return ret;
 }
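The shape of that split can be modelled in plain userspace C, which makes
the one behaviour that has to survive easy to check: the caller's *pos is
only updated when the read returned data. Everything in this sketch is a
toy stand-in (struct toy_iocb, toy_iocb_read() and toy_read() are
hypothetical names, and an in-memory buffer plays the role of the file);
it is meant to illustrate the layering, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Toy stand-in for struct kiocb: a data source plus a position. */
struct toy_iocb {
	const char *data;
	size_t size;
	off_t pos;
};

/*
 * Core half, analogous in shape to the proposed kiocb_read(): the
 * caller owns the iocb, and the read advances iocb->pos itself.
 */
ssize_t toy_iocb_read(struct toy_iocb *iocb, void *buf, size_t count)
{
	size_t avail;

	if (iocb->pos < 0)
		return -1;
	if ((size_t)iocb->pos >= iocb->size)
		return 0;	/* end of data */

	avail = iocb->size - (size_t)iocb->pos;
	if (count > avail)
		count = avail;
	memcpy(buf, iocb->data + iocb->pos, count);
	iocb->pos += (off_t)count;
	return (ssize_t)count;
}

/*
 * Wrapper half, analogous in shape to __kernel_read(): build a
 * temporary iocb, delegate, and copy the position back only when
 * the read actually returned data.
 */
ssize_t toy_read(const char *data, size_t size, void *buf, size_t count,
		 off_t *pos)
{
	struct toy_iocb iocb = {
		.data = data,
		.size = size,
		.pos = pos ? *pos : 0,
	};
	ssize_t ret = toy_iocb_read(&iocb, buf, count);

	if (pos && ret > 0)
		*pos = iocb.pos;
	return ret;
}
```

A caller that wants different behaviour (in the kernel case, for example,
setting IOCB_NOWAIT) would fill in its own iocb and call the core half
directly, while existing callers keep using the wrapper unchanged.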