Message-ID: <aT-59HURCGPDUJnZ@codewreck.org>
Date: Mon, 15 Dec 2025 16:34:12 +0900
From: Dominique Martinet <asmadeus@...ewreck.org>
To: Christoph Hellwig <hch@...radead.org>
Cc: Eric Van Hensbergen <ericvh@...nel.org>,
Latchesar Ionkov <lucho@...kov.net>,
Christian Schoenebeck <linux_oss@...debyte.com>,
v9fs@...ts.linux.dev, linux-kernel@...r.kernel.org,
David Howells <dhowells@...hat.com>,
Matthew Wilcox <willy@...radead.org>, linux-fsdevel@...r.kernel.org,
Chris Arges <carges@...udflare.com>
Subject: Re: [PATCH] 9p/virtio: restrict page pinning to user_backed_iter()
iovec
Thanks for having a look
Christoph Hellwig wrote on Sun, Dec 14, 2025 at 09:55:12PM -0800:
> > Ok, I don't understand why the current code locks everything down and
> > wants to use a single scatterlist shared for the whole channel (and
> > capped to 128 pages?), it should only need to lock around the
> > virtqueue_add_sg() call, I'll need to play with that some more.
>
> What do you mean with "lock down"?
Just the odd (to me) use of chan->lock around basically all of
p9_virtio_request() and most of p9_virtio_zc_request(). I'm now pretty
sure this was just the author trying to avoid an allocation by recycling
the chan->sg array, though, so ignore this.
> > Looking at other virtio drivers I could probably use a sg_table and
> > have extract_iter_to_sg() do all the work for us...
>
> Looking at the code I'm actually really confused. Both because I
> actually thought we were talking about the 9fs direct I/O code, but
> that has actually been removed / converted to netfs a long time ago.
>
> But even more so what the net/9p code is actually doing.. How do
> we even end up with user addresses here at all?
FWIW I tried logging and saw ITER_BVEC, ITER_KVEC and ITER_FOLIOQ --
O_DIRECT writes are seen as BVEC so I guess it's not as direct as I
expected them to be -- that code could very well be leftovers from
the switch to iov_iter back in 2015...
(I'm actually not sure why Christian suggested checking for is_iovec()
in https://lkml.kernel.org/r/2245723.irdbgypaU6@weasel -- I then
generalized it to user_backed_iter() and it just worked, because that
check moved bvec and folioq out of the iov_iter_get_pages_alloc2() path
to... something that obviously should not work in my opinion but
apparently was enough to not trigger this particular BUG.)
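To make the above concrete, here is a rough userspace model of how the
check changes which path each iterator type takes. It is only a sketch:
the enum values, the path names and pick_path_before()/pick_path_after()
are made up for illustration, the "before" condition is inferred from the
description above (everything except kvec got pinned), and none of this is
the actual trans_virtio.c code. The only part that mirrors the kernel is
that user_backed_iter() is true for UBUF and IOVEC iterators only.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel's iov_iter flavours (illustrative values). */
enum iter_type { ITER_UBUF, ITER_IOVEC, ITER_BVEC, ITER_KVEC, ITER_FOLIOQ };

/* Hypothetical names for the two branches of the page-mapping helper. */
enum zc_path { PATH_PIN_PAGES, PATH_KERNEL_ADDR_LOOKUP };

/* Models the kernel helper: only UBUF and IOVEC iterators point at user memory. */
bool user_backed_iter(enum iter_type t)
{
	return t == ITER_UBUF || t == ITER_IOVEC;
}

/* Assumed old behaviour: anything that is not a kvec goes through pinning. */
enum zc_path pick_path_before(enum iter_type t)
{
	return t != ITER_KVEC ? PATH_PIN_PAGES : PATH_KERNEL_ADDR_LOOKUP;
}

/* Behaviour with the patch: only user-backed iterators get pinned. */
enum zc_path pick_path_after(enum iter_type t)
{
	return user_backed_iter(t) ? PATH_PIN_PAGES : PATH_KERNEL_ADDR_LOOKUP;
}

/* Sanity checks matching the description above. */
void check_model(void)
{
	/* bvec and folioq used to be pinned, and now fall into the kvec branch. */
	assert(pick_path_before(ITER_BVEC) == PATH_PIN_PAGES);
	assert(pick_path_after(ITER_BVEC) == PATH_KERNEL_ADDR_LOOKUP);
	assert(pick_path_before(ITER_FOLIOQ) == PATH_PIN_PAGES);
	assert(pick_path_after(ITER_FOLIOQ) == PATH_KERNEL_ADDR_LOOKUP);

	/* kvec and user-backed iterators take the same path as before. */
	assert(pick_path_before(ITER_KVEC) == pick_path_after(ITER_KVEC));
	assert(pick_path_before(ITER_IOVEC) == pick_path_after(ITER_IOVEC));
	assert(pick_path_before(ITER_UBUF) == pick_path_after(ITER_UBUF));
}
```

So in this model the only iterators whose handling changes are bvec and
folioq, which end up in a branch that was written with kvec in mind.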
> Let me try to understand things:
>
> - p9_virtio_zc_request is the only instance of the p9_trans_module
> zc_request operation.
> - zc_request only gets called by p9_client_zc_rpc
> - p9_client_zc_rpc gets called by p9_client_read_once, p9_client_write,
> p9_client_write_subreq and p9_client_readdir
>
> Let's go through these:
>
> - p9_client_write_subreq is entirely unused
Let's remove that; I'll send a patch later.
> - p9_client_readdir builds a local iov_iter_kvec
> - p9_client_read_once is only called by p9_client_read, and really
> should be marked static.
Agreed, will clean up too.
> - p9_client_read is called by v9fs_issue_read on a netfs iov_iter
> and by v9fs_dir_readdir and v9fs_fid_xattr_get on a local kvec iter
> - p9_client_write is called with an iov_iter_kvec from
> v9fs_fid_xattr_set, and with a netfs-issued iov_iter by
> v9fs_issue_write
>
> So right now except for netfs everything is on a kvec. Dave, what
> kind of iov_iter does netfs send down to the file system? I had
> a bit of a hard time reading through it, but I'd expect that any
> page pinning would be done in netfs and not below it? Why are we
> using iov_iters here and not something like a bio_vec? What is
> the fs / transport supposed to do with these iters?
>
> Ignoring the rest of the mail for now, because I suspect the outcome
> of the above might make it irrelevant, but I'll come back to it if
> needed.
(waiting for David's answer here, but as far as I see the contract
between the transport and the vfs is that the transport should handle
whatever it's being fed, so it doesn't really matter if it's a bio_vec
or an iov_iter -- ultimately virtio or whatever backend that wants to
handle zc likely won't handle bio_vec any better so it'll need
converting anyway)
Thanks,
--
Dominique Martinet | Asmadeus