Message-ID: <CAOi1vP9y7-nMye8u82+O-FxoAPecbecasfY0=yH3TvQYYyCEtA@mail.gmail.com>
Date: Tue, 27 Jun 2023 15:25:59 +0200
From: Ilya Dryomov <idryomov@...il.com>
To: David Howells <dhowells@...hat.com>
Cc: netdev@...r.kernel.org, Xiubo Li <xiubli@...hat.com>,
Jeff Layton <jlayton@...nel.org>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Jens Axboe <axboe@...nel.dk>, Matthew Wilcox <willy@...radead.org>, ceph-devel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v2] libceph: Partially revert changes to support MSG_SPLICE_PAGES

On Mon, Jun 26, 2023 at 11:05 PM David Howells <dhowells@...hat.com> wrote:
>
> Fix the mishandling of MSG_DONTWAIT and also reinstate the per-page
> checking of the source pages (which might have come from a DIO write by
> userspace) by partially reverting the changes to support MSG_SPLICE_PAGES
> and doing things a little differently. In messenger_v1:
>
> (1) The ceph_tcp_sendpage() is resurrected and the callers reverted to use
> that.
>
> (2) The callers now pass MSG_MORE unconditionally. Previously, they were
> passing in MSG_MORE|MSG_SENDPAGE_NOTLAST and then degrading that to
> just MSG_MORE on the last call to ->sendpage().
>
> (3) Make ceph_tcp_sendpage() a wrapper around sendmsg() rather than
> sendpage(), setting MSG_SPLICE_PAGES if sendpage_ok() returns true on
> the page.
>
> In messenger_v2:
>
> (4) Bring back do_try_sendpage() and make the callers use that.
>
> (5) Make do_try_sendpage() use sendmsg() for both cases and set
> MSG_SPLICE_PAGES if sendpage_ok() returns true.
>
> Fixes: 40a8c17aa770 ("ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage")
> Fixes: fa094ccae1e7 ("ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()")
> Reported-by: Ilya Dryomov <idryomov@...il.com>
> Link: https://lore.kernel.org/r/CAOi1vP9vjLfk3W+AJFeexC93jqPaPUn2dD_4NrzxwoZTbYfOnw@mail.gmail.com/
> Link: https://lore.kernel.org/r/CAOi1vP_Bn918j24S94MuGyn+Gxk212btw7yWeDrRcW1U8pc_BA@mail.gmail.com/
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Ilya Dryomov <idryomov@...il.com>
> cc: Xiubo Li <xiubli@...hat.com>
> cc: Jeff Layton <jlayton@...nel.org>
> cc: "David S. Miller" <davem@...emloft.net>
> cc: Eric Dumazet <edumazet@...gle.com>
> cc: Jakub Kicinski <kuba@...nel.org>
> cc: Paolo Abeni <pabeni@...hat.com>
> cc: Jens Axboe <axboe@...nel.dk>
> cc: Matthew Wilcox <willy@...radead.org>
> cc: ceph-devel@...r.kernel.org
> cc: netdev@...r.kernel.org
> Link: https://lore.kernel.org/r/3101881.1687801973@warthog.procyon.org.uk/ # v1
> ---
> Notes:
> ver #2)
> - Removed mention of MSG_SENDPAGE_NOTLAST in comments.
> - Changed some refs to sendpage to MSG_SPLICE_PAGES in comments.
> - Init msg_iter in ceph_tcp_sendpage().
> - Move setting of MSG_SPLICE_PAGES in do_try_sendpage() next to comment
> and adjust how it is cleared.
>
> net/ceph/messenger_v1.c | 58 ++++++++++++++++++++-----------
> net/ceph/messenger_v2.c | 88 ++++++++++++++++++++++++++++++++++++++----------
> 2 files changed, 107 insertions(+), 39 deletions(-)
>
> diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
> index 814579f27f04..51a6f28aa798 100644
> --- a/net/ceph/messenger_v1.c
> +++ b/net/ceph/messenger_v1.c
> @@ -74,6 +74,39 @@ static int ceph_tcp_sendmsg(struct socket *sock, struct kvec *iov,
>         return r;
>  }
>
> +/*
> + * @more: MSG_MORE or 0.
> + */
> +static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
> +                            int offset, size_t size, int more)
> +{
> +       struct msghdr msg = {
> +               .msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL | more,
> +       };
> +       struct bio_vec bvec;
> +       int ret;
> +
> +       /*
> +        * MSG_SPLICE_PAGES cannot properly handle pages with page_count == 0,
> +        * we need to fall back to sendmsg if that's the case.
> +        *
> +        * Same goes for slab pages: skb_can_coalesce() allows
> +        * coalescing neighboring slab objects into a single frag which
> +        * triggers one of hardened usercopy checks.
> +        */
> +       if (sendpage_ok(page))
> +               msg.msg_flags |= MSG_SPLICE_PAGES;
> +
> +       bvec_set_page(&bvec, page, size, offset);
> +       iov_iter_bvec(&msg.msg_iter, ITER_DEST, &bvec, 1, size);
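
Hi David,

Shouldn't this be ITER_SOURCE? The bvec describes the page being sent, so
for sendmsg() the iterator is the source of the data, not a destination.
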
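To illustrate why the direction matters, here is a minimal userspace
sketch, not kernel code. It assumes that a copy-from-iterator helper
refuses to read from an iterator that is not marked as a data source,
which is what one would expect of the copy fallback taken when
MSG_SPLICE_PAGES is not set. All model_* names are hypothetical; only the
ITER_SOURCE/ITER_DEST values are meant to mirror include/linux/uio.h.

```c
/* Userspace model only: model_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Direction values, assumed to mirror include/linux/uio.h. */
#define ITER_DEST	0	/* iterator receives data (read side) */
#define ITER_SOURCE	1	/* iterator supplies data (write side) */

struct model_iter {
	int data_source;	/* ITER_SOURCE or ITER_DEST */
	const char *buf;	/* backing buffer */
	size_t count;		/* bytes remaining */
};

static void model_iter_init(struct model_iter *i, int direction,
			    const char *buf, size_t count)
{
	i->data_source = direction;
	i->buf = buf;
	i->count = count;
}

/*
 * Model of the copy fallback in a send path: data is taken out of the
 * iterator only if it was set up as a source.  Returns bytes copied.
 */
static size_t model_copy_from_iter(char *dst, size_t len,
				   struct model_iter *i)
{
	if (!i->data_source)
		return 0;	/* wrong direction: nothing is copied */
	if (len > i->count)
		len = i->count;
	memcpy(dst, i->buf, len);
	i->buf += len;
	i->count -= len;
	return len;
}
```

In this model, an iterator initialised with ITER_SOURCE over a 4-byte
buffer yields 4 bytes, while the same buffer initialised with ITER_DEST
yields none.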
Thanks,
Ilya