[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF=yD-KRbSeTugZ-tQKrY0BuchJeNjr5c=KZ=4QTr-zk9zWbHQ@mail.gmail.com>
Date: Thu, 14 Sep 2023 20:16:56 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jordan Rife <jrife@...gle.com>
Cc: Paolo Abeni <pabeni@...hat.com>, Daniel Borkmann <daniel@...earbox.net>, davem@...emloft.net,
Eric Dumazet <edumazet@...gle.com>, kuba@...nel.org, netdev@...r.kernel.org,
dborkman@...nel.org
Subject: Re: [PATCH net] net: prevent address overwrite in connect() and sendmsg()
On Thu, Sep 14, 2023 at 2:41 PM Jordan Rife <jrife@...gle.com> wrote:
>
> > 1) Swap out calls to sock->ops->connect() with kernel_connect()
>
> This is trivial, as expected. I have a patch ready that swaps out all
> occurrences of sock->ops->connect().
>
> > 2) Move the address copy to kernel_sendmsg()
> > 3) Swap out calls to sock_sendmsg()/sock->ops->sendmsg() with kernel_sendmsg()
>
> This turns out to be less trivial. kernel_sensmsg() looks to be a
> special case of sock_sendmsg() with sock_sendmsg() being the more
> generic of the two:
>
> int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
> struct kvec *vec, size_t num, size_t size)
> {
> iov_iter_kvec(&msg->msg_iter, ITER_SOURCE, vec, num, size);
> return sock_sendmsg(sock, msg);
> }
>
> It populates msg->msg_iter with a kvec whereas most cases I could find
> where sock_sendmsg() is used are using a bio_vec. Some examples:
>
> ==drivers/iscsi/iscsi_tcp.c: iscsi_sw_tcp_xmit_segment()==
> iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, copy);
>
> r = sock_sendmsg(sk, &msg);
>
> ==fs/ocfs2/cluster: o2net_sendpage()==
> bvec_set_virt(&bv, virt, size);
> iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, size);
>
> while (1) {
> msg.msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES;
> mutex_lock(&sc->sc_send_lock);
> ret = sock_sendmsg(sc->sc_sock, &msg);
>
> ==net/sunrpc/svcsock.c: svc_udp_sendto()==
> iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
> count, 0);
> err = sock_sendmsg(svsk->sk_sock, &msg);
> if (err == -ECONNREFUSED) {
> /* ICMP error on earlier request. */
> iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
> count, 0);
> err = sock_sendmsg(svsk->sk_sock, &msg);
> }
>
> Maybe these two types are more interchangeable than I'm thinking, but
> it seems like it might be simpler to just do the address copy inside
> sock_sendmsg(). Does this revised plan sound reasonable:
>
> 1) Swap out calls to sock->ops->connect() with kernel_connect()
> 2) Move the address copy to sock_sendmsg()
That also covers normal user code that is already protected.
How about a __kernel_sendmsg that follows the sock_sendmsg API and is
a thin wrapper exactly for in-kernel callers.
> I also noticed that BPF hooks inside bind() can rewrite the bind
> address. Should we do something similar for kernel_bind:
>
> 1) Add an address copy to kernel_bind()
> 2) Swap out direct calls to ops->bind() with kernel_bind()
Good catch. Sounds like it.
At that point it might be worth splitting into three patches.
> -Jordan
>
> On Thu, Sep 14, 2023 at 1:24 AM Paolo Abeni <pabeni@...hat.com> wrote:
> >
> > On Wed, 2023-09-13 at 10:02 -0400, Willem de Bruijn wrote:
> > > On Tue, Sep 12, 2023 at 5:09 PM Jordan Rife <jrife@...gle.com> wrote:
> > > >
> > > > > If we take this path, it could be a single patch. The subsystem
> > > > > maintainers should be CC:ed so that they can (N)ACK it.
> > > > >
> > > > > But I do not mean to ask to split it up and test each one separately.
> > > > >
> > > > > The change from sock->ops->connect to kernel_connect is certainly
> > > > > trivial enough that compile testing should suffice.
> > > >
> > > > Ack. Thanks for clarifying.
> > > >
> > > > > The only question is whether we should pursue your original patch and
> > > > > accept that this will continue, or one that improves the situation,
> > > > > but touches more files and thus has a higher risk of merge conflicts.
> > > > >
> > > > > I'd like to give others some time to chime in. I've given my opinion,
> > > > > but it's only one.
> > > > >
> > > > > I'd like to give others some time to chime in. I've given my opinion,
> > > > > but it's only one.
> > > >
> > > > Sounds good. I'll wait to hear others' opinions on the best path forward.
> > >
> > > No other comments so far.
> > >
> > > My hunch is that a short list of these changes
> > >
> > > ```
> > > @@ -1328,7 +1328,7 @@ static int kernel_bindconnect(struct socket *s,
> > > struct sockaddr *laddr,
> > > if (rv < 0)
> > > return rv;
> > >
> > > - rv = s->ops->connect(s, raddr, size, flags);
> > > + rv = kernel_connect(s, raddr, size, flags);
> > > ```
> > >
> > > is no more invasive than your proposed patch, and gives a more robust outcome.
> > >
> > > Please take a stab.
> >
> > I'm sorry for the late feedback. For the records, I agree the cleanest
> > fix described above should be attempted first.
> >
> > Thanks,
> >
> > Paolo
> >
Powered by blists - more mailing lists