[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF=yD-JW+Gs+EeJk2jknU6ZL0prjRO41Q3EpVTOTpTD8sEOh6A@mail.gmail.com>
Date: Tue, 12 Sep 2023 16:48:25 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jordan Rife <jrife@...gle.com>
Cc: Daniel Borkmann <daniel@...earbox.net>, davem@...emloft.net,
Eric Dumazet <edumazet@...gle.com>, kuba@...nel.org, pabeni@...hat.com,
netdev@...r.kernel.org, dborkman@...nel.org
Subject: Re: [PATCH net] net: prevent address overwrite in connect() and sendmsg()
On Tue, Sep 12, 2023 at 4:07 PM Jordan Rife <jrife@...gle.com> wrote:
>
> > Ideally these all would use the proper kernel-internal APIs.
> > I care less about new code. If all examples are clear, that will do the right thing, or is a simple fix-up worst case.
>
> Fair enough.
>
> > The changes do seem like trivial one liners?
>
> Looks like it. Still, it's going to take some time to patch+test each
> of these. I'll start with NFS (SUNRPC), SMB, and Ceph, since I can
> easily test them.
>
> > Question is if changing all these callers is suitable for a patch targeting net.
> One patch to net will be needed to add an address copy to
> kernel_sendmsg(), but I guess I'll need to send some other patches to
> the appropriate subsystem maintainers for SUNRPC, SMB, and Ceph to
> swap out calls.
If we take this path, it could be a single patch. The subsystem
maintainers should be CC:ed so that they can (N)ACK it.
But I do not mean to ask to split it up and test each one separately.
The change from sock->ops->connect to kernel_connect is certainly
trivial enough that compile testing should suffice.
Note that kernel_connect actually uses READ_ONCE(sock->ops) because of
a data race. All callers that call a socket that may be subject to
IPV6_ADDRFORM should do that. Likely some of those open coded examples
are affected, and do not. This is another example why using a single
interface is preferable.
> > Similarly, this could call kernel_sendmsg, and the extra copy handled in that wrapper.
>
> Would it make more sense to do the address copy in sock_sendmsg()
> instead? Are kernel callers "supposed" to use kernel_sendmsg() or is
> it valid for them to use sock_sendmsg() as many of them seem to want
> to do today. Seeing as sock_sendmsg() is called by kernel_sendmsg(),
> adding the copy in sock_sendmsg() fixes this problem for callers to
> both of these and avoids the need for patching all modules that call
> sock_sendmsg().
I think it is "supposed" to be the API for these cases. But as you
show, clearly it isn't today. Practice trumps theory.
The only question is whether we should pursue your original patch and
accept that this will continue, or one that improves the situation,
but touches more files and thus has a higher risk of merge conflicts.
I'd like to give others some time to chime in. I've given my opinion,
but it's only one.
> -Jordan
>
> On Tue, Sep 12, 2023 at 12:36 PM Willem de Bruijn
> <willemdebruijn.kernel@...il.com> wrote:
> >
> > On Tue, Sep 12, 2023 at 2:31 PM Jordan Rife <jrife@...gle.com> wrote:
> > >
> > > > These should probably call kernel_connect instead.
> > > > Similarly, this could call kernel_sendmsg, and the extra copy handled in that wrapper.
> > >
> > > I was considering this approach initially, but one concern I had was
> > > that there are other instances I see of modules calling ops->connect()
> > > directly (bypassing kernel_connect()):
> > >
> > > - net/netfilter/ipvs/ip_vs_sync.c: make_send_sock()
> > > - drivers/block/drbd/drbd_receiver.c: drbd_try_connect()
> > > - drivers/infiniband/hw/erdma/erdma_cm.c: kernel_bindconnect()
> > > - drivers/infiniband/sw/siw/siw_cm.c: kernel_bindconnect()
> > > - fs/ocfs2/cluster/tcp.c: o2net_start_connect()
> > > - net/rds/tcp_connect.c: rds_tcp_conn_path_connect()
> > >
> > > and ops->sendmsg():
> > >
> > > - net/smc/af_smc.c: smc_sendmsg()
> > > - drivers/vhost/net.c: vhost_tx_batch(), handle_tx_copy(), handle_tx_zerocopy()
> > > - drivers/target/iscsi/iscsi_target_util.c: iscsit_fe_sendpage_sg()
> > >
> > > which (at least in theory) leaves them open to the same problem I'm
> > > seeing with NFS/SMB right now. I worry that even if all these
> > > instances were swapped out with kernel_sendmsg() and kernel_connect(),
> > > it would turn into a game of whac-a-mole in the future as new changes
> > > or new modules may reintroduce direct calls to sock->ops->connect() or
> > > sock->ops->sendmsg().
> >
> > Ideally these all would use the proper kernel-internal APIs.
> >
> > I care less about new code. If all examples are clear, that
> > will do the right thing, or is a simple fix-up worst case.
> >
> > Question is if changing all these callers is suitable for a patch
> > targeting net.
> >
> > The changes do seem like trivial one liners?
> >
> > > -Jordan
> > >
> > >
> > >
> > > On Tue, Sep 12, 2023 at 7:22 AM Daniel Borkmann <daniel@...earbox.net> wrote:
> > > >
> > > > On 9/12/23 3:33 PM, Willem de Bruijn wrote:
> > > > > Jordan Rife wrote:
> > > > >> commit 0bdf399342c5 ("net: Avoid address overwrite in kernel_connect")
> > > > >> ensured that kernel_connect() will not overwrite the address parameter
> > > > >> in cases where BPF connect hooks perform an address rewrite. However,
> > > > >> there remain other cases where BPF hooks can overwrite an address held
> > > > >> by a kernel client.
> > > > >>
> > > > >> ==Scenarios Tested==
> > > > >>
> > > > >> * Code in the SMB and Ceph modules calls sock->ops->connect() directly,
> > > > >> allowing the address overwrite to occur. In the case of SMB, this can
> > > > >> lead to broken mounts.
> > > > >
> > > > > These should probably call kernel_connect instead.
> > > > >
> > > > >> * NFS v3 mounts with proto=udp call sock_sendmsg() for each RPC call,
> > > > >> passing a pointer to the mount address in msg->msg_name which is
> > > > >> later overwritten by a BPF sendmsg hook. This can lead to broken NFS
> > > > >> mounts.
> > > > >
> > > > > Similarly, this could call kernel_sendmsg, and the extra copy handled
> > > > > in that wrapper. The arguments are not exacty the same, so not 100%
> > > > > this is feasible.
> > > > >
> > > > > But it's preferable if in-kernel callers use the kernel_.. API rather
> > > > > than bypass it. Exactly for issues like the one you report.
> > > >
> > > > Fully agree, if it's feasible it would be better to convert them over to
> > > > in-kernel API.
> > > >
> > > > >> In order to more comprehensively fix this class of problems, this patch
> > > > >> pushes the address copy deeper into the stack and introduces an address
> > > > >> copy to both udp_sendmsg() and udpv6_sendmsg() to insulate all callers
> > > > >> from address rewrites.
> > > > >>
> > > > >> Signed-off-by: Jordan Rife <jrife@...gle.com>
Powered by blists - more mailing lists