[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADKFtnTzqLw4F2m9+7BxZZW_QKm_QiduMb6to9mU1WAvbo9MWQ@mail.gmail.com>
Date: Tue, 12 Sep 2023 14:08:59 -0700
From: Jordan Rife <jrife@...gle.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Daniel Borkmann <daniel@...earbox.net>, davem@...emloft.net,
Eric Dumazet <edumazet@...gle.com>, kuba@...nel.org, pabeni@...hat.com,
netdev@...r.kernel.org, dborkman@...nel.org
Subject: Re: [PATCH net] net: prevent address overwrite in connect() and sendmsg()
> If we take this path, it could be a single patch. The subsystem
> maintainers should be CC:ed so that they can (N)ACK it.
>
> But I do not mean to ask to split it up and test each one separately.
>
> The change from sock->ops->connect to kernel_connect is certainly
> trivial enough that compile testing should suffice.
Ack. Thanks for clarifying.
> The only question is whether we should pursue your original patch and
> accept that this will continue, or one that improves the situation,
> but touches more files and thus has a higher risk of merge conflicts.
>
> I'd like to give others some time to chime in. I've given my opinion,
> but it's only one.
>
> I'd like to give others some time to chime in. I've given my opinion,
> but it's only one.
Sounds good. I'll wait to hear others' opinions on the best path forward.
-Jordan
On Tue, Sep 12, 2023 at 1:49 PM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> On Tue, Sep 12, 2023 at 4:07 PM Jordan Rife <jrife@...gle.com> wrote:
> >
> > > Ideally these all would use the proper kernel-internal APIs.
> > > I care less about new code. If all examples are clear, that will do the right thing, or is a simple fix-up worst case.
> >
> > Fair enough.
> >
> > > The changes do seem like trivial one liners?
> >
> > Looks like it. Still, it's going to take some time to patch+test each
> > of these. I'll start with NFS (SUNRPC), SMB, and Ceph, since I can
> > easily test them.
> >
> > > Question is if changing all these callers is suitable for a patch targeting net.
> > One patch to net will be needed to add an address copy to
> > kernel_sendmsg(), but I guess I'll need to send some other patches to
> > the appropriate subsystem maintainers for SUNRPC, SMB, and Ceph to
> > swap out calls.
>
> If we take this path, it could be a single patch. The subsystem
> maintainers should be CC:ed so that they can (N)ACK it.
>
> But I do not mean to ask to split it up and test each one separately.
>
> The change from sock->ops->connect to kernel_connect is certainly
> trivial enough that compile testing should suffice.
>
> Note that kernel_connect actually uses READ_ONCE(sock->ops) because of
> a data race. All callers that call a socket that may be subject to
> IPV6_ADDRFORM should do that. Likely some of those open coded examples
> are affected, and do not. This is another example why using a single
> interface is preferable.
>
> > > Similarly, this could call kernel_sendmsg, and the extra copy handled in that wrapper.
> >
> > Would it make more sense to do the address copy in sock_sendmsg()
> > instead? Are kernel callers "supposed" to use kernel_sendmsg() or is
> > it valid for them to use sock_sendmsg() as many of them seem to want
> > to do today. Seeing as sock_sendmsg() is called by kernel_sendmsg(),
> > adding the copy in sock_sendmsg() fixes this problem for callers to
> > both of these and avoids the need for patching all modules that call
> > sock_sendmsg().
>
> I think it is "supposed" to be the API for these cases. But as you
> show, clearly it isn't today. Practice trumps theory.
>
> The only question is whether we should pursue your original patch and
> accept that this will continue, or one that improves the situation,
> but touches more files and thus has a higher risk of merge conflicts.
>
> I'd like to give others some time to chime in. I've given my opinion,
> but it's only one.
>
>
>
> > -Jordan
> >
> > On Tue, Sep 12, 2023 at 12:36 PM Willem de Bruijn
> > <willemdebruijn.kernel@...il.com> wrote:
> > >
> > > On Tue, Sep 12, 2023 at 2:31 PM Jordan Rife <jrife@...gle.com> wrote:
> > > >
> > > > > These should probably call kernel_connect instead.
> > > > > Similarly, this could call kernel_sendmsg, and the extra copy handled in that wrapper.
> > > >
> > > > I was considering this approach initially, but one concern I had was
> > > > that there are other instances I see of modules calling ops->connect()
> > > > directly (bypassing kernel_connect()):
> > > >
> > > > - net/netfilter/ipvs/ip_vs_sync.c: make_send_sock()
> > > > - drivers/block/drbd/drbd_receiver.c: drbd_try_connect()
> > > > - drivers/infiniband/hw/erdma/erdma_cm.c: kernel_bindconnect()
> > > > - drivers/infiniband/sw/siw/siw_cm.c: kernel_bindconnect()
> > > > - fs/ocfs2/cluster/tcp.c: o2net_start_connect()
> > > > - net/rds/tcp_connect.c: rds_tcp_conn_path_connect()
> > > >
> > > > and ops->sendmsg():
> > > >
> > > > - net/smc/af_smc.c: smc_sendmsg()
> > > > - drivers/vhost/net.c: vhost_tx_batch(), handle_tx_copy(), handle_tx_zerocopy()
> > > > - drivers/target/iscsi/iscsi_target_util.c: iscsit_fe_sendpage_sg()
> > > >
> > > > which (at least in theory) leaves them open to the same problem I'm
> > > > seeing with NFS/SMB right now. I worry that even if all these
> > > > instances were swapped out with kernel_sendmsg() and kernel_connect(),
> > > > it would turn into a game of whac-a-mole in the future as new changes
> > > > or new modules may reintroduce direct calls to sock->ops->connect() or
> > > > sock->ops->sendmsg().
> > >
> > > Ideally these all would use the proper kernel-internal APIs.
> > >
> > > I care less about new code. If all examples are clear, that
> > > will do the right thing, or is a simple fix-up worst case.
> > >
> > > Question is if changing all these callers is suitable for a patch
> > > targeting net.
> > >
> > > The changes do seem like trivial one liners?
> > >
> > > > -Jordan
> > > >
> > > >
> > > >
> > > > On Tue, Sep 12, 2023 at 7:22 AM Daniel Borkmann <daniel@...earbox.net> wrote:
> > > > >
> > > > > On 9/12/23 3:33 PM, Willem de Bruijn wrote:
> > > > > > Jordan Rife wrote:
> > > > > >> commit 0bdf399342c5 ("net: Avoid address overwrite in kernel_connect")
> > > > > >> ensured that kernel_connect() will not overwrite the address parameter
> > > > > >> in cases where BPF connect hooks perform an address rewrite. However,
> > > > > >> there remain other cases where BPF hooks can overwrite an address held
> > > > > >> by a kernel client.
> > > > > >>
> > > > > >> ==Scenarios Tested==
> > > > > >>
> > > > > >> * Code in the SMB and Ceph modules calls sock->ops->connect() directly,
> > > > > >> allowing the address overwrite to occur. In the case of SMB, this can
> > > > > >> lead to broken mounts.
> > > > > >
> > > > > > These should probably call kernel_connect instead.
> > > > > >
> > > > > >> * NFS v3 mounts with proto=udp call sock_sendmsg() for each RPC call,
> > > > > >> passing a pointer to the mount address in msg->msg_name which is
> > > > > >> later overwritten by a BPF sendmsg hook. This can lead to broken NFS
> > > > > >> mounts.
> > > > > >
> > > > > > Similarly, this could call kernel_sendmsg, and the extra copy handled
> > > > > > in that wrapper. The arguments are not exacty the same, so not 100%
> > > > > > this is feasible.
> > > > > >
> > > > > > But it's preferable if in-kernel callers use the kernel_.. API rather
> > > > > > than bypass it. Exactly for issues like the one you report.
> > > > >
> > > > > Fully agree, if it's feasible it would be better to convert them over to
> > > > > in-kernel API.
> > > > >
> > > > > >> In order to more comprehensively fix this class of problems, this patch
> > > > > >> pushes the address copy deeper into the stack and introduces an address
> > > > > >> copy to both udp_sendmsg() and udpv6_sendmsg() to insulate all callers
> > > > > >> from address rewrites.
> > > > > >>
> > > > > >> Signed-off-by: Jordan Rife <jrife@...gle.com>
Powered by blists - more mailing lists