Message-ID: <CAAVpQUAQVVwPo5fi6rCcJQhH=jqQnkNaAMrBSJju85taEiTdkQ@mail.gmail.com>
Date: Mon, 14 Jul 2025 09:23:40 -0700
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: Alexandra Winter <wintera@...ux.ibm.com>
Cc: "D. Wythe" <alibuda@...ux.alibaba.com>, Dust Li <dust.li@...ux.alibaba.com>,
Sidraya Jayagond <sidraya@...ux.ibm.com>, Wenjia Zhang <wenjia@...ux.ibm.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Mahanta Jambigi <mjambigi@...ux.ibm.com>, Tony Lu <tonylu@...ux.alibaba.com>,
Wen Gu <guwen@...ux.alibaba.com>, Simon Horman <horms@...nel.org>,
Kuniyuki Iwashima <kuni1840@...il.com>, netdev@...r.kernel.org, linux-rdma@...r.kernel.org,
linux-s390@...r.kernel.org,
syzbot+40bf00346c3fe40f90f2@...kaller.appspotmail.com,
syzbot+f22031fad6cbe52c70e7@...kaller.appspotmail.com,
syzbot+271fed3ed6f24600c364@...kaller.appspotmail.com
Subject: Re: [PATCH v1 net] smc: Fix various oops due to inet_sock type confusion.
On Mon, Jul 14, 2025 at 12:42 AM Alexandra Winter <wintera@...ux.ibm.com> wrote:
> On 11.07.25 08:07, Kuniyuki Iwashima wrote:
> > syzbot reported weird splats [0][1] in cipso_v4_sock_setattr() while
> > freeing inet_sk(sk)->inet_opt.
> >
> > The address was freed multiple times even though it was read-only memory.
> >
> > cipso_v4_sock_setattr() did nothing wrong, and the root cause was type
> > confusion.
> >
> > The cited commit made it possible to create smc_sock as an INET socket.
> >
> > The issue is that struct smc_sock does not have struct inet_sock as the
> > first member but hijacks AF_INET and AF_INET6 sk_family, which confuses
> > various places.
> >
> > In this case, inet_sock.inet_opt was actually smc_sock.clcsk_data_ready(),
> > which is an address of a function in the text segment.
> >
> > $ pahole -C inet_sock vmlinux
> > struct inet_sock {
> > ...
> > struct ip_options_rcu * inet_opt; /* 784 8 */
> >
> > $ pahole -C smc_sock vmlinux
> > struct smc_sock {
> > ...
> > void (*clcsk_data_ready)(struct sock *); /* 784 8 */
> >
> > The same issue for another field was reported before. [2][3]
> >
> > At that time, an ugly hack was suggested [4], but it makes both INET
> > and SMC code error-prone and hard to change.
> >
> > Also, yet another variant was fixed by a hacky commit 98d4435efcbf3
> > ("net/smc: prevent NULL pointer dereference in txopt_get").
> >
> > Instead of papering over the root cause by such hacks, we should not
> > allow non-INET socket to reuse the INET infra.
> >
> > Let's add inet_sock as the first member of smc_sock.
> >
> [...]
> >
> > static struct lock_class_key smc_key;
> > diff --git a/net/smc/smc.h b/net/smc/smc.h
> > index 78ae10d06ed2e..2c90849637398 100644
> > --- a/net/smc/smc.h
> > +++ b/net/smc/smc.h
> > @@ -283,10 +283,10 @@ struct smc_connection {
> > };
> >
> > struct smc_sock { /* smc sock container */
> > - struct sock sk;
> > -#if IS_ENABLED(CONFIG_IPV6)
> > - struct ipv6_pinfo *pinet6;
> > -#endif
> > + union {
> > + struct sock sk;
> > + struct inet_sock icsk_inet;
> > + };
> > struct socket *clcsock; /* internal tcp socket */
> > void (*clcsk_state_change)(struct sock *sk);
> > /* original stat_change fct. */
>
> I would like to remind us of the discussions in August 2024 around a patchset
> called "net/smc: prevent NULL pointer dereference in txopt_get".
> That discussion eventually ended up in the reduced (?)
> commit 98d4435efcbf ("net/smc: prevent NULL pointer dereference in txopt_get")
> without a union.
>
> I still think this union looks dangerous, but don't understand the code well enough to
> propose an alternative.
>
> Maybe incorporate inet_sock in smc_sock? Like Paolo suggested in
> https://lore.kernel.org/lkml/20240815043714.38772-1-aha310510@gmail.com/T/#maf6ee926f782736cb6accd2ba162dea0a34e02f9
>
> He also asked for at least some explanatory comments in the union. Which would help me as well.

I agree with Paolo that smc_sock should "eventually" have only
inet_sock as the first member, but I think this should/can be done
in net-next as a follow-up.

The thread above shows that such code churn was distracting enough
that an improper fix was introduced.
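
For readers following along, here is a minimal userspace sketch of the
layout problem and of what the union changes. The mock_* structs and the
two helper functions are hypothetical stand-ins, not the kernel
definitions; the real offsets (784 in the pahole output above) depend on
the kernel config. It assumes a C11 compiler for the anonymous union.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct sock (the real one is far larger). */
struct mock_sock {
	int sk_family;
};

/* Stand-in for struct inet_sock: sock first, then INET-only fields. */
struct mock_inet_sock {
	struct mock_sock sk;
	void *inet_opt;		/* stands in for struct ip_options_rcu * */
};

/* Layout before the patch: SMC fields start right after struct sock. */
struct mock_smc_sock_old {
	struct mock_sock sk;
	void (*clcsk_data_ready)(struct mock_sock *);
};

/* Layout after the patch: the union reserves room for all INET fields. */
struct mock_smc_sock_new {
	union {
		struct mock_sock sk;
		struct mock_inet_sock icsk_inet;
	};
	void (*clcsk_data_ready)(struct mock_sock *);
};

/* Returns 1 if code treating the old layout as inet_sock would read
 * clcsk_data_ready where it expects inet_opt.
 */
int old_layout_overlaps(void)
{
	return offsetof(struct mock_smc_sock_old, clcsk_data_ready) ==
	       offsetof(struct mock_inet_sock, inet_opt);
}

/* Returns 1 if any SMC field of the new layout still starts inside the
 * region that INET code treats as inet_sock.
 */
int new_layout_overlaps(void)
{
	return offsetof(struct mock_smc_sock_new, clcsk_data_ready) <
	       sizeof(struct mock_inet_sock);
}

/* Prints the offsets so the two layouts can be compared by eye. */
void print_layout_report(void)
{
	printf("inet_opt offset:               %zu\n",
	       offsetof(struct mock_inet_sock, inet_opt));
	printf("old clcsk_data_ready offset:   %zu\n",
	       offsetof(struct mock_smc_sock_old, clcsk_data_ready));
	printf("new clcsk_data_ready offset:   %zu\n",
	       offsetof(struct mock_smc_sock_new, clcsk_data_ready));
}
```

With the old layout, anything that casts the socket to inet_sock and
touches inet_opt is really touching the function pointer. With the union,
the SMC-private fields start after the inet_sock region, so the two no
longer alias.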