[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAHC9VhQZ+k1J0UidJ-bgdBGBuVX9M18tQ+a+fuqXQM_L-PFvzA@mail.gmail.com>
Date: Wed, 12 Feb 2025 11:15:41 -0500
From: Paul Moore <paul@...l-moore.com>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: Eric Dumazet <edumazet@...gle.com>, netdev@...r.kernel.org,
Paolo Abeni <pabeni@...hat.com>, Neal Cardwell <ncardwell@...gle.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>, David Ahern <dsahern@...nel.org>, Xiumei Mu <xmu@...hat.com>,
linux-security-module@...r.kernel.org
Subject: Re: [PATCH net] tcp: drop skb extensions before skb_attempt_defer_free
On Tue, Feb 11, 2025 at 1:51 PM Sabrina Dubroca <sd@...asysnail.net> wrote:
> 2025-02-10, 17:24:43 +0100, Eric Dumazet wrote:
> > On Mon, Feb 10, 2025 at 5:02 PM Sabrina Dubroca <sd@...asysnail.net> wrote:
> > >
> > > Xiumei reported hitting the WARN in xfrm6_tunnel_net_exit while
> > > running tests that boil down to:
> > > - create a pair of netns
> > > - run a basic TCP test over ipcomp6
> > > - delete the pair of netns
> > >
> > > The xfrm_state found on spi_byaddr was not deleted at the time we
> > > delete the netns, because we still have a reference on it. This
> > > lingering reference comes from a secpath (which holds a ref on the
> > > xfrm_state), which is still attached to an skb. This skb is not
> > > leaked, it ends up on sk_receive_queue and then gets defer-free'd by
> > > skb_attempt_defer_free.
> > >
> > > The problem happens when we defer freeing an skb (push it on one CPU's
> > > defer_list), and don't flush that list before the netns is deleted. In
> > > that case, we still have a reference on the xfrm_state that we don't
> > > expect at this point.
> > >
> > > tcp_eat_recv_skb is currently the only caller of skb_attempt_defer_free,
> > > so I'm fixing it here. This patch also adds a DEBUG_NET_WARN_ON_ONCE
> > > in skb_attempt_defer_free, to make sure we don't re-introduce this
> > > problem.
> > >
> > > Fixes: 68822bdf76f1 ("net: generalize skb freeing deferral to per-cpu lists")
> > > Reported-by: Xiumei Mu <xmu@...hat.com>
> > > Signed-off-by: Sabrina Dubroca <sd@...asysnail.net>
> > > ---
> > > A few comments:
> > > - AFAICT this could not happen before 68822bdf76f1, since we would
> > > have emptied the (per-socket) defer_list before getting to ->exit()
> > > for the netns
> > > - I thought about dropping the extensions at the same time as we
> > > already drop the dst, but Paolo said this is probably not correct due
> > > to IP_CMSG_PASSSEC
> >
> > I think we discussed this issue in the past.
> >
> > Are you sure IP_CMSG_PASSSEC is ever used by TCP ?
>
> After checking, I don't think so. The only way TCP can get to
> IP_CMSG_PASSSEC is through the error queue, so it shouldn't matter.
>
> The original commit (2c7946a7bf45 ("[SECURITY]: TCP/UDP getpeersec"))
> also says that TCP should be using SO_PEERSEC for that purpose
> (although likely based on the secpath as well, but not packet per
> packet).
>
> Based on the chat you had with Paul Moore back in November, it seems
> any point after tcp_filter should be fine:
> https://lore.kernel.org/netdev/CAHC9VhS3yuwrOPcH5_iRy50O_TtBCh_OVWHZgzfFTYqyfrw_zQ@mail.gmail.com
[NOTE: CC'ing the LSM so others are aware of this discussion]
I can't say I've got the full context for this discussion, but my
comments from last November should still hold true: if a packet/skb is
still "alive" and able to be accessed by a task then we need to
preserve it from a LSM perspective.
As far as IP_CMSG_PASSSEC is concerned, while I would normally expect
TCP/stream connections to use the SO_PEERSEC method to determine the
peer security information, unless there is something in the stack or
socket API that explicitly prevents the use of IP_CMSG_PASSSEC on
UDP/datagram packets I would imagine it is possible that there is an
application somewhere which makes use of it. The LSM framework, and
all of the LSMs that implement the IP_CMSG_PASSSEC hook callback,
should work just fine regardless of if the packet is TCP or UDP.
--
paul-moore.com
Powered by blists - more mailing lists