Message-ID: <20210921142533.1403e537@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Tue, 21 Sep 2021 14:25:33 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Vasily Averin <vvs@...tuozzo.com>
Cc: Eric Dumazet <eric.dumazet@...il.com>, netdev@...r.kernel.org,
Christoph Paasch <christoph.paasch@...il.com>,
Hao Sun <sunhao.th@...il.com>, kernel@...nvz.org
Subject: Re: [RFC net v7] net: skb_expand_head() adjust skb->truesize incorrectly

On Tue, 21 Sep 2021 09:36:26 +0300 Vasily Averin wrote:
> >> However I think we can do it later,
> >> right now we need to somehow fix the broken skb_expand_head(),
> >> please take a look at v8.
> >
> > I think v8 still has the issue that Eric was explaining over and over.
>
> I missed the sock_edemux check, however I do not see any other issues.
> Could you please explain what problem you are talking about?
>
> Eric said:
> "it is not valid to call skb_set_owner_w(skb, sk) on all kind of sockets",
> because socket might have been closed already.
>
> Before the call we have the old skb with an sk reference, so sk is not closed yet
> and has a nonzero sk->sk_wmem_alloc.
>
> During the call, skb_set_owner_w calls skb_orphan, which calls the old skb's destructor.
> Yes, it can drop the last sk reference and release the socket,
> and I think this is exactly the problem that Eric was pointing out:
> after that, accessing sk is unsafe.
>
> However it can be prevented in at least 2 ways:
> a) clone old skb and call skb_set_owner_w(nskb, sk) before skb_consume(oskb).
> In this case, skb_orphan does not call old destructor, because at this point
> nskb->sk = NULL and nskb->destructor = NULL, and sk reference is kept by oskb.
> This is widely used in current code (ppp_xmit, ipip6_tunnel_xmit,
> ip_vs_prepare_tunneled_skb and so on).
> This is used in v8 too.
> b) Alternatively, extra refs on sk->sk_wmem_alloc and sk->sk_refcnt can be
> carefully taken before the skb_set_owner_w() call. These references prevent
> sk from being released while the old destructor runs.
> This was used in v6, and I think this should work correctly too.
>
> Could you please explain where I am wrong?
> Are you talking about some other issue perhaps?

I'm not particularly interested in being part of the arguing here.
If Eric acks your code it will be applied. I can do my cleanups on top.
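
For readers following the quoted (a)/(b) discussion, here is a minimal userspace sketch of the ordering problem. It is not kernel code and not taken from any version of the patch: the structs only mimic the fields named in the thread, every model_* and demo_* name is hypothetical, and a single counter stands in for both sk->sk_wmem_alloc and sk->sk_refcnt. It assumes the socket is already closed, so the old skb's charge is the last thing keeping it alive.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the fields the thread mentions. */
struct sock {
    int wmem_alloc;      /* stands in for sk->sk_wmem_alloc */
    int freed;           /* set when the last charge is dropped */
    int used_after_free; /* set if the model touches sk after "freeing" it */
};

struct sk_buff {
    struct sock *sk;
    void (*destructor)(struct sk_buff *skb);
    int truesize;
};

/* Models a write-side destructor: uncharge the skb; the last uncharge releases sk. */
static void model_sock_wfree(struct sk_buff *skb)
{
    struct sock *sk = skb->sk;

    assert(sk->wmem_alloc >= skb->truesize);
    sk->wmem_alloc -= skb->truesize;
    if (sk->wmem_alloc == 0)
        sk->freed = 1;
}

/* Models skb_orphan(): run the destructor, then detach the skb from its socket. */
static void model_skb_orphan(struct sk_buff *skb)
{
    if (skb->destructor)
        skb->destructor(skb);
    skb->destructor = NULL;
    skb->sk = NULL;
}

/* Models skb_set_owner_w(): orphan first, then charge the skb to sk. */
static void model_skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
{
    model_skb_orphan(skb);
    if (sk->freed)
        sk->used_after_free = 1; /* in the kernel this would be a use-after-free */
    skb->sk = sk;
    skb->destructor = model_sock_wfree;
    sk->wmem_alloc += skb->truesize;
}

/* Build a socket whose only remaining reference is the charge held by oskb. */
static void setup(struct sock *sk, struct sk_buff *oskb)
{
    sk->wmem_alloc = 256;
    sk->freed = 0;
    sk->used_after_free = 0;
    oskb->sk = sk;
    oskb->destructor = model_sock_wfree;
    oskb->truesize = 256;
}

/* Re-owning the old skb directly: its destructor drops the last reference. */
int demo_unsafe(void)
{
    struct sock sk;
    struct sk_buff oskb;

    setup(&sk, &oskb);
    model_skb_set_owner_w(&oskb, &sk);
    return sk.used_after_free;
}

/* Option (a): charge a fresh nskb first, then release the old skb. */
int demo_option_a(void)
{
    struct sock sk;
    struct sk_buff oskb;
    struct sk_buff nskb = { NULL, NULL, 512 };

    setup(&sk, &oskb);
    model_skb_set_owner_w(&nskb, &sk); /* orphan is a no-op: nskb has no owner */
    model_skb_orphan(&oskb);           /* stands in for consuming oskb */
    return sk.used_after_free || sk.freed;
}

/* Option (b): hold an extra reference across the call, then drop it. */
int demo_option_b(void)
{
    struct sock sk;
    struct sk_buff oskb;

    setup(&sk, &oskb);
    sk.wmem_alloc += 1;                /* extra reference */
    model_skb_set_owner_w(&oskb, &sk);
    sk.wmem_alloc -= 1;                /* drop it once the new charge is in place */
    return sk.used_after_free || sk.freed;
}
```

In this model demo_unsafe() reports a touch of a released socket, while both demo_option_a() and demo_option_b() leave the socket alive. The sketch says nothing about the other socket types Eric raised, which is a separate question from the ordering shown here.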