[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <478ae732-161d-c692-b60a-6df11c37ac2c@gmail.com>
Date: Fri, 27 Aug 2021 09:47:26 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Vasily Averin <vvs@...tuozzo.com>,
Eric Dumazet <eric.dumazet@...il.com>,
Christoph Paasch <christoph.paasch@...il.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
David Ahern <dsahern@...nel.org>,
Jakub Kicinski <kuba@...nel.org>,
netdev <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, kernel@...nvz.org,
Julian Wiedmann <jwi@...ux.ibm.com>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>
Subject: Re: [PATCH NET-NEXT] ipv6: skb_expand_head() adjust skb->truesize
incorrectly
On 8/27/21 8:23 AM, Vasily Averin wrote:
> I asked Alexey Kuznetsov to look at this problem. Below is his answer:
> "I think the current scheme is obsolete. It was created
> when we had only two kinds of skb accounting (rmem & wmem)
> and with more kinds of accounting it just does not work.
> Even there we had ignored problems with adjusting accounting.
>
> Logically the best solution would be replacing ->destructor,
> set_owner* etc with skb_ops. Something like:
>
> struct skb_ops
> {
> void init(struct sk_buff * skb, struct skb_ops * ops, struct
> sock * owner);
> void fini(struct sk_buff * skb);
> void update(struct sk_buff * skb, int adjust);
> void inherit(struct sk_buff * skb2, struct sk_buff * skb);
> };
>
> init - is replacement for skb_set_owner_r|w
> fini - is replacement for skb_orphan
> update - is new operation to be used in places where skb->truesize changes,
> instead of awful constructions like:
>
> if (!skb->sk || skb->destructor == sock_edemux)
> skb->truesize += size - osize;
>
> Now it will look like:
>
> if (skb->ops)
> skb->ops->update(skb, size - osize);
>
> inherit - is replacement for also awful constructs like:
>
> if (skb->sk)
> skb_set_owner_w(skb2, skb->sk);
>
> Now it will be:
>
> if (skb->ops)
> skb->ops->inherit(skb2, skb);
>
> The implementation looks mostly obvious.
> Some troubles can be only with new functionality:
> update of accounting was never done before.
>
>
> More efficient, functionally equivalent, but uglier and less flexible
> alternative would be removal of ->destructor, replaced with
> a small numeric indicator of ownership:
>
> enum
> {
> SKB_OWNER_NONE, /* aka destructor == NULL */
> SKB_OWNER_WMEM, /* aka destructor == sk_wfree */
> SKB_OWNER_RMEM, /* aka destructor == sk_rfree */
> SKB_OWNER_SK, /* aka destructor == sk_edemux */
> SKB_OWNER_TCP, /* aka destructor == tcp_wfree */
> }
>
> And the same init,fini,inherit,update become functions
> w/o any inidirect calls. Not sure it is really more efficient though."
>
Well, this does not look as stable material, and would add a bunch
of indirect calls which are quite expensive these days (CONFIG_RETPOLINE=y)
I suggest we work on a fix, using existing infra, then eventually later
try to refactor if this is really bringing improvements.
A fix could simply be a revert of 0c9f227bee119 ("ipv6: use skb_expand_head in ip6_xmit")
since only IPv6 has the problem (because of arbitrary headers size)
Powered by blists - more mailing lists