netdev - Re: [PATCH v2 net-next 08/14] ipv6: Add hop-by-hop header to jumbograms in ip6


Message-ID: <CANn89iL2gXRsnC20a+=YJ+Ug=3x_jacmtL+S269_0g+E0wDYSQ@mail.gmail.com>
Date:   Fri, 4 Mar 2022 09:47:52 -0800
From:   Eric Dumazet <edumazet@...gle.com>
To:     David Ahern <dsahern@...nel.org>
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        netdev <netdev@...r.kernel.org>, Coco Li <lixiaoyan@...gle.com>,
        Alexander Duyck <alexanderduyck@...com>
Subject: Re: [PATCH v2 net-next 08/14] ipv6: Add hop-by-hop header to
 jumbograms in ip6_output

On Thu, Mar 3, 2022 at 8:33 PM David Ahern <dsahern@...nel.org> wrote:
>
> On 3/3/22 11:16 AM, Eric Dumazet wrote:
> > From: Coco Li <lixiaoyan@...gle.com>
> >
> > Instead of simply forcing a 0 payload_len in IPv6 header,
> > implement RFC 2675 and insert a custom extension header.
> >
> > Note that only the TCP stack can currently generate
> > jumbograms, and that this extension header is purely local:
> > it won't be sent on a physical link.
> >
> > This is needed so that packet capture (tcpdump and friends)
> > can properly dissect these large packets.
> >
>
>
> I am fairly certain I know how you are going to respond, but I will ask
> this anyways :-) :
>
> The networking stack as it stands today does not care that skb->len >
> 64kB and nothing stops a driver from setting max gso size to be > 64kB.
> Sure, packet socket apps (tcpdump) get confused but if the h/w supports
> the larger packet size it just works.

Observability is key. "just works" is a bold claim.

>
> The jumbogram header is getting added at the L3/IPv6 layer and then
> removed by the drivers before pushing to hardware. So, the only benefit
> of the push and pop of the jumbogram header is for packet sockets and
> tc/ebpf programs - assuming those programs understand the header
> (tcpdump (libpcap?) yes, random packet socket program maybe not). Yes,
> it is a standard header so apps have a chance to understand the larger
> packet size, but what is the likelihood that random apps or even ebpf
> programs will understand it?

Can you explain what exactly you mean by "random apps"?
TCP does not expose individual packet lengths to user space.



>
> Alternative solutions to the packet socket (ebpf programs have access to
> skb->len) problem would allow IPv4 to join the Big TCP party. I am
> wondering how feasible an alternative solution is to get large packet
> sizes across the board with less overhead and changes.

You know, I think I already answered this question 6 months ago.

We need extra metadata recording how much TCP payload is in a packet,
on both the RX and TX sides.

Adding an skb field for that was not an option for me.

Adding an 8-byte header is basically free: the headers already have to be
in the CPU caches when the header is added or removed.

This is zero cost on current CPUs, compared to the gains.
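
For readers who do not have RFC 2675 at hand, here is a minimal sketch of what that 8-byte header looks like and how a capture tool could recover the real length from it. The option type (0xC2), option length (4) and field order come from RFC 2675; the function and macro names are hypothetical and are not taken from the patch. The parser assumes the Jumbo Payload option is the first option in the Hop-by-Hop header, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from RFC 2675 / RFC 8200; the names are illustrative only. */
#define IPV6_FIXED_HDR_LEN  40
#define NEXTHDR_HOP_BY_HOP  0     /* Next Header value for Hop-by-Hop Options */
#define OPT_JUMBO_PAYLOAD   0xC2  /* Jumbo Payload option type */
#define OPT_JUMBO_DATA_LEN  4     /* option data is one 32-bit length */
#define JUMBO_HDR_LEN       8

static_assert(JUMBO_HDR_LEN % 8 == 0, "extension headers are multiples of 8 octets");

/*
 * Write the 8-byte Hop-by-Hop header carrying a Jumbo Payload option.
 * payload_len counts everything after the fixed IPv6 header, including
 * these 8 bytes, and is stored in network byte order.
 */
static void jumbo_hdr_write(uint8_t out[JUMBO_HDR_LEN], uint8_t next_hdr,
                            uint32_t payload_len)
{
    out[0] = next_hdr;            /* protocol that follows, e.g. 6 for TCP */
    out[1] = 0;                   /* Hdr Ext Len: 8-octet units past the first 8 */
    out[2] = OPT_JUMBO_PAYLOAD;
    out[3] = OPT_JUMBO_DATA_LEN;
    out[4] = (uint8_t)(payload_len >> 24);
    out[5] = (uint8_t)(payload_len >> 16);
    out[6] = (uint8_t)(payload_len >> 8);
    out[7] = (uint8_t)payload_len;
}

/*
 * Return the payload length of an IPv6 packet as a dissector might:
 * use the 16-bit Payload Length field when it is nonzero, otherwise
 * look for the Jumbo Payload option. Returns 0 if neither is usable.
 */
static uint32_t ipv6_payload_len(const uint8_t *pkt, size_t cap_len)
{
    if (cap_len < IPV6_FIXED_HDR_LEN)
        return 0;

    uint32_t len16 = ((uint32_t)pkt[4] << 8) | pkt[5];
    if (len16 != 0)
        return len16;             /* ordinary packet, at most 65535 bytes */

    if (pkt[6] != NEXTHDR_HOP_BY_HOP ||
        cap_len < IPV6_FIXED_HDR_LEN + JUMBO_HDR_LEN)
        return 0;

    const uint8_t *hbh = pkt + IPV6_FIXED_HDR_LEN;
    if (hbh[2] != OPT_JUMBO_PAYLOAD || hbh[3] != OPT_JUMBO_DATA_LEN)
        return 0;

    return ((uint32_t)hbh[4] << 24) | ((uint32_t)hbh[5] << 16) |
           ((uint32_t)hbh[6] << 8)  |  (uint32_t)hbh[7];
}
```

The point of the sketch is only that inserting or reading the header touches a handful of bytes sitting right next to the IPv6 header that is being handled anyway.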

I think you are focusing on the TSO side, which is only 25% of the possible gains
that BIG TCP was seeking.

We covered both RX and TX with a common mechanism.