[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADvbK_dQUpDa5oCo-o5DkKNY498gWwsan+RTpb9yTrg7DNRc+g@mail.gmail.com>
Date: Wed, 18 Jan 2023 20:18:07 -0500
From: Xin Long <lucien.xin@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: David Ahern <dsahern@...il.com>,
network dev <netdev@...r.kernel.org>, davem@...emloft.net,
kuba@...nel.org, Paolo Abeni <pabeni@...hat.com>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Pravin B Shelar <pshelar@....org>,
Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>,
Pablo Neira Ayuso <pablo@...filter.org>,
Florian Westphal <fw@...len.de>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Ilya Maximets <i.maximets@....org>,
Aaron Conole <aconole@...hat.com>,
Roopa Prabhu <roopa@...dia.com>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Mahesh Bandewar <maheshb@...gle.com>,
Paul Moore <paul@...l-moore.com>,
Guillaume Nault <gnault@...hat.com>
Subject: Re: [PATCH net-next 09/10] netfilter: get ipv6 pktlen properly in length_mt6
On Tue, Jan 17, 2023 at 10:47 AM Xin Long <lucien.xin@...il.com> wrote:
>
> On Mon, Jan 16, 2023 at 3:37 PM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Mon, Jan 16, 2023 at 8:10 PM Xin Long <lucien.xin@...il.com> wrote:
> > >
> > > On Mon, Jan 16, 2023 at 11:02 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > > >
> > > > On Mon, Jan 16, 2023 at 4:08 PM David Ahern <dsahern@...il.com> wrote:
> > > > >
> > > >
> > > > > not sure why you think it would not be detected. Today's model for gro
> > > > > sets tot_len based on skb->len. There is an inherent trust that the
> > > > > user's of the gro API set the length correctly. If it is not, the
> > > > > payload to userspace would ultimately be non-sense and hence detectable.
> > > >
> > > > Only if you use some kind of upper protocol adding message integrity
> > > > verification.
> > > >
> > > > > >
> > > > > > As you said, user space sniffing packets now have to guess what is the
> > > > > > intent, instead of headers carrying all the needed information
> > > > > > that can be fully validated by parsers.
> > > > >
> > > > > This is a solveable problem within the packet socket API, and the entire
> > > > > thing is opt-in. If a user's tcpdump / packet capture program is out of
> > > > > date and does not support the new API for large packets, then that user
> > > > > does not have to enable large GRO/TSO.
> > > >
> > > > I do not see this being solved yet.
> > > I think it's common that we add a feature that is disabled by
> > > default in the kernel if the userspace might not support it.
> >
> > One critical feature for us is the ability to restrict packet capture
> > to the headers only.
> >
> > Privacy is a key requirement.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=3e5289d5e3f98b7b5b8cac32e9e5a7004c067436
> >
> > So please make sure that packet captures truncated to headers will
> > still be understood by tcpdump
> IIUC we can try to adjust iph->tot_len to 65535 or at least the length
> of all headers for IPv4 BIG TCP packets on the PACKET socket rcv
> path? (or even truncate the ipv4 BIG TCP packets there.). This way
> tcpdump will be able to parse all the headers.
>
> But what if some applications want all the raw data via PACKET sockets?
> Maybe adjust iph->tot_len only, not truncate the packet?
>
I think that IPv6 BIG TCP has a similar problem, below is the tcpdump in
my env (RHEL-8), and it breaks too:
19:43:59.964272 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
19:43:59.964282 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
19:43:59.964292 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
19:43:59.964300 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
19:43:59.964308 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
it doesn't show what we want from the TCP header either.
For the latest tcpdump on upstream, it can display headers well for
IPv6 BIG TCP. But we can't expect all systems to use the tcpdump
that supports HBH parsing.
For IPv4 BIG TCP, it's just a CFLAGS change to support it in "tcpdump,"
and 'tshark' even supports it by default.
I think we should NOT go with "adjust tot_len" or "truncate packets" way,
and it makes more sense to make it supported in "tcpdump" by default,
just like in "tshark". I believe commit [1] was added for some problems
they've met, we should enable it for both.
[1] https://github.com/the-tcpdump-group/tcpdump/commit/c8623960f050cb81c12b31107070021f27f14b18
Thanks.
Powered by blists - more mailing lists