Message-ID: <CADvbK_f7Hq_uQP0dh6i1O6wAK4MMYx4FGCmGBqEoPWqO2sPpyQ@mail.gmail.com>
Date: Thu, 19 Jan 2023 13:57:39 -0500
From: Xin Long <lucien.xin@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: David Ahern <dsahern@...il.com>,
network dev <netdev@...r.kernel.org>, davem@...emloft.net,
kuba@...nel.org, Paolo Abeni <pabeni@...hat.com>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Pravin B Shelar <pshelar@....org>,
Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>,
Pablo Neira Ayuso <pablo@...filter.org>,
Florian Westphal <fw@...len.de>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Ilya Maximets <i.maximets@....org>,
Aaron Conole <aconole@...hat.com>,
Roopa Prabhu <roopa@...dia.com>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Mahesh Bandewar <maheshb@...gle.com>,
Paul Moore <paul@...l-moore.com>,
Guillaume Nault <gnault@...hat.com>
Subject: Re: [PATCH net-next 09/10] netfilter: get ipv6 pktlen properly in length_mt6
On Thu, Jan 19, 2023 at 1:10 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Thu, Jan 19, 2023 at 5:51 PM Xin Long <lucien.xin@...il.com> wrote:
> >
> > On Thu, Jan 19, 2023 at 10:41 AM David Ahern <dsahern@...il.com> wrote:
> > >
> > > On 1/18/23 8:13 PM, Eric Dumazet wrote:
> > > > On Thu, Jan 19, 2023 at 2:19 AM Xin Long <lucien.xin@...il.com> wrote:
> > > >
> > > >> I think that IPv6 BIG TCP has a similar problem, below is the tcpdump in
> > > >> my env (RHEL-8), and it breaks too:
> > > >>
> > > >> 19:43:59.964272 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
> > > >> 19:43:59.964282 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
> > > >> 19:43:59.964292 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
> > > >> 19:43:59.964300 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
> > > >> 19:43:59.964308 IP6 2001:db8:1::1 > 2001:db8:2::1: [|HBH]
> > > >>
> > > >
> > > > Please make sure to use a not too old tcpdump.
> > > >
> > > >> it doesn't show what we want from the TCP header either.
> > > >>
> > > >> For the latest tcpdump on upstream, it can display headers well for
> > > >> IPv6 BIG TCP. But we can't expect all systems to use the tcpdump
> > > >> that supports HBH parsing.
> > > >
> > > > User error. If an admin wants to diagnose TCP potential issues, it should use
> > > > a correct version.
> > >
> > > Both of those just fall under "if you want a new feature, update your
> > > tools."
> > >
> > >
> > > >
> > > >>
> > > >> For IPv4 BIG TCP, it's just a CFLAGS change to support it in "tcpdump,"
> > > >> and 'tshark' even supports it by default.
> > > >
> > > > Not with privacy _requirements_, where only the headers are captured.
> > > >
> > > > I am keeping a NACK, until you make sure you do not break this
> > > > important feature.
> > >
> > > I think the request here is to keep the snaplen in place (e.g., to make
> > > only headers visible to userspace) while also returning the >64kB packet
> > > length as meta data.
> > >
> > > My last pass on the packet socket code suggests this is possible;
> > > someone (Xin) needs to work through the details.
> > >
> > To be honest, I don't really like such a change in a packet socket,
> > I tried, and the code doesn't look nice.
> >
> > I'm thinking since skb->len is trustable, why don't we use
> > IP_MAX_MTU(0xFFFF) as iph->tot_len for IPv4 BIG TCP?
> > namely, only change these 2 helpers to:
> >
> > static inline unsigned int iph_totlen(const struct sk_buff *skb,
> >                                       const struct iphdr *iph)
> > {
> >         u16 len = ntohs(iph->tot_len);
> >
> >         return (len < IP_MAX_MTU || !skb_is_gso_tcp(skb)) ? len :
> >                skb->len - skb_network_offset(skb);
> > }
> >
> > static inline void iph_set_totlen(struct iphdr *iph, unsigned int len)
> > {
> >         iph->tot_len = len < IP_MAX_MTU ? htons(len) : htons(IP_MAX_MTU);
> > }
> > What do you think?
>
> I think this is a no go for me.
>
> I think I stated clearly what was the problem.
> If you care about TCP diagnostics, you want the truth, not truncated
> sequence ranges,
> making it impossible to know if a packet was sent.
Sorry Eric, if I misunderstood you.
With the new helpers, iph->tot_len will be set to IP_MAX_MTU (65535),
so all TCP headers display correctly, with no truncated sequence ranges:
# ip net exec router tcpdump -i link1
13:36:46.675522 IP 198.51.100.1.42289 > 203.0.113.1.45103: Flags [P.],
seq 1532642515:1532707998, ack 1, win 504, options [nop,nop,TS val
2975547125 ecr 2379476018], length 65483
13:36:46.675534 IP 198.51.100.1.42289 > 203.0.113.1.45103: Flags [P.],
seq 1532769005:1532834488, ack 1, win 504, options [nop,nop,TS val
2975547125 ecr 2379476018], length 65483
13:36:46.675542 IP 198.51.100.1.42289 > 203.0.113.1.45103: Flags [P.],
seq 1532895495:1532960978, ack 1, win 504, options [nop,nop,TS val
2975547125 ecr 2379476018], length 65483
13:36:46.675550 IP 198.51.100.1.42289 > 203.0.113.1.45103: Flags [P.],
seq 1533021985:1533087468, ack 1, win 504, options [nop,nop,TS val
2975547125 ecr 2379476018], length 65483
I just don't want the packet socket to rewrite iph->tot_len in the
IPv4 header of the raw packet data.
We're trying to avoid an iph->tot_len that is too small for tcpdump to
display the tcphdr of IPv4 BIG TCP packets, aren't we?
That's why I think using IP_MAX_MTU avoids this.
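To make the intended behaviour of the two helpers concrete, here is a
minimal userspace model of them. It is only a sketch: struct model_pkt,
MODEL_IP_MAX_MTU, model_set_totlen() and model_totlen() are hypothetical
stand-ins for struct sk_buff/struct iphdr, IP_MAX_MTU, iph_set_totlen()
and iph_totlen(), and tot_len is kept in host byte order, so the
htons()/ntohs() conversions are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors IP_MAX_MTU (0xFFFF) as used in the proposal above. */
#define MODEL_IP_MAX_MTU 0xFFFFU

/* Hypothetical stand-in for the sk_buff/iphdr fields the helpers read. */
struct model_pkt {
	uint32_t len;            /* models skb->len */
	uint32_t network_offset; /* models skb_network_offset(skb) */
	bool     gso_tcp;        /* models skb_is_gso_tcp(skb) */
	uint16_t tot_len;        /* models iph->tot_len, host byte order */
};

/* Store the IP length, clamping anything that does not fit in 16 bits. */
static inline void model_set_totlen(struct model_pkt *p, uint32_t len)
{
	p->tot_len = len < MODEL_IP_MAX_MTU ? (uint16_t)len
					    : (uint16_t)MODEL_IP_MAX_MTU;
}

/*
 * Read the IP length back. A clamped value on a GSO TCP packet means
 * "look at the buffer length instead"; everything else trusts tot_len.
 */
static inline uint32_t model_totlen(const struct model_pkt *p)
{
	if (p->tot_len < MODEL_IP_MAX_MTU || !p->gso_tcp)
		return p->tot_len;
	return p->len - p->network_offset;
}
```

In this model a 100000-byte GSO TCP packet carries tot_len 65535 in its
header, while model_totlen() still returns 100000 from the buffer
length. A non-GSO packet always reports its header value.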
>
> Without headers describing precisely payload length (solution taken in
> IPv6 BIG TCP),
> you have to augment AF_PACKET to provide this information in
> additional meta-data.
For this, I agree to provide the >64kB packet length as additional meta-data.
Thanks.