Message-ID: <aGPIT00m9THn8ABO@strlen.de>
Date: Tue, 1 Jul 2025 13:36:47 +0200
From: Florian Westphal <fw@...len.de>
To: Eric Woudstra <ericwouds@...il.com>
Cc: Pablo Neira Ayuso <pablo@...filter.org>,
Jozsef Kadlecsik <kadlec@...filter.org>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Ido Schimmel <idosch@...dia.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, netfilter-devel@...r.kernel.org,
bridge@...ts.linux.dev, netdev@...r.kernel.org
Subject: Re: [PATCH v12 nf-next 1/2] netfilter: bridge: Add conntrack double
vlan and pppoe
Eric Woudstra <ericwouds@...il.com> wrote:
> > Adding offset to skb->network_header during the call to
> > nf_conntrack_in() does not work, but, as you mentioned, adding the
> > offset through the nf_conntrack_inner() function, that does work. Except
> > for 1 piece of code, I found so far:
>
> A small correction: adding the offset to skb->network_header during the
> call to nf_conntrack_in() also works. Then skb->network_header can be
> restored after this call and nf_conntrack_inner() is not needed.
Good, that's even better.
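For illustration, the save/adjust/restore approach described in the quoted
text could look roughly like the userspace sketch below. struct fake_skb,
fake_conntrack_in() and call_with_inner_l3() are hypothetical stand-ins, not
kernel APIs; the real code would operate on struct sk_buff and
nf_conntrack_in().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff: only the fields needed here. */
struct fake_skb {
	const uint8_t *head;      /* start of the buffer */
	uint16_t network_header;  /* offset of the L3 header from head */
};

/* Hypothetical stand-in for nf_conntrack_in(): returns the IP version
 * nibble found at the current network header offset. */
static int fake_conntrack_in(const struct fake_skb *skb)
{
	assert(skb->head != NULL);
	return skb->head[skb->network_header] >> 4;
}

/* Temporarily advance network_header past the L2 encapsulation, run the
 * callee, then restore the original value before returning. */
static int call_with_inner_l3(struct fake_skb *skb, uint16_t encap_len)
{
	uint16_t saved = skb->network_header;
	int ret;

	skb->network_header += encap_len;
	ret = fake_conntrack_in(skb);
	skb->network_header = saved;

	return ret;
}
```

The caller sees the original network_header again once the call returns, so
nothing outside the wrapped call observes the adjusted offset.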
> > nf_checksum() reports an error when it is called from
> > nf_conntrack_tcp_packet(). It also uses ip_hdr(skb) and ipv6_hdr(skb).
> > Strangely, it only gives the error when dealing with a pppoe packet or
> > pppoe-in-q packet. There is no error when q-in-q (double q) or 802.1ad
> > are involved.
> >
> > Do you have any suggestion how you want to handle this failure in
> > nf_checksum()?
I suspect nf_checksum() assumes skb->data points to network header.
Several places in netfilter assume this, which is the reason for all the
skb pull/push kludges in br_netfilter_hooks.c :-/
git grep -- 'skb->data' net/netfilter net/*/netfilter | wc -l
66
(not all of those are going to be an issue, such as ipvs).
Some callers do this:
if (nf_ip_checksum(skb, hooknum, hdrlen, IPPROTO_ICMP))
where hdrlen is the size of the ipv4 header.
That won't do the right thing when skb->data isn't identical to the
start of the ipv4 header.
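To make the failure mode concrete, here is a small userspace sketch (not
kernel code). It assumes an 8-byte PPPoE session + PPP header sitting in
front of the ipv4 header; both helper names are hypothetical. The first
helper mirrors a caller that passes only hdrlen as the offset, the second
adds the distance from the data pointer to the ipv4 header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PPPOE_PPP_HLEN 8  /* 6-byte PPPoE session header + 2-byte PPP protocol */

/* Offset of the L4 header as computed by a caller that assumes the data
 * pointer sits on the ipv4 header: just the ipv4 header length (IHL * 4). */
static size_t l4_off_assuming_data_is_l3(const uint8_t *iph)
{
	return (size_t)(iph[0] & 0x0f) * 4;
}

/* Offset of the L4 header relative to the data pointer, valid as long as
 * the data pointer is at or before the ipv4 header. */
static size_t l4_off_from_data(const uint8_t *data, const uint8_t *iph)
{
	assert(iph >= data);
	return (size_t)(iph - data) + (size_t)(iph[0] & 0x0f) * 4;
}
```

With the data pointer on the PPPoE header and a 20-byte ipv4 header, the two
results are 20 and 28, so a checksum started at the first offset would cover
the wrong bytes. When the data pointer already sits on the ipv4 header, both
helpers agree.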
Others do this:
if (nf_ip_checksum(skb, nft_hook(pkt), thoff, IPPROTO_TCP)) {
... where thoff is set via nft_set_pktinfo_ipv4(), so it *might*
be correct if nft_do_chain_bridge() is updated to follow l2 encap
trail (switch nft_do_chain_bridge() to use the flow dissector?).
But in some places thoff comes from this:
thoff = ipv6_skip_exthdr(skb, ((u8*)(ip6h+1) - skb->data), &proto, &fo);
... which should have the right offset regardless of where skb->data points.
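The pointer-difference idiom in that call can be shown in isolation. The
sketch below is userspace C with a hypothetical fixed-size ipv6 header
struct and a hypothetical helper name; it only computes the offset of the
first byte after the fixed header and does not walk extension headers the
way ipv6_skip_exthdr() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 40-byte layout of the fixed ipv6 header (byte arrays only,
 * so there is no padding and no alignment requirement). */
struct fake_ipv6hdr {
	uint8_t ver_tc_flow[4];
	uint8_t payload_len[2];
	uint8_t nexthdr;
	uint8_t hop_limit;
	uint8_t saddr[16];
	uint8_t daddr[16];
};

/* Offset, relative to data, of the first byte following the fixed ipv6
 * header. Because it is derived from the header pointer itself, it stays
 * correct whether data points at the ipv6 header or at an outer header. */
static size_t off_after_ip6_fixed(const uint8_t *data,
				  const struct fake_ipv6hdr *ip6h)
{
	assert((const uint8_t *)ip6h >= data);
	return (size_t)((const uint8_t *)(ip6h + 1) - data);
}
```

If data sits 8 bytes before the ipv6 header the result is 48; if data sits
on the ipv6 header it is 40. Either way the offset lands on the same byte.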
So AFAICS the initial step has to be to go through conntrack (and all
conntrack helpers) and get rid of all 'skb->data is l3 header' assumptions.
Then repeat for nat engine, then for nf_tables, then for helpers such as
the nf checksum functions.
IPVS, ipset and xtables can be left as-is AFAICS as they will only see
packets coming from ip stack.