Message-ID: <4f805210-8f70-01dc-c0e9-4e573875eeab@linogate.de>
Date: Thu, 26 Jan 2023 16:05:52 +0100
From: Wolfgang Nothdurft <wolfgang@...ogate.de>
To: Florian Westphal <fw@...len.de>
Cc: steffen.klassert@...unet.com, netdev@...r.kernel.org
Subject: Re: [PATCH net] xfrm: remove inherited bridge info from skb
On 26.01.23 at 14:55, Florian Westphal wrote:
> wolfgang@...ogate.de <wolfgang@...ogate.de> wrote:
>> From: Wolfgang Nothdurft <wolfgang@...ogate.de>
>>
>> When using an xfrm interface in a bridged setup (the outgoing device is
>> bridged), the incoming packets in the xfrm interface inherit the bridge
>> info, which confuses the netfilter connection tracking.
>>
>> brctl show
>> bridge name bridge id STP enabled interfaces
>> br_eth1 8000.000c29fe9646 no eth1
>>
>> This messes up the connection tracking so that only the outgoing packets
>> show up and the connections through the xfrm interface are UNREPLIED.
>> When using stateful netfilter rules, the response packet will be blocked
>> as state invalid.
>
> How does that mess up connection tracking?
> Can you explain further?
By "messed up" I meant that the reply packets were not showing up in
connection tracking. I'm not that deep into the netfilter code, so my
guess was that the packets could not be matched to the connection
because of the additional bridge info.
I had this problem before with the KLIPS ipsec interface from the
Libreswan project. It appeared with the change from kernel 4.4 to 5.10,
probably due to the connection tracking support for bridge:
https://lwn.net/Articles/787195/.
I used the same workaround here.
>
>> telnet 192.168.12.1 7
>> Trying 192.168.12.1...
>>
>> conntrack -L
>> tcp 6 115 SYN_SENT src=192.168.11.1 dst=192.168.12.1 sport=52476
>> dport=7 packets=2 bytes=104 [UNREPLIED] src=192.168.12.1
>> dst=192.168.11.1 sport=7 dport=52476 packets=0 bytes=0 mark=0
>> secctx=system_u:object_r:unlabeled_t:s0 use=1
>>
>> Chain INPUT (policy DROP 0 packets, 0 bytes)
>> 2 104 DROP_invalid all -- * * 0.0.0.0/0 0.0.0.0/0 state INVALID
>>
>> Jan 26 09:28:12 defendo kernel: fw-chk drop [STATE=invalid] IN=ipsec0
>> OUT= PHYSIN=eth1 MAC= SRC=192.168.12.1 DST=192.168.11.1 LEN=52 TOS=0x00
>> PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=7 DPT=52476 WINDOW=64240 RES=0x00
>> ACK SYN URGP=0 MARK=0x1000000
>
> So it looks like for some reason reply packets are not passed to
> conntrack.
>
>> This patch removes the bridge info from the incoming packets on the xfrm
>> interface, so the packet can be properly assigned to the connection.
>
> To me it looks like this is papering over the real problem, whatever
> that is.
>
>> + /* strip bridge info from skb */
>> + if (skb_ext_exist(skb, SKB_EXT_BRIDGE_NF))
>> + skb_ext_del(skb, SKB_EXT_BRIDGE_NF);
>
> skb_ext_del(skb, SKB_EXT_BRIDGE_NF) would be enough, no need for a
> conditional, but this only builds with CONFIG_BRIDGE_NETFILTER=y.
>
> Does this work too?
Yes, this works too.
> diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
> index f20f4373ff40..9554abcfd5b4 100644
> --- a/net/bridge/br_netfilter_hooks.c
> +++ b/net/bridge/br_netfilter_hooks.c
> @@ -871,6 +871,7 @@ static unsigned int ip_sabotage_in(void *priv,
> if (nf_bridge && !nf_bridge->in_prerouting &&
> !netif_is_l3_master(skb->dev) &&
> !netif_is_l3_slave(skb->dev)) {
> + nf_bridge_info_free(skb);
> state->okfn(state->net, state->sk, skb);
> return NF_STOLEN;
> }
>
>
> This is based on the following guess:
>
> 1. br_netfilter is on, so when the first (encrypted) packet is received
> on eth1, conntrack is called from br_netfilter, which allocates
> nf_bridge info for the skb.
> 2. Packet is for local machine, so passed on to ip stack from bridge
> 3. Packet passes through ip prerouting a second time, but br_netfilter
> ip_sabotage_in suppresses the re-invocation of the hooks
> 4. Packet passes to xfrm for decryption
> 5. Packet appears on network stack again, this time after decryption
> 6. ip_sabotage_in prevents re-invocation of netfilter hooks because
> packet allegedly already passed them once (from br_netfilter).
> But the br_netfilter packet seen was before decryption, so conntrack
> never saw the syn-ack.
>
> I think the correct solution is to disable ip_sabotage_in() after it
> suppressed the call once.
>
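The guessed sequence in steps 1 to 6 can be modelled with a minimal userspace sketch. This is a toy, not kernel code: toy_skb, toy_nf_bridge, toy_sabotage_in(), toy_ip_prerouting() and toy_run() are hypothetical stand-ins for the skb, its bridge extension, and the IP prerouting path. It assumes the guess above is right, and only shows why freeing the bridge info after the first suppression would let conntrack see the decrypted packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the nf_bridge info attached to an skb. */
struct toy_nf_bridge {
	bool in_prerouting;
};

/* Hypothetical stand-in for an skb; not the real struct sk_buff. */
struct toy_skb {
	struct toy_nf_bridge *nf_bridge;	/* models the bridge extension */
	bool decrypted;				/* set once xfrm has decrypted it */
	int plain_seen;				/* times conntrack saw the decrypted packet */
};

/* Models the check in ip_sabotage_in(): true means the hooks are skipped. */
static bool toy_sabotage_in(struct toy_skb *skb, bool free_info)
{
	if (skb->nf_bridge && !skb->nf_bridge->in_prerouting) {
		if (free_info) {
			/* Models the proposed one-line change. */
			free(skb->nf_bridge);
			skb->nf_bridge = NULL;
		}
		return true;
	}
	return false;
}

/* Models IP prerouting: conntrack only runs if the hooks are not skipped. */
static void toy_ip_prerouting(struct toy_skb *skb, bool free_info)
{
	if (toy_sabotage_in(skb, free_info))
		return;
	if (skb->decrypted)
		skb->plain_seen++;
}

/* Walks one packet through steps 1 to 6 and reports conntrack visibility. */
static int toy_run(bool free_info)
{
	struct toy_skb skb = { 0 };

	/* Step 1: br_netfilter allocates the bridge info (in_prerouting = false). */
	skb.nf_bridge = calloc(1, sizeof(*skb.nf_bridge));
	assert(skb.nf_bridge != NULL);

	/* Steps 2 and 3: passed to the IP stack, hooks suppressed once. */
	toy_ip_prerouting(&skb, free_info);

	/* Step 4: xfrm decrypts the packet. */
	skb.decrypted = true;

	/* Steps 5 and 6: the decrypted packet re-enters the stack. */
	toy_ip_prerouting(&skb, free_info);

	free(skb.nf_bridge);	/* free(NULL) is a no-op if already released */
	return skb.plain_seen;
}
```

In this model, toy_run(false) returns 0 (the decrypted packet never reaches conntrack, matching the UNREPLIED entry), while toy_run(true) returns 1.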