Message-ID: <CAHsH6GvoDr5qOKsvvuShfHFi4CsCfaC-pUbxTE6OfYWhgTf9bg@mail.gmail.com>
Date: Tue, 19 Apr 2022 21:14:38 +0300
From: Eyal Birger <eyal.birger@...il.com>
To: Marcelo Ricardo Leitner <mleitner@...hat.com>
Cc: Hangbin Liu <liuhangbin@...il.com>, netdev@...r.kernel.org,
jhs@...atatu.com, xiyou.wangcong@...il.com, jiri@...nulli.us,
davem@...emloft.net, kuba@...nel.org, ahleihel@...hat.com,
dcaratti@...hat.com, aconole@...hat.com, roid@...dia.com,
Shmulik Ladkani <shmulik.ladkani@...il.com>
Subject: Re: [PATCH net] net: sched: act_mirred: Reset ct info when
mirror/redirect skb
Hi,
On Tue, Apr 19, 2022 at 8:26 PM Marcelo Ricardo Leitner
<mleitner@...hat.com> wrote:
>
> Hi,
>
> On Tue, Apr 19, 2022 at 07:50:38PM +0300, Eyal Birger wrote:
> > Hi,
> >
> > On Mon, Aug 9, 2021 at 1:29 PM <patchwork-bot+netdevbpf@...nel.org> wrote:
> > >
> > > Hello:
> > >
> > > This patch was applied to netdev/net.git (refs/heads/master):
> > >
> > > On Mon, 9 Aug 2021 15:04:55 +0800 you wrote:
> > > > When mirror/redirect a skb to a different port, the ct info should be reset
> > > > for reclassification. Or the pkts will match unexpected rules. For example,
> > > > with following topology and commands:
> > > >
> > > > -----------
> > > > |
> > > > veth0 -+-------
> > > > |
> > > > veth1 -+-------
> > > > |
> > > >
> > > > [...]
> > >
> > > Here is the summary with links:
> > > - [net] net: sched: act_mirred: Reset ct info when mirror/redirect skb
> > > https://git.kernel.org/netdev/net/c/d09c548dbf3b
> >
> > Unfortunately this commit breaks DNAT when performed before going via mirred
> > egress->ingress.
> >
> > The reason is that connection tracking is lost and therefore a new state
> > is created on ingress.
> >
> > This breaks existing setups.
> >
> > See below a simplified script reproducing this issue.
>
> I guess I can understand why the reproducer triggers it, but I fail to
> see the actual use case you have behind it. Can you please elaborate
> on it?
One use case for the mirred egress->ingress redirect is rerouting a packet
after applying a change that affects its routing. For example, consider a bpf
program running on tc ingress (after mirred) that sets skb->mark based on some
criteria.

So you have something like:

  packet routed to dummy device based on some criteria ->
  mirred redirect to ingress ->
  classification by ebpf logic at tc ingress ->
  packet routed again

We have a setup where DNAT is performed before this flow. In that case the
ebpf logic needs to see the packet after the NAT.
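
The flow described above could be wired up roughly as follows. This is only a
sketch under stated assumptions, not the actual configuration from this setup:
the object file mark_by_dst.o, its section name "classifier", the run helper,
the APPLY variable and the setup_reroute function are all hypothetical names
introduced for illustration. The device dum0 and the mirred filter mirror the
quoted reproduction script.

```shell
#!/bin/bash
# Sketch only: names marked "hypothetical" do not come from the thread.
DEV=dum0                 # dummy device, as in the reproduction script
BPF_OBJ=mark_by_dst.o    # hypothetical compiled eBPF object that sets skb->mark
BPF_SEC=classifier       # hypothetical ELF section holding the program

# Print each command by default; set APPLY=1 (as root) to execute it instead.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

setup_reroute() {
    # clsact provides both an ingress and an egress hook on the device.
    run tc qdisc add dev "$DEV" clsact
    # Step 1: packets routed out of the dummy device are redirected back
    # to its own ingress path.
    run tc filter add dev "$DEV" egress prio 50 basic action mirred ingress redirect dev "$DEV"
    # Step 2: the eBPF program classifies the packet at tc ingress (for
    # example by setting skb->mark) before the packet is routed again.
    run tc filter add dev "$DEV" ingress prio 50 bpf da obj "$BPF_OBJ" sec "$BPF_SEC"
}

setup_reroute
```

Run without APPLY, the script only prints the three tc commands so they can be
reviewed first. Policy routing that acts on the mark (for example an
"ip rule ... fwmark" entry) would still have to be added separately.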
Eyal.
>
> >
> > Therefore I suggest this commit be reverted and a knob is introduced to mirred
> > for clearing ct as needed.
> >
> > Eyal.
> >
> > Reproduction script:
> >
> > #!/bin/bash
> >
> > ip netns add a
> > ip netns add b
> >
> > ip netns exec a sysctl -w net.ipv4.conf.all.forwarding=1
> > ip netns exec a sysctl -w net.ipv4.conf.all.accept_local=1
> >
> > ip link add veth0 netns a type veth peer name veth0 netns b
> > ip -net a link set veth0 up
> > ip -net a addr add dev veth0 198.51.100.1/30
> >
> > ip -net a link add dum0 type dummy
> > ip -net a link set dev dum0 up
> > ip -net a addr add dev dum0 198.51.100.2/32
> >
> > ip netns exec a iptables -t nat -I OUTPUT -d 10.0.0.1 -j DNAT \
> >     --to-destination 10.0.0.2
> > ip -net a route add default dev dum0
> > ip -net a rule add pref 50 iif dum0 lookup 1000
> > ip -net a route add table 1000 default dev veth0
> >
> > ip netns exec a tc qdisc add dev dum0 clsact
> > ip netns exec a tc filter add dev dum0 parent ffff:fff3 prio 50 basic \
> >     action mirred ingress redirect dev dum0
> >
> > ip -net b link set veth0 up
> > ip -net b addr add 10.0.0.2/32 dev veth0
> > ip -net b addr add 198.51.100.3/30 dev veth0
> >
> > ip netns exec a ping 10.0.0.1
> > >
> > > You are awesome, thank you!
> > > --
> > > Deet-doot-dot, I am a bot.
> > > https://korg.docs.kernel.org/patchwork/pwbot.html
> > >
> > >
> >
>