[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM0EoMn4C-zwrTCGzKzuRYukxoqBa8tyHyFDwUSZYwkMOUJ4Lw@mail.gmail.com>
Date: Mon, 4 Dec 2023 15:24:09 -0500
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Davide Caratti <dcaratti@...hat.com>
Cc: Jiri Pirko <jiri@...nulli.us>, Xin Long <lucien.xin@...il.com>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>, wizhao@...hat.com,
Cong Wang <xiyou.wangcong@...il.com>, Florian Westphal <fw@...len.de>
Subject: Mirred broken WAS(Re: [PATCH net-next 2/2] act_mirred: use the
backlog for nested calls to mirred ingress
On Fri, Jan 20, 2023 at 12:02 PM Davide Caratti <dcaratti@...hat.com> wrote:
>
> William reports kernel soft-lockups on some OVS topologies when TC mirred
> egress->ingress action is hit by local TCP traffic [1].
> The same can also be reproduced with SCTP (thanks Xin for verifying), when
> client and server reach themselves through mirred egress to ingress, and
> one of the two peers sends a "heartbeat" packet (from within a timer).
>
> Enqueueing to backlog proved to fix this soft lockup; however, as Cong
> noticed [2], we should preserve - when possible - the current mirred
> behavior that counts as "overlimits" any eventual packet drop subsequent to
> the mirred forwarding action [3]. A compromise solution might use the
> backlog only when tcf_mirred_act() has a nest level greater than one:
> change tcf_mirred_forward() accordingly.
>
> Also, add a kselftest that can reproduce the lockup and verifies TC mirred
> ability to account for further packet drops after TC mirred egress->ingress
> (when the nest level is 1).
>
I am afraid this broke things. Here's a simple use case which causes
an infinite loop (that we found while testing blockcasting but
simplified to demonstrate the issue):
----
sudo ip netns add p4node
sudo ip link add p4port0 address 10:00:00:01:AA:BB type veth peer
port0 address 10:00:00:02:AA:BB
sudo ip link set dev port0 netns p4node
sudo ip a add 10.0.0.1/24 dev p4port0
sudo ip neigh add 10.0.0.2 dev p4port0 lladdr 10:00:00:02:aa:bb
sudo ip netns exec p4node ip a add 10.0.0.2/24 dev port0
sudo ip netns exec p4node ip l set dev port0 up
sudo ip l set dev p4port0 up
sudo ip netns exec p4node tc qdisc add dev port0 clsact
sudo ip netns exec p4node tc filter add dev port0 ingress protocol ip
prio 10 matchall action mirred ingress redirect dev port0
ping -I p4port0 10.0.0.2 -c 1
-----
reverting the patch fixes things and it gets caught by the nested
recursion check.
Frankly, I believe we should restore a proper ttl from what was removed here:
https://lore.kernel.org/all/1430765318-13788-1-git-send-email-fw@strlen.de/
The headaches(and time consumed) trying to save the 3-4 bits removing
the ttl field is not worth it imo.
cheers,
jamal
Powered by blists - more mailing lists