Message-ID: <175dd850-49c8-f4a9-bd9e-61b8b7482ed4@mellanox.com>
Date: Sun, 30 Jun 2019 08:43:00 +0000
From: Paul Blakey <paulb@...lanox.com>
To: Cong Wang <xiyou.wangcong@...il.com>
CC: Jiri Pirko <jiri@...lanox.com>, Roi Dayan <roid@...lanox.com>,
Yossi Kuperman <yossiku@...lanox.com>,
Oz Shlomo <ozsh@...lanox.com>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Aaron Conole <aconole@...hat.com>,
Zhike Wang <wangzhike@...com>,
Rony Efraim <ronye@...lanox.com>,
"nst-kernel@...hat.com" <nst-kernel@...hat.com>,
John Hurley <john.hurley@...ronome.com>,
Simon Horman <simon.horman@...ronome.com>,
Justin Pettit <jpettit@....org>
Subject: Re: [PATCH net-next v2 0/4] net/sched: Introduce tc connection tracking
On 6/24/2019 8:59 PM, Cong Wang wrote:
> On Thu, Jun 20, 2019 at 6:43 AM Paul Blakey <paulb@...lanox.com> wrote:
>> Hi,
>>
>> This patch series add connection tracking capabilities in tc sw datapath.
>> It does so via a new tc action, called act_ct, and new tc flower classifier matching
>> on conntrack state, mark and label.
> Thanks for more detailed description here.
>
> I still don't see why we have to do this in L2, mind to be more specific?
tc is a complete datapath, and does its routing/manipulation before
the kernel stack (here the hooks are on the device ingress qdisc).
For example, take this simple namespace setup:
#setup 2 reps
sudo ip netns add ns0
sudo ip netns add ns1
sudo ip link add vm type veth peer name vm_rep
sudo ip link add vm2 type veth peer name vm2_rep
sudo ip link set vm netns ns0
sudo ip link set vm2 netns ns1
sudo ip netns exec ns0 ifconfig vm 3.3.3.3/24 up
sudo ip netns exec ns1 ifconfig vm2 3.3.3.4/24 up
sudo ifconfig vm_rep up
sudo ifconfig vm2_rep up
sudo tc qdisc add dev vm_rep ingress
sudo tc qdisc add dev vm2_rep ingress
#outbound
sudo tc filter add dev vm_rep ingress proto ip chain 0 prio 1 flower \
    ct_state -trk action mirred egress redirect dev vm2_rep
sudo tc filter add dev vm_rep ingress proto ip chain 1 prio 1 flower \
    ct_state +trk+new action ct commit pipe action mirred egress redirect \
    dev vm2_rep
sudo tc filter add dev vm_rep ingress proto ip chain 1 prio 1 flower \
    ct_state +trk+est action mirred egress redirect dev vm2_rep
#inbound
sudo tc filter add dev vm2_rep ingress proto ip chain 0 prio 1 flower \
    ct_state -trk action mirred egress redirect dev vm_rep
sudo tc filter add dev vm2_rep ingress proto ip chain 1 prio 1 flower \
    ct_state +trk+est action mirred egress redirect dev vm_rep
#handle arps
sudo tc filter add dev vm2_rep ingress proto arp chain 0 prio 2 flower \
    action mirred egress redirect dev vm_rep
sudo tc filter add dev vm_rep ingress proto arp chain 0 prio 2 flower \
    action mirred egress redirect dev vm2_rep
#run traffic
sudo timeout 20 ip netns exec ns1 iperf -s&
sudo ip netns exec ns0 iperf -c 3.3.3.4 -t 10
The traffic is handled in the tc datapath layer, and the user here
decides how to route the packets.
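As a rough sketch of how one could check that the rules above are
actually hit, and then tear the setup down: the helper names and the
DRY_RUN switch below are illustrative only and not part of the patch
set. The conntrack command assumes conntrack-tools is installed and
that the entries land in the default zone of the host namespace, which
is an assumption about this setup.

```shell
# Helper: print the command when DRY_RUN is set, otherwise run it with sudo.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        sudo "$@"
    fi
}

# Show per-filter statistics for chains 0 and 1 on each representor.
show_tc_stats() {
    for dev in vm_rep vm2_rep; do
        for chain in 0 1; do
            run tc -s filter show dev "$dev" ingress chain "$chain"
        done
    done
}

# List conntrack entries whose original destination is the iperf server.
show_ct_entries() {
    run conntrack -L -d 3.3.3.4
}

# Delete the namespaces; the veth pairs (and their qdiscs/filters) go
# away with them once no process is still running inside.
cleanup() {
    run ip netns del ns0
    run ip netns del ns1
}
```

Running show_tc_stats while iperf is active should show the counters on
the matching filters increasing; DRY_RUN=1 show_tc_stats only prints the
commands.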
In a real-world example, we are going to use it with SR-IOV, where the
tc rules are on the representors and the vm/vm2 devices above are
SR-IOV VFs attached to VMs. We also don't want to send every packet to
conntrack, just those that we want, and we might manipulate the packet
before sending it to conntrack, such as with the tc pedit action in a
router setup (changing MACs, IPs).
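A minimal sketch of that idea, assuming an iproute2 build with the ct
action and flower ct_state support from this series: only untracked TCP
packets are sent to conntrack, and the destination MAC is rewritten with
pedit first. The function name, the DRY_RUN switch, and the MAC address
are hypothetical and only for illustration.

```shell
# Helper: print the command when DRY_RUN is set, otherwise run it with sudo.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        sudo "$@"
    fi
}

# Send only untracked TCP packets to conntrack, rewriting the
# destination MAC beforehand, then continue matching in chain 1.
# $1: ingress device, $2: next-hop MAC (hypothetical value)
add_selective_ct_rule() {
    run tc filter add dev "$1" ingress proto ip chain 0 prio 1 flower \
        ip_proto tcp ct_state -trk \
        action pedit ex munge eth dst set "$2" pipe \
        action ct pipe \
        action goto chain 1
}
```

For example, add_selective_ct_rule vm_rep 11:22:33:44:55:66 would
install the rule on vm_rep. Other traffic would need its own rule in
chain 0, and a rule that rewrites IP addresses would also need a
checksum fixup action.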
>
> IOW, if you really want to manipulate conntrack info and use it for
> matching, why not do it in netfilter layer as it is where conntrack is?
>
> BTW, if the cls_flower ct_state matching is not in upstream yet, please
> try to push it first, as it is a justification of this patchset.
>
> Thanks.
It's patch 3/4 of this patch set; I can move it to be first.