Message-ID: <f7tk17wyj25.fsf@dhcp-25.97.bos.redhat.com>
Date: Mon, 18 Nov 2019 16:24:18 -0500
From: Aaron Conole <aconole@...hat.com>
To: Paul Blakey <paulb@...lanox.com>
Cc: Roi Dayan <roid@...lanox.com>,
"netdev\@vger.kernel.org" <netdev@...r.kernel.org>,
Pravin B Shelar <pshelar@....org>,
"David S . Miller" <davem@...emloft.net>,
Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>,
"dev\@openvswitch.org" <dev@...nvswitch.org>,
"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net 2/2] act_ct: support asymmetric conntrack
Paul Blakey <paulb@...lanox.com> writes:
> On 11/14/2019 4:22 PM, Roi Dayan wrote:
>>
>> On 2019-11-08 11:07 PM, Aaron Conole wrote:
>>> The act_ct TC module shares a common conntrack and NAT infrastructure
>>> exposed via netfilter. It's possible that a packet needs both SNAT and
>>> DNAT manipulation, due to e.g. tuple collision. Netfilter can support
>>> this because it runs through the NAT table twice - once on ingress and
>>> again after egress. The act_ct action doesn't have such capability.
>>>
>>> Like netfilter hook infrastructure, we should run through NAT twice to
>>> keep the symmetry.
>>>
>>> Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
>>>
>>> Signed-off-by: Aaron Conole <aconole@...hat.com>
>>> ---
>>> net/sched/act_ct.c | 13 ++++++++++++-
>>> 1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
>>> index fcc46025e790..f3232a00970f 100644
>>> --- a/net/sched/act_ct.c
>>> +++ b/net/sched/act_ct.c
>>> @@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>>>                            bool commit)
>>>  {
>>>  #if IS_ENABLED(CONFIG_NF_NAT)
>>> +        int err;
>>>          enum nf_nat_manip_type maniptype;
>>>
>>>          if (!(ct_action & TCA_CT_ACT_NAT))
>>> @@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>>>                  return NF_ACCEPT;
>>>          }
>>>
>>> -        return ct_nat_execute(skb, ct, ctinfo, range, maniptype);
>>> +        err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
>>> +        if (err == NF_ACCEPT &&
>>> +            ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) {
>>> +                if (maniptype == NF_NAT_MANIP_SRC)
>>> +                        maniptype = NF_NAT_MANIP_DST;
>>> +                else
>>> +                        maniptype = NF_NAT_MANIP_SRC;
>>> +
>>> +                err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
>>> +        }
>>> +        return err;
>>>  #else
>>>          return NF_ACCEPT;
>>>  #endif
>>>
>> +paul
>
> Hi Aaron,
>
> I think I understand the issue and this looks good,
>
> Can you describe the scenario to reproduce this?
It reproduces with OpenShift 3.10, which sends forward-direction packets
between namespaces through a tun device that applies NAT rules to
rewrite the destination. Limit the number of ephemeral ports available
in the client namespace by editing net.ipv4.ip_local_port_range, then
connect to the server namespace. That's the mechanism for OvS. But for
TC I guess there wouldn't be anything convenient available.
I'll try to script up something that doesn't use OpenShift.
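
Until there is a real reproducer, the control flow of the quoted patch
can at least be checked in userspace. What follows is only a rough
model, not kernel code: every model_* name and constant value is a
stand-in invented for illustration, and model_nat_execute() merely
records which manipulation was requested instead of rewriting a packet.
It shows that a second pass with the opposite manip type runs only when
the first pass is accepted and the connection carries both the SNAT and
DNAT status bits.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; names and values are illustrative, not the kernel's. */
enum model_manip { MODEL_MANIP_SRC, MODEL_MANIP_DST };
enum { MODEL_DROP = 0, MODEL_ACCEPT = 1 };
enum { MODEL_SRC_NAT = 1 << 0, MODEL_DST_NAT = 1 << 1 };

/* Records the order of NAT passes requested for one packet. */
struct model_log {
	enum model_manip calls[2];
	size_t ncalls;
};

/* Hypothetical stand-in for ct_nat_execute(): log the pass, always accept. */
static int model_nat_execute(struct model_log *log, enum model_manip manip)
{
	assert(log != NULL && log->ncalls < 2);
	log->calls[log->ncalls++] = manip;
	return MODEL_ACCEPT;
}

/* Mirrors the patched tail of tcf_ct_act_nat(): one pass, then the
 * opposite pass if the connection has both SNAT and DNAT set up. */
static int model_act_nat(struct model_log *log, unsigned int status,
			 enum model_manip first)
{
	int err = model_nat_execute(log, first);

	if (err == MODEL_ACCEPT &&
	    (status & MODEL_SRC_NAT) && (status & MODEL_DST_NAT)) {
		enum model_manip second = (first == MODEL_MANIP_SRC) ?
					  MODEL_MANIP_DST : MODEL_MANIP_SRC;

		err = model_nat_execute(log, second);
	}
	return err;
}
```

Calling model_act_nat() with both status bits set and MODEL_MANIP_DST
as the first pass logs DST followed by SRC; with only one bit set it
logs a single pass, which matches the behaviour before the patch.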
>
> Thanks,
>
> Paul.