[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c986ed18-750f-46bc-9f52-5860d834162e@nvidia.com>
Date: Tue, 4 Mar 2025 09:20:17 +0800
From: Jianbo Liu <jianbol@...dia.com>
To: Xin Long <lucien.xin@...il.com>
CC: network dev <netdev@...r.kernel.org>, <dev@...nvswitch.org>,
<ovs-dev@...nvswitch.org>, <davem@...emloft.net>, <kuba@...nel.org>, "Eric
Dumazet" <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, "Pravin B
Shelar" <pshelar@....org>, Ilya Maximets <i.maximets@....org>, Aaron Conole
<aconole@...hat.com>, Florian Westphal <fw@...len.de>
Subject: Re: [PATCHv2 net-next] openvswitch: switch to per-action label
counting in conntrack
On 3/4/2025 12:42 AM, Xin Long wrote:
> On Sun, Mar 2, 2025 at 9:14 PM Jianbo Liu <jianbol@...dia.com> wrote:
>>
>>
>>
>> On 3/3/2025 2:22 AM, Xin Long wrote:
>>> On Tue, Feb 25, 2025 at 9:57 AM Xin Long <lucien.xin@...il.com> wrote:
>>>>
>>>> On Mon, Feb 24, 2025 at 8:38 PM Jianbo Liu <jianbol@...dia.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2/25/2025 3:55 AM, Xin Long wrote:
>>>>>> On Mon, Feb 24, 2025 at 4:01 AM Jianbo Liu <jianbol@...dia.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 8/13/2024 1:17 AM, Xin Long wrote:
>>>>>>>> Similar to commit 70f06c115bcc ("sched: act_ct: switch to per-action
>>>>>>>> label counting"), we should also switch to per-action label counting
>>>>>>>> in openvswitch conntrack, as Florian suggested.
>>>>>>>>
>>>>>>>> The difference is that nf_connlabels_get() is called unconditionally
>>>>>>>> when creating an ct action in ovs_ct_copy_action(). As with these
>>>>>>>> flows:
>>>>>>>>
>>>>>>>> table=0,ip,actions=ct(commit,table=1)
>>>>>>>> table=1,ip,actions=ct(commit,exec(set_field:0xac->ct_label),table=2)
>>>>>>>>
>>>>>>>> it needs to make sure the label ext is created in the 1st flow before
>>>>>>>> the ct is committed in ovs_ct_commit(). Otherwise, the warning in
>>>>>>>> nf_ct_ext_add() when creating the label ext in the 2nd flow will
>>>>>>>> be triggered:
>>>>>>>>
>>>>>>>> WARN_ON(nf_ct_is_confirmed(ct));
>>>>>>>>
>>>>>>>
>>>>>>> Hi Xin Long,
>>>>>>>
>>>>>>> The ct can be committed before openvswitch handles packets with CT
>>>>>>> actions. And we can trigger the warning by creating VF and running ping
>>>>>>> over it with the following configurations:
>>>>>>>
>>>>>>> ovs-vsctl add-br br
>>>>>>> ovs-vsctl add-port br eth2
>>>>>>> ovs-vsctl add-port br eth4
>>>>>>> ovs-ofctl add-flow br "table=0, in_port=eth4,ip,ct_state=-trk
>>>>>>> actions=ct(table=1)"
>>>>>>> ovs-ofctl add-flow br "table=1, in_port=eth4,ip,ct_state=+trk+new
>>>>>>> actions=ct(exec(set_field:0xef7d->ct_label), commit), output:eth2"
>>>>>>> ovs-ofctl add-flow br "table=1, in_port=eth4,ip,ct_label=0xef7d,
>>>>>>> ct_state=+trk+est actions=output:eth2"
>>>>>>>
>>>>>>> The eth2 is PF, and eth4 is VF's representor.
>>>>>>> Would you like to fix it?
>>>>>> Hi, Jianbo,
>>>>>>
>>>>>> Sure, we have to attach a new ct to the skb in __ovs_ct_lookup() for
>>>>>> this case, and even delete the one created before ovs_ct.
>>>>>>
>>>>>> Can you check if this works on your env?
>>>>>
>>>>> Yes, it works.
>>>>> Could you please submit the formal patch? Thanks!
>>>> Great, I will post after running some of my local tests.
>>>>
>>> Hi Jianbo,
>>>
>>> I recently ran some tests and observed that the current approach cannot
>>> completely avoid the warning. If an skb enters __ovs_ct_lookup() without
>>> an attached connection tracking (ct) entry, it may still acquire an
>>> existing ct created outside of OVS (possibly by netfilter) through
>>> nf_conntrack_in(). This will trigger the warning in ovs_ct_set_labels().
>>>
>>> Deleting a ct created outside OVS and creating a new one within
>>> __ovs_ct_lookup() doesn't seem reasonable and would be difficult to
>>
>> Yes, I'm also skeptical of your temporary fix, and waiting for your
>> formal one.
> Cool.
>
>>
>>> implement. However, since OVS is not supposed to use ct entries created
>>> externally, I believe ct zones can be used to prevent this issue.
>>> In your case, the following flows should work:
>>>
>>> ovs-ofctl add-flow br "table=0, in_port=eth4,ip,ct_state=-trk
>>> actions=ct(table=1,zone=1)"
>>> ovs-ofctl add-flow br "table=1,
>>> in_port=eth4,ip,ct_state=+trk+new,ct_zone=1
>>> actions=ct(exec(set_field:0xef7d->ct_label),commit,zone=1),
>>> output:eth2"
>>> ovs-ofctl add-flow br "table=1,
>>> in_port=eth4,ip,ct_label=0xef7d,ct_state=+trk+est,ct_zone=1
>>> actions=output:eth2"
>>>
>>> Regarding the warning triggered by externally created ct entries, I plan
>>> to remove the ovs_ct_get_conn_labels() call from ovs_ct_set_labels() and
>>> I'll let nf_connlabels_replace() return an error in such cases, similar
>>> to how tcf_ct_act_set_labels() handles this scenario in tc act_ct.
>>>
>>
>> It's a good idea to be consistent with act_ct implementation. But, would
>> you like to revert first if it takes long time to work on the fix?
> Sorry, revert which one?
Of course the one we are currently replying to - "openvswitch: switch to
per-action label counting in conntrack".
> If you mean the fix in skb_nfct_cached(), it hasn't been posted and
> will not be posted.
No. I think we both know this is just a temporary fix, and it can help
you to understand the issue.
>
> Thanks.
>>
>> Thanks!
>>
>>> Thanks.
>>>>>
>>>>>>
>>>>>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>>>>>> index 3bb4810234aa..c599ee013dfe 100644
>>>>>> --- a/net/openvswitch/conntrack.c
>>>>>> +++ b/net/openvswitch/conntrack.c
>>>>>> @@ -595,6 +595,11 @@ static bool skb_nfct_cached(struct net *net,
>>>>>> rcu_dereference(timeout_ext->timeout))
>>>>>> return false;
>>>>>> }
>>>>>> + if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) && !nf_ct_labels_find(ct)) {
>>>>>> + if (nf_ct_is_confirmed(ct))
>>>>>> + nf_ct_delete(ct, 0, 0);
>>>>>> + return false;
>>>>>> + }
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Jianbo
>>>>>>>
>>>>>>>> Signed-off-by: Xin Long <lucien.xin@...il.com>
>>>>>>>> ---
>>>>>>>> v2: move ovs_net into #if in ovs_ct_exit() as Jakub noticed.
>>>>>>>> ---
>>>>>>>> net/openvswitch/conntrack.c | 30 ++++++++++++------------------
>>>>>>>> net/openvswitch/datapath.h | 3 ---
>>>>>>>> 2 files changed, 12 insertions(+), 21 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>>>>>>>> index 8eb1d644b741..a3da5ee34f92 100644
>>>>>>>> --- a/net/openvswitch/conntrack.c
>>>>>>>> +++ b/net/openvswitch/conntrack.c
>>>>>>>> @@ -1368,11 +1368,8 @@ bool ovs_ct_verify(struct net *net, enum ovs_key_attr attr)
>>>>>>>> attr == OVS_KEY_ATTR_CT_MARK)
>>>>>>>> return true;
>>>>>>>> if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) &&
>>>>>>>> - attr == OVS_KEY_ATTR_CT_LABELS) {
>>>>>>>> - struct ovs_net *ovs_net = net_generic(net, ovs_net_id);
>>>>>>>> -
>>>>>>>> - return ovs_net->xt_label;
>>>>>>>> - }
>>>>>>>> + attr == OVS_KEY_ATTR_CT_LABELS)
>>>>>>>> + return true;
>>>>>>>>
>>>>>>>> return false;
>>>>>>>> }
>>>>>>>> @@ -1381,6 +1378,7 @@ int ovs_ct_copy_action(struct net *net, const struct nlattr *attr,
>>>>>>>> const struct sw_flow_key *key,
>>>>>>>> struct sw_flow_actions **sfa, bool log)
>>>>>>>> {
>>>>>>>> + unsigned int n_bits = sizeof(struct ovs_key_ct_labels) * BITS_PER_BYTE;
>>>>>>>> struct ovs_conntrack_info ct_info;
>>>>>>>> const char *helper = NULL;
>>>>>>>> u16 family;
>>>>>>>> @@ -1409,6 +1407,12 @@ int ovs_ct_copy_action(struct net *net, const struct nlattr *attr,
>>>>>>>> return -ENOMEM;
>>>>>>>> }
>>>>>>>>
>>>>>>>> + if (nf_connlabels_get(net, n_bits - 1)) {
>>>>>>>> + nf_ct_tmpl_free(ct_info.ct);
>>>>>>>> + OVS_NLERR(log, "Failed to set connlabel length");
>>>>>>>> + return -EOPNOTSUPP;
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> if (ct_info.timeout[0]) {
>>>>>>>> if (nf_ct_set_timeout(net, ct_info.ct, family, key->ip.proto,
>>>>>>>> ct_info.timeout))
>>>>>>>> @@ -1577,6 +1581,7 @@ static void __ovs_ct_free_action(struct ovs_conntrack_info *ct_info)
>>>>>>>> if (ct_info->ct) {
>>>>>>>> if (ct_info->timeout[0])
>>>>>>>> nf_ct_destroy_timeout(ct_info->ct);
>>>>>>>> + nf_connlabels_put(nf_ct_net(ct_info->ct));
>>>>>>>> nf_ct_tmpl_free(ct_info->ct);
>>>>>>>> }
>>>>>>>> }
>>>>>>>> @@ -2002,17 +2007,9 @@ struct genl_family dp_ct_limit_genl_family __ro_after_init = {
>>>>>>>>
>>>>>>>> int ovs_ct_init(struct net *net)
>>>>>>>> {
>>>>>>>> - unsigned int n_bits = sizeof(struct ovs_key_ct_labels) * BITS_PER_BYTE;
>>>>>>>> +#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
>>>>>>>> struct ovs_net *ovs_net = net_generic(net, ovs_net_id);
>>>>>>>>
>>>>>>>> - if (nf_connlabels_get(net, n_bits - 1)) {
>>>>>>>> - ovs_net->xt_label = false;
>>>>>>>> - OVS_NLERR(true, "Failed to set connlabel length");
>>>>>>>> - } else {
>>>>>>>> - ovs_net->xt_label = true;
>>>>>>>> - }
>>>>>>>> -
>>>>>>>> -#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
>>>>>>>> return ovs_ct_limit_init(net, ovs_net);
>>>>>>>> #else
>>>>>>>> return 0;
>>>>>>>> @@ -2021,12 +2018,9 @@ int ovs_ct_init(struct net *net)
>>>>>>>>
>>>>>>>> void ovs_ct_exit(struct net *net)
>>>>>>>> {
>>>>>>>> +#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
>>>>>>>> struct ovs_net *ovs_net = net_generic(net, ovs_net_id);
>>>>>>>>
>>>>>>>> -#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
>>>>>>>> ovs_ct_limit_exit(net, ovs_net);
>>>>>>>> #endif
>>>>>>>> -
>>>>>>>> - if (ovs_net->xt_label)
>>>>>>>> - nf_connlabels_put(net);
>>>>>>>> }
>>>>>>>> diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
>>>>>>>> index 9ca6231ea647..365b9bb7f546 100644
>>>>>>>> --- a/net/openvswitch/datapath.h
>>>>>>>> +++ b/net/openvswitch/datapath.h
>>>>>>>> @@ -160,9 +160,6 @@ struct ovs_net {
>>>>>>>> #if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
>>>>>>>> struct ovs_ct_limit_info *ct_limit_info;
>>>>>>>> #endif
>>>>>>>> -
>>>>>>>> - /* Module reference for configuring conntrack. */
>>>>>>>> - bool xt_label;
>>>>>>>> };
>>>>>>>>
>>>>>>>> /**
>>>>>>>
>>>>>
>>
Powered by blists - more mailing lists