[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CADvbK_fyWvEH2vPmS5nkwpGBmr+pT3anLM5EkwLr7KLpjuk0qQ@mail.gmail.com>
Date: Tue, 4 Mar 2025 12:10:14 -0500
From: Xin Long <lucien.xin@...il.com>
To: Jianbo Liu <jianbol@...dia.com>
Cc: network dev <netdev@...r.kernel.org>, dev@...nvswitch.org, ovs-dev@...nvswitch.org,
davem@...emloft.net, kuba@...nel.org, Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, Pravin B Shelar <pshelar@....org>, Ilya Maximets <i.maximets@....org>,
Aaron Conole <aconole@...hat.com>, Florian Westphal <fw@...len.de>
Subject: Re: [PATCHv2 net-next] openvswitch: switch to per-action label
counting in conntrack
On Mon, Mar 3, 2025 at 8:20 PM Jianbo Liu <jianbol@...dia.com> wrote:
>
>
>
> On 3/4/2025 12:42 AM, Xin Long wrote:
> > On Sun, Mar 2, 2025 at 9:14 PM Jianbo Liu <jianbol@...dia.com> wrote:
> >>
> >>
> >>
> >> On 3/3/2025 2:22 AM, Xin Long wrote:
> >>> On Tue, Feb 25, 2025 at 9:57 AM Xin Long <lucien.xin@...il.com> wrote:
> >>>>
> >>>> On Mon, Feb 24, 2025 at 8:38 PM Jianbo Liu <jianbol@...dia.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 2/25/2025 3:55 AM, Xin Long wrote:
> >>>>>> On Mon, Feb 24, 2025 at 4:01 AM Jianbo Liu <jianbol@...dia.com> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 8/13/2024 1:17 AM, Xin Long wrote:
> >>>>>>>> Similar to commit 70f06c115bcc ("sched: act_ct: switch to per-action
> >>>>>>>> label counting"), we should also switch to per-action label counting
> >>>>>>>> in openvswitch conntrack, as Florian suggested.
> >>>>>>>>
> >>>>>>>> The difference is that nf_connlabels_get() is called unconditionally
> >>>>>>>> when creating an ct action in ovs_ct_copy_action(). As with these
> >>>>>>>> flows:
> >>>>>>>>
> >>>>>>>> table=0,ip,actions=ct(commit,table=1)
> >>>>>>>> table=1,ip,actions=ct(commit,exec(set_field:0xac->ct_label),table=2)
> >>>>>>>>
> >>>>>>>> it needs to make sure the label ext is created in the 1st flow before
> >>>>>>>> the ct is committed in ovs_ct_commit(). Otherwise, the warning in
> >>>>>>>> nf_ct_ext_add() when creating the label ext in the 2nd flow will
> >>>>>>>> be triggered:
> >>>>>>>>
> >>>>>>>> WARN_ON(nf_ct_is_confirmed(ct));
> >>>>>>>>
> >>>>>>>
> >>>>>>> Hi Xin Long,
> >>>>>>>
> >>>>>>> The ct can be committed before openvswitch handles packets with CT
> >>>>>>> actions. And we can trigger the warning by creating VF and running ping
> >>>>>>> over it with the following configurations:
> >>>>>>>
> >>>>>>> ovs-vsctl add-br br
> >>>>>>> ovs-vsctl add-port br eth2
> >>>>>>> ovs-vsctl add-port br eth4
> >>>>>>> ovs-ofctl add-flow br "table=0, in_port=eth4,ip,ct_state=-trk
> >>>>>>> actions=ct(table=1)"
> >>>>>>> ovs-ofctl add-flow br "table=1, in_port=eth4,ip,ct_state=+trk+new
> >>>>>>> actions=ct(exec(set_field:0xef7d->ct_label), commit), output:eth2"
> >>>>>>> ovs-ofctl add-flow br "table=1, in_port=eth4,ip,ct_label=0xef7d,
> >>>>>>> ct_state=+trk+est actions=output:eth2"
> >>>>>>>
> >>>>>>> The eth2 is PF, and eth4 is VF's representor.
> >>>>>>> Would you like to fix it?
> >>>>>> Hi, Jianbo,
> >>>>>>
> >>>>>> Sure, we have to attach a new ct to the skb in __ovs_ct_lookup() for
> >>>>>> this case, and even delete the one created before ovs_ct.
> >>>>>>
> >>>>>> Can you check if this works on your env?
> >>>>>
> >>>>> Yes, it works.
> >>>>> Could you please submit the formal patch? Thanks!
> >>>> Great, I will post after running some of my local tests.
> >>>>
> >>> Hi Jianbo,
> >>>
> >>> I recently ran some tests and observed that the current approach cannot
> >>> completely avoid the warning. If an skb enters __ovs_ct_lookup() without
> >>> an attached connection tracking (ct) entry, it may still acquire an
> >>> existing ct created outside of OVS (possibly by netfilter) through
> >>> nf_conntrack_in(). This will trigger the warning in ovs_ct_set_labels().
> >>>
> >>> Deleting a ct created outside OVS and creating a new one within
> >>> __ovs_ct_lookup() doesn't seem reasonable and would be difficult to
> >>
> >> Yes, I'm also skeptical of your temporary fix, and waiting for your
> >> formal one.
> > Cool.
> >
> >>
> >>> implement. However, since OVS is not supposed to use ct entries created
> >>> externally, I believe ct zones can be used to prevent this issue.
> >>> In your case, the following flows should work:
> >>>
> >>> ovs-ofctl add-flow br "table=0, in_port=eth4,ip,ct_state=-trk
> >>> actions=ct(table=1,zone=1)"
> >>> ovs-ofctl add-flow br "table=1,
> >>> in_port=eth4,ip,ct_state=+trk+new,ct_zone=1
> >>> actions=ct(exec(set_field:0xef7d->ct_label),commit,zone=1),
> >>> output:eth2"
> >>> ovs-ofctl add-flow br "table=1,
> >>> in_port=eth4,ip,ct_label=0xef7d,ct_state=+trk+est,ct_zone=1
> >>> actions=output:eth2"
> >>>
> >>> Regarding the warning triggered by externally created ct entries, I plan
> >>> to remove the ovs_ct_get_conn_labels() call from ovs_ct_set_labels() and
> >>> I'll let nf_connlabels_replace() return an error in such cases, similar
> >>> to how tcf_ct_act_set_labels() handles this scenario in tc act_ct.
> >>>
> >>
> >> It's a good idea to be consistent with act_ct implementation. But, would
> >> you like to revert first if it takes long time to work on the fix?
> > Sorry, revert which one?
>
> Of course the one we are currently replying to - "openvswitch: switch to
> per-action label counting in conntrack".
>
It's true the commit "openvswitch: switch to per-action label counting
in conntrack"
makes the issue more likely to occur.
However, I've reproduced the warning in local tests even without the
commit, so the root cause appears to be independent of this change.
I will post the fix later today.
Thank you for your report!
> > If you mean the fix in skb_nfct_cached(), it hasn't been posted and
> > will not be posted.
>
> No. I think we both know this is just a temporary fix, and it can help
> you to understand the issue.
>
> >
> > Thanks.
> >>
> >> Thanks!
> >>
> >>> Thanks.
> >>>>>
> >>>>>>
> >>>>>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> >>>>>> index 3bb4810234aa..c599ee013dfe 100644
> >>>>>> --- a/net/openvswitch/conntrack.c
> >>>>>> +++ b/net/openvswitch/conntrack.c
> >>>>>> @@ -595,6 +595,11 @@ static bool skb_nfct_cached(struct net *net,
> >>>>>> rcu_dereference(timeout_ext->timeout))
> >>>>>> return false;
> >>>>>> }
> >>>>>> + if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) && !nf_ct_labels_find(ct)) {
> >>>>>> + if (nf_ct_is_confirmed(ct))
> >>>>>> + nf_ct_delete(ct, 0, 0);
> >>>>>> + return false;
> >>>>>> + }
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>> Jianbo
> >>>>>>>
> >>>>>>>> Signed-off-by: Xin Long <lucien.xin@...il.com>
> >>>>>>>> ---
> >>>>>>>> v2: move ovs_net into #if in ovs_ct_exit() as Jakub noticed.
> >>>>>>>> ---
> >>>>>>>> net/openvswitch/conntrack.c | 30 ++++++++++++------------------
> >>>>>>>> net/openvswitch/datapath.h | 3 ---
> >>>>>>>> 2 files changed, 12 insertions(+), 21 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> >>>>>>>> index 8eb1d644b741..a3da5ee34f92 100644
> >>>>>>>> --- a/net/openvswitch/conntrack.c
> >>>>>>>> +++ b/net/openvswitch/conntrack.c
> >>>>>>>> @@ -1368,11 +1368,8 @@ bool ovs_ct_verify(struct net *net, enum ovs_key_attr attr)
> >>>>>>>> attr == OVS_KEY_ATTR_CT_MARK)
> >>>>>>>> return true;
> >>>>>>>> if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) &&
> >>>>>>>> - attr == OVS_KEY_ATTR_CT_LABELS) {
> >>>>>>>> - struct ovs_net *ovs_net = net_generic(net, ovs_net_id);
> >>>>>>>> -
> >>>>>>>> - return ovs_net->xt_label;
> >>>>>>>> - }
> >>>>>>>> + attr == OVS_KEY_ATTR_CT_LABELS)
> >>>>>>>> + return true;
> >>>>>>>>
> >>>>>>>> return false;
> >>>>>>>> }
> >>>>>>>> @@ -1381,6 +1378,7 @@ int ovs_ct_copy_action(struct net *net, const struct nlattr *attr,
> >>>>>>>> const struct sw_flow_key *key,
> >>>>>>>> struct sw_flow_actions **sfa, bool log)
> >>>>>>>> {
> >>>>>>>> + unsigned int n_bits = sizeof(struct ovs_key_ct_labels) * BITS_PER_BYTE;
> >>>>>>>> struct ovs_conntrack_info ct_info;
> >>>>>>>> const char *helper = NULL;
> >>>>>>>> u16 family;
> >>>>>>>> @@ -1409,6 +1407,12 @@ int ovs_ct_copy_action(struct net *net, const struct nlattr *attr,
> >>>>>>>> return -ENOMEM;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> + if (nf_connlabels_get(net, n_bits - 1)) {
> >>>>>>>> + nf_ct_tmpl_free(ct_info.ct);
> >>>>>>>> + OVS_NLERR(log, "Failed to set connlabel length");
> >>>>>>>> + return -EOPNOTSUPP;
> >>>>>>>> + }
> >>>>>>>> +
> >>>>>>>> if (ct_info.timeout[0]) {
> >>>>>>>> if (nf_ct_set_timeout(net, ct_info.ct, family, key->ip.proto,
> >>>>>>>> ct_info.timeout))
> >>>>>>>> @@ -1577,6 +1581,7 @@ static void __ovs_ct_free_action(struct ovs_conntrack_info *ct_info)
> >>>>>>>> if (ct_info->ct) {
> >>>>>>>> if (ct_info->timeout[0])
> >>>>>>>> nf_ct_destroy_timeout(ct_info->ct);
> >>>>>>>> + nf_connlabels_put(nf_ct_net(ct_info->ct));
> >>>>>>>> nf_ct_tmpl_free(ct_info->ct);
> >>>>>>>> }
> >>>>>>>> }
> >>>>>>>> @@ -2002,17 +2007,9 @@ struct genl_family dp_ct_limit_genl_family __ro_after_init = {
> >>>>>>>>
> >>>>>>>> int ovs_ct_init(struct net *net)
> >>>>>>>> {
> >>>>>>>> - unsigned int n_bits = sizeof(struct ovs_key_ct_labels) * BITS_PER_BYTE;
> >>>>>>>> +#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
> >>>>>>>> struct ovs_net *ovs_net = net_generic(net, ovs_net_id);
> >>>>>>>>
> >>>>>>>> - if (nf_connlabels_get(net, n_bits - 1)) {
> >>>>>>>> - ovs_net->xt_label = false;
> >>>>>>>> - OVS_NLERR(true, "Failed to set connlabel length");
> >>>>>>>> - } else {
> >>>>>>>> - ovs_net->xt_label = true;
> >>>>>>>> - }
> >>>>>>>> -
> >>>>>>>> -#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
> >>>>>>>> return ovs_ct_limit_init(net, ovs_net);
> >>>>>>>> #else
> >>>>>>>> return 0;
> >>>>>>>> @@ -2021,12 +2018,9 @@ int ovs_ct_init(struct net *net)
> >>>>>>>>
> >>>>>>>> void ovs_ct_exit(struct net *net)
> >>>>>>>> {
> >>>>>>>> +#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
> >>>>>>>> struct ovs_net *ovs_net = net_generic(net, ovs_net_id);
> >>>>>>>>
> >>>>>>>> -#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
> >>>>>>>> ovs_ct_limit_exit(net, ovs_net);
> >>>>>>>> #endif
> >>>>>>>> -
> >>>>>>>> - if (ovs_net->xt_label)
> >>>>>>>> - nf_connlabels_put(net);
> >>>>>>>> }
> >>>>>>>> diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
> >>>>>>>> index 9ca6231ea647..365b9bb7f546 100644
> >>>>>>>> --- a/net/openvswitch/datapath.h
> >>>>>>>> +++ b/net/openvswitch/datapath.h
> >>>>>>>> @@ -160,9 +160,6 @@ struct ovs_net {
> >>>>>>>> #if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
> >>>>>>>> struct ovs_ct_limit_info *ct_limit_info;
> >>>>>>>> #endif
> >>>>>>>> -
> >>>>>>>> - /* Module reference for configuring conntrack. */
> >>>>>>>> - bool xt_label;
> >>>>>>>> };
> >>>>>>>>
> >>>>>>>> /**
> >>>>>>>
> >>>>>
> >>
>
Powered by blists - more mailing lists