lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 19 Jun 2024 14:58:29 +0200
From: Ilya Maximets <i.maximets@....org>
To: Xin Long <lucien.xin@...il.com>
Cc: i.maximets@....org, network dev <netdev@...r.kernel.org>,
 dev@...nvswitch.org, Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
 Jiri Pirko <jiri@...nulli.us>, Davide Caratti <dcaratti@...hat.com>,
 Florian Westphal <fw@...len.de>, Jamal Hadi Salim <jhs@...atatu.com>,
 Eric Dumazet <edumazet@...gle.com>, Cong Wang <xiyou.wangcong@...il.com>,
 kuba@...nel.org, Paolo Abeni <pabeni@...hat.com>, davem@...emloft.net,
 Pablo Neira Ayuso <pablo@...filter.org>, Aaron Conole <aconole@...hat.com>
Subject: Re: [PATCH net-next 3/3] openvswitch: set IPS_CONFIRMED in tmpl
 status only when commit is set in conntrack

On 6/18/24 17:50, Ilya Maximets wrote:
> On 6/18/24 16:58, Xin Long wrote:
>> On Tue, Jun 18, 2024 at 7:34 AM Ilya Maximets <i.maximets@....org> wrote:
>>>
>>> On 6/17/24 22:10, Ilya Maximets wrote:
>>>> On 7/16/23 23:09, Xin Long wrote:
>>>>> By not setting IPS_CONFIRMED in tmpl that allows the exp not to be removed
>>>>> from the hashtable when lookup, we can simplify the exp processing code a
>>>>> lot in openvswitch conntrack.
>>>>>
>>>>> Signed-off-by: Xin Long <lucien.xin@...il.com>
>>>>> ---
>>>>>  net/openvswitch/conntrack.c | 78 +++++--------------------------------
>>>>>  1 file changed, 10 insertions(+), 68 deletions(-)
>>>>>
>>>>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>>>>> index 331730fd3580..fa955e892210 100644
>>>>> --- a/net/openvswitch/conntrack.c
>>>>> +++ b/net/openvswitch/conntrack.c
>>>>> @@ -455,45 +455,6 @@ static int ovs_ct_handle_fragments(struct net *net, struct sw_flow_key *key,
>>>>>      return 0;
>>>>>  }
>>>>>
>>>>> -static struct nf_conntrack_expect *
>>>>> -ovs_ct_expect_find(struct net *net, const struct nf_conntrack_zone *zone,
>>>>> -               u16 proto, const struct sk_buff *skb)
>>>>> -{
>>>>> -    struct nf_conntrack_tuple tuple;
>>>>> -    struct nf_conntrack_expect *exp;
>>>>> -
>>>>> -    if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), proto, net, &tuple))
>>>>> -            return NULL;
>>>>> -
>>>>> -    exp = __nf_ct_expect_find(net, zone, &tuple);
>>>>> -    if (exp) {
>>>>> -            struct nf_conntrack_tuple_hash *h;
>>>>> -
>>>>> -            /* Delete existing conntrack entry, if it clashes with the
>>>>> -             * expectation.  This can happen since conntrack ALGs do not
>>>>> -             * check for clashes between (new) expectations and existing
>>>>> -             * conntrack entries.  nf_conntrack_in() will check the
>>>>> -             * expectations only if a conntrack entry can not be found,
>>>>> -             * which can lead to OVS finding the expectation (here) in the
>>>>> -             * init direction, but which will not be removed by the
>>>>> -             * nf_conntrack_in() call, if a matching conntrack entry is
>>>>> -             * found instead.  In this case all init direction packets
>>>>> -             * would be reported as new related packets, while reply
>>>>> -             * direction packets would be reported as un-related
>>>>> -             * established packets.
>>>>> -             */
>>>>> -            h = nf_conntrack_find_get(net, zone, &tuple);
>>>>> -            if (h) {
>>>>> -                    struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
>>>>> -
>>>>> -                    nf_ct_delete(ct, 0, 0);
>>>>> -                    nf_ct_put(ct);
>>>>> -            }
>>>>> -    }
>>>>> -
>>>>> -    return exp;
>>>>> -}
>>>>> -
>>>>>  /* This replicates logic from nf_conntrack_core.c that is not exported. */
>>>>>  static enum ip_conntrack_info
>>>>>  ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h)
>>>>> @@ -852,36 +813,16 @@ static int ovs_ct_lookup(struct net *net, struct sw_flow_key *key,
>>>>>                       const struct ovs_conntrack_info *info,
>>>>>                       struct sk_buff *skb)
>>>>>  {
>>>>> -    struct nf_conntrack_expect *exp;
>>>>> -
>>>>> -    /* If we pass an expected packet through nf_conntrack_in() the
>>>>> -     * expectation is typically removed, but the packet could still be
>>>>> -     * lost in upcall processing.  To prevent this from happening we
>>>>> -     * perform an explicit expectation lookup.  Expected connections are
>>>>> -     * always new, and will be passed through conntrack only when they are
>>>>> -     * committed, as it is OK to remove the expectation at that time.
>>>>> -     */
>>>>> -    exp = ovs_ct_expect_find(net, &info->zone, info->family, skb);
>>>>> -    if (exp) {
>>>>> -            u8 state;
>>>>> -
>>>>> -            /* NOTE: New connections are NATted and Helped only when
>>>>> -             * committed, so we are not calling into NAT here.
>>>>> -             */
>>>>> -            state = OVS_CS_F_TRACKED | OVS_CS_F_NEW | OVS_CS_F_RELATED;
>>>>> -            __ovs_ct_update_key(key, state, &info->zone, exp->master);
>>>>
>>>> Hi, Xin, others.
>>>>
>>>> Unfortunately, it seems like removal of this code broke the expected behavior.
>>>> OVS in userspace expects that SYN packet of a new related FTP connection will
>>>> get +new+rel+trk flags, but after this patch we're only getting +rel+trk and not
>>>> new.  This is a problem because we need to commit this connection with the label
>>>> and we do that for +new packets.  If we can't get +new packet we'll have to commit
>>>> every single +rel+trk packet, which doesn't make a lot of sense.  And it's a
>>>> significant behavior change regardless.
>>>
>>> Interestingly enough I see +new+rel+trk packets in cases without SNAT,
>>> but we can only get +rel+trk in cases with SNAT.  So, this may be just
>>> a generic conntrack bug somewhere.  At least the behavior seems fairly
>>> inconsistent.
>>>
>> In nf_conntrack, IP_CT_RELATED and IP_CT_NEW do not exist at the same
>> time. With this patch, we expect OVS_CS_F_RELATED and OVS_CS_F_NEW
>> are set at the same time by ovs_ct_update_key() when this related ct
>> is not confirmed.
>>
>> The check-kernel test of "FTP SNAT orig tuple" skiped on my env somehow:
>>
>> # make check-kernel
>> 144: conntrack - FTP SNAT orig tuple   skipped (system-traffic.at:7295)
>>
>> Any idea why? or do you know any other testcase that expects +new+rel+trk
>> but returns +rel+trk only?
> 
> You need to install lftp and pyftpdlib.  The pyftpdlib may only be available
> via pip on some systems.
> 
>>
>> Thanks.
>>>>
>>>> Could you, please, take a look?
>>>>
>>>> The issue can be reproduced by running check-kernel tests in OVS repo.
>>>> 'FTP SNAT orig tuple' tests fail 100% of the time.
>>>>
>>>> Best regards, Ilya Maximets.
>>>
> 

Hmm.  After further investigation, it seems that the issue is not about ct state,
but the ct label.  Before this commit we had information about both the original
tuple of the parent connection and the mark/label of the parent connection:

system@...-system: miss upcall:
recirc_id(0x2),dp_hash(0),skb_priority(0),in_port(3),skb_mark(0),ct_state(0x25),
ct_zone(0x1),ct_mark(0),ct_label(0x4d2000000000000000000000001),
ct_tuple4(src=10.1.1.1,dst=10.1.1.2,proto=6,tp_src=50648,tp_dst=21),
eth(src=de:d9:f3:c8:5a:3a,dst=80:88:88:88:88:88),eth_type(0x0800),
ipv4(src=10.1.1.2,dst=10.1.1.9,proto=6,tos=0,ttl=64,frag=no),
tcp(src=57027,dst=38153),tcp_flags(syn)

But after this change, we still have the original tuple of the parent connection,
but the label is no longer in the flow key:

system@...-system: miss upcall:
recirc_id(0x2),dp_hash(0),skb_priority(0),in_port(3),skb_mark(0),ct_state(0x25),
ct_zone(0x1),ct_mark(0),ct_label(0),
ct_tuple4(src=10.1.1.1,dst=10.1.1.2,proto=6,tp_src=34668,tp_dst=21),
eth(src=66:eb:74:c6:79:24,dst=80:88:88:88:88:88),eth_type(0x0800),
ipv4(src=10.1.1.2,dst=10.1.1.9,proto=6,tos=0,ttl=64,frag=no),
tcp(src=49529,dst=35459),tcp_flags(syn)

ct_state(0x25) == +new+rel+trk

Best regards, Ilya Maximets.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ