[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1229b966-7772-44bd-6e91-fbde213ceb2d@nvidia.com>
Date: Mon, 1 Feb 2021 09:53:18 +0200
From: Roi Dayan <roid@...dia.com>
To: Pablo Neira Ayuso <pablo@...filter.org>
CC: <netdev@...r.kernel.org>, Paul Blakey <paulb@...dia.com>,
Oz Shlomo <ozsh@...dia.com>, <fw@...len.de>
Subject: Re: [PATCH net 1/1] netfilter: conntrack: Check offload bit on table
dump
On 2021-02-01 5:08 AM, Pablo Neira Ayuso wrote:
> Hi Roi,
>
> On Sun, Jan 31, 2021 at 03:18:34PM +0200, Roi Dayan wrote:
> [...]
>> Hi Pablo,
>>
>> We did more tests with just updating the timeout in the 2 callers
>> and it's not enough. We reproduce the issue of rules being timed
>> out just now frim different place.
>
> Thanks for giving it a try to my suggestion, it was not correct.
>
>> There is a 3rd caller nf_ct_gc_expired() which being called by 3
>> other callers:
>> ____nf_conntrack_find()
>> nf_conntrack_tuple_taken()
>> early_drop_list()
>
> Hm. I'm not sure yet what path is triggering this bug.
>
> Florian came up with the idea of setting a very large timeout for
> offloaded flows (that are refreshed by the garbage collector) to avoid
> the extra check from the packet path, so those 3 functions above never
> hit the garbage collection path. This also applies for the ctnetlink
> (conntrack -L) and the /proc/net/nf_conntrack sysctl paths that the
> patch describes, those should not ever see an offloaded flow with a
> small timeout.
>
> nf_ct_offload_timeout() is called from:
>
> #1 flow_offload_add() to set a very large timer.
> #2 the garbage collector path, to refresh the timeout the very large
> offload timer.
>
> Probably there is a race between setting the IPS_OFFLOAD and when
> flow_offload_add() is called? Garbage collector gets in between and
> zaps the connection. Is a newly offloaded connection that you observed
> that is being removed?
>
yes. the flows being removed are newly offloaded connections.
I don't think the race is between setting IPS_OFFLOAD and calling
flow_offload_add().
In act_ct.c tcf_ct_flow_table_add() we first set the bit
and then call flow_offload_add().
I think the race is more of getting into the other flows and gc still
didn't kick in.
I'm sure of this because we added trace dump in nf_ct_delete()
and while creating connections we also ran in a loop conntrack -L.
In the same way instead of conntrack we did the dump from
/proc/net/nf_conntrack and we also saw the trace from there.
In this scenario if we stopped the loop of the dump we didn't crash
and then in more tests we failed again after I tried to moving the bit
check to your suggestion.
But leaving the bit check as in this commit we didn't reproduce the
issue at all.
>> only early_drop_list() has a check to skip conns with offload bit
>> but without extending the timeout.
>> I didnt do a dump but the issue could be from the other 2 calls.
>>
>> With current commit as is I didn't need to check more callers as I made
>> sure all callers will skip the non-offload gc.
>>
>> Instead of updating more callers and there might be more callers
>> later why current commit is not enough?
>> We skip offloaded flows and soon gc_worker() will hit and will update
>> the timeout anyway.
>
> Another possibility would be to check for the offload bit from
> nf_ct_is_expired(), which is coming slighty before nf_ct_should_gc().
> But this is also in the ____nf_conntrack_find() path.
> > Florian?
>
Right. beside the dumps paths that just call nf_ct_should_gc() which
calls nf_ct_is_expired() again.
Moving the bit check into nf_ct_is_expired() should work the same as
this commit.
I'll wait for a final decision if you prefer the bit check in
nf_ct_is_expired() and we will test again.
Powered by blists - more mailing lists