Message-ID: <SJ0PR18MB4009934E6C87505AD1A7349FB2609@SJ0PR18MB4009.namprd18.prod.outlook.com>
Date: Tue, 23 Nov 2021 16:35:51 +0000
From: "Volodymyr Mytnyk [C]" <vmytnyk@...vell.com>
To: Pablo Neira Ayuso <pablo@...filter.org>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Mickey Rachamim <mickeyr@...vell.com>,
Serhiy Pshyk <serhiy.pshyk@...ision.eu>,
Taras Chornyi <taras.chornyi@...ision.eu>,
Jozsef Kadlecsik <kadlec@...filter.org>,
Florian Westphal <fw@...len.de>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paul Blakey <paulb@...lanox.com>,
"netfilter-devel@...r.kernel.org" <netfilter-devel@...r.kernel.org>,
"coreteam@...filter.org" <coreteam@...filter.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net v2 RESEND] netfilter: fix conntrack flows stuck issue
on cleanup.
> Hi,
>
> On Wed, Nov 03, 2021 at 11:31:36AM +0200, Volodymyr Mytnyk wrote:
> > From: Volodymyr Mytnyk <vmytnyk@...vell.com>
> >
> > On a busy system with a large number (a few thousand) of HW-offloaded
> > flows, some conntrack flows can get stuck in the conntrack table (as
> > offloaded), where they cannot be removed by the user.
> >
> > This happens if the user has configured conntrack using the tc subsystem,
> > offloaded those flows to HW, and then removed the tc configuration from
> > the Linux system by deleting the tc qdiscs.
> >
> > When the qdiscs are removed, nf_flow_table_free() is called to clean up
> > the HW-offloaded flows in the conntrack table.
> >
> > ...
> > process_one_work
> > tcf_ct_flow_table_cleanup_work()
> > nf_flow_table_free()
> >
> > The nf_flow_table_free() does the following things:
> >
> > 1. cancels the gc workqueue
> > 2. marks all flows as teardown
> > 3. executes nf_flow_offload_gc_step() once for each flow to
> >    trigger the correct flow teardown procedure (e.g., allocate
> >    work to delete the HW flow and mark the flow as "dying")
> > 4. waits for all scheduled flow offload works to finish
> > 5. executes nf_flow_offload_gc_step() once more for each flow to
> >    trigger deletion of the flows
> >
> > Root cause:
> >
> > In step 3, nf_flow_offload_gc_step() expects to move the flow to the
> > "dying" state using nf_flow_offload_del() and to delete the flow in the
> > next nf_flow_offload_gc_step() iteration. But if the flow is in the
> > "pending" state for some reason (e.g., reading HW stats), it will not be
> > moved to the "dying" state as nf_flow_offload_gc_step() expects and will
> > not be marked as "dead" for deletion.
> >
> > In step 5, nf_flow_offload_gc_step() assumes that all flows are marked
> > as "dead" and will be deleted by this call, but this is not true because
> > the state was not set during the previous nf_flow_offload_gc_step()
> > call.
> >
> > This issue causes some of the flows to get stuck in the connection
> > tracking system or not be released properly.
> >
> > To fix this problem, add a nf_flow_table_offload_flush() call between
> > steps 2 and 3 to make sure that no flow offload work is still in the
> > "pending" state during step 3.
>
> Thanks for the detailed report.
>
> I'm attaching two patches, the first one is a preparation patch. The
> second patch flushes the pending work, then it sets the teardown flag
> to all flows in the flowtable and it forces a garbage collector run to
> queue work to remove the flows from hardware, then it flushes this new
> pending work and (finally) it forces another garbage collector run to
> remove the entry from the software flowtable. Compile-tested only.
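The stuck-flow scenario from the quoted description can be reproduced with a small user-space toy model. This is only a sketch under stated assumptions, not kernel code: every toy_* name is hypothetical, the table holds a single flow, and the model assumes that a flow with work already pending cannot have new work queued for it, which is how the description above reads.

```c
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not kernel symbols. */
enum toy_work { TOY_WORK_NONE, TOY_WORK_STATS, TOY_WORK_DEL };

struct toy_flow {
    bool teardown;          /* set in step 2 */
    bool dying;             /* delete work has been queued */
    bool dead;              /* delete work has completed */
    bool freed;             /* removed from the (toy) flow table */
    enum toy_work pending;  /* at most one outstanding work item */
};

/* Assumption: queueing fails while another work item is still pending. */
static bool toy_queue_work(struct toy_flow *f, enum toy_work kind)
{
    if (f->pending != TOY_WORK_NONE)
        return false;
    f->pending = kind;
    return true;
}

/* Wait for the outstanding work; completed delete work marks the flow dead. */
static void toy_flush(struct toy_flow *f)
{
    if (f->pending == TOY_WORK_DEL)
        f->dead = true;
    f->pending = TOY_WORK_NONE;
}

/* One garbage collector pass over the flow. */
static void toy_gc_step(struct toy_flow *f)
{
    if (f->freed || !f->teardown)
        return;
    if (!f->dying) {
        /* First pass: try to queue the HW delete and mark the flow dying. */
        if (toy_queue_work(f, TOY_WORK_DEL))
            f->dying = true;
    } else if (f->dead) {
        /* Later pass: HW delete finished, drop the software entry. */
        f->freed = true;
    }
}

/* Runs the cleanup sequence; returns true if the flow was actually freed. */
static bool toy_table_free(bool extra_flush)
{
    /* The flow starts with a stats read still pending. */
    struct toy_flow f = { .pending = TOY_WORK_STATS };

    f.teardown = true;      /* step 2: mark as teardown */
    if (extra_flush)
        toy_flush(&f);      /* proposed flush between steps 2 and 3 */
    toy_gc_step(&f);        /* step 3: queue delete work, mark dying */
    toy_flush(&f);          /* step 4: wait for scheduled works */
    toy_gc_step(&f);        /* step 5: delete flows marked dead */
    return f.freed;
}
```

In this model, toy_table_free(false) leaves the flow in the table, because the delete work is only queued during the final pass and nothing waits for it, while toy_table_free(true) frees it. Moving the extra flush ahead of the teardown marking, as in the ordering quoted above, gives the same result in the toy.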
Hi Pablo,
Thanks for reviewing the changes and investigating the problem.
I will check the provided patches and get back to you.
Regards,
Volodymyr