linux-kernel - Re: [PATCH net v2 RESEND] netfilter: fix conntrack flows stuck issue on cleanup.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <SJ0PR18MB4009934E6C87505AD1A7349FB2609@SJ0PR18MB4009.namprd18.prod.outlook.com>
Date:   Tue, 23 Nov 2021 16:35:51 +0000
From:   "Volodymyr Mytnyk [C]" <vmytnyk@...vell.com>
To:     Pablo Neira Ayuso <pablo@...filter.org>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Mickey Rachamim <mickeyr@...vell.com>,
        Serhiy Pshyk <serhiy.pshyk@...ision.eu>,
        Taras Chornyi <taras.chornyi@...ision.eu>,
        Jozsef Kadlecsik <kadlec@...filter.org>,
        Florian Westphal <fw@...len.de>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paul Blakey <paulb@...lanox.com>,
        "netfilter-devel@...r.kernel.org" <netfilter-devel@...r.kernel.org>,
        "coreteam@...filter.org" <coreteam@...filter.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net v2 RESEND] netfilter: fix conntrack flows stuck issue
 on cleanup.

> Hi,
> 
> On Wed, Nov 03, 2021 at 11:31:36AM +0200, Volodymyr Mytnyk wrote:
> > From: Volodymyr Mytnyk <vmytnyk@...vell.com>
> > 
> > On busy system with big number (few thousands) of HW offloaded flows, it
> > is possible to hit the situation, where some of the conntack flows are
> > stuck in conntrack table (as offloaded) and cannot be removed by user.
> > 
> > This behaviour happens if user has configured conntack using tc sub-system,
> > offloaded those flows for HW and then deleted tc configuration from Linux
> > system by deleting the tc qdiscs.
> > 
> > When qdiscs are removed, the nf_flow_table_free() is called to do the
> > cleanup of HW offloaded flows in conntrack table.
> > 
> > ...
> > process_one_work
> >   tcf_ct_flow_table_cleanup_work()
> >     nf_flow_table_free()
> > 
> > The nf_flow_table_free() does the following things:
> > 
> >   1. cancels gc workqueue
> >   2. marks all flows as teardown
> >   3. executes nf_flow_offload_gc_step() once for each flow to
> >      trigger correct teardown flow procedure (e.g., allocate
> >      work to delete the HW flow and marks the flow as "dying").
> >   4. waits for all scheduled flow offload works to be finished.
> >   5. executes nf_flow_offload_gc_step() once for each flow to
> >      trigger the deleting of flows.
> > 
> > Root cause:
> > 
> > In step 3, nf_flow_offload_gc_step() expects to move flow to "dying"
> > state by using nf_flow_offload_del() and deletes the flow in next
> > nf_flow_offload_gc_step() iteration. But, if flow is in "pending" state
> > for some reason (e.g., reading HW stats), it will not be moved to
> > "dying" state as expected by nf_flow_offload_gc_step() and will not
> > be marked as "dead" for delition.
> > 
> > In step 5, nf_flow_offload_gc_step() assumes that all flows marked
> > as "dead" and will be deleted by this call, but this is not true since
> > the state was not set diring previous nf_flow_offload_gc_step()
> > call.
> > 
> > It issue causes some of the flows to get stuck in connection tracking
> > system or not release properly.
> > 
> > To fix this problem, add nf_flow_table_offload_flush() call between 2 & 3
> > step, to make sure no other flow offload works will be in "pending" state
> > during step 3.
> 
> Thanks for the detailed report.
> 
> I'm attaching two patches, the first one is a preparation patch. The
> second patch flushes the pending work, then it sets the teardown flag
> to all flows in the flowtable and it forces a garbage collector run to
> queue work to remove the flows from hardware, then it flushes this new
> pending work and (finally) it forces another garbage collector run to
> remove the entry from the software flowtable. Compile-tested only.

Hi Pablo,

	Thanks for reviewing the changes and problem investigation.

I will check the provided patches and will back to you.

Regards,
  Volodymyr