netdev - Re: [Patch net] net_sched: remove tcf_block_put

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAM_iQpUn+Z1DMAYnOuRw81pRPBaQ15q9jN-vH-VyVp_Vg9u9RA@mail.gmail.com>
Date:   Tue, 31 Oct 2017 14:47:48 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     Jiri Pirko <jiri@...nulli.us>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        John Fastabend <john.fastabend@...il.com>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Eric Dumazet <edumazet@...gle.com>
Subject: Re: [Patch net] net_sched: remove tcf_block_put_deferred()

On Tue, Oct 31, 2017 at 3:40 AM, Jiri Pirko <jiri@...nulli.us> wrote:
> Mon, Oct 30, 2017 at 07:10:09PM CET, xiyou.wangcong@...il.com wrote:
>>In commit 7aa0045dadb6 ("net_sched: introduce a workqueue for RCU callbacks of tc filter")
>>I defer tcf_chain_flush() to a workqueue, this causes a use-after-free
>>because qdisc is already destroyed after we queue this work.
>>
>>The tcf_block_put_deferred() is no longer necessary after we get RTNL
>>for each tc filter destroy work, no others could jump in at this point.
>>Same for tcf_chain_hold(), we are fully serialized now.
>>
>>This also reduces one indirection therefore makes the code more
>>readable. Note this brings back a rcu_barrier(), however comparing
>>to the code prior to commit 7aa0045dadb6 we still reduced one
>>rcu_barrier(). For net-next, we can consider to refcnt tcf block to
>>avoid it.
>>
>>Fixes: 7aa0045dadb6 ("net_sched: introduce a workqueue for RCU callbacks of tc filter")
>>Cc: Daniel Borkmann <daniel@...earbox.net>
>>Cc: Jiri Pirko <jiri@...nulli.us>
>>Cc: John Fastabend <john.fastabend@...il.com>
>>Cc: Jamal Hadi Salim <jhs@...atatu.com>
>>Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
>>Cc: Eric Dumazet <edumazet@...gle.com>
>>Signed-off-by: Cong Wang <xiyou.wangcong@...il.com>
>>---
>> net/sched/cls_api.c | 37 ++++++++-----------------------------
>> 1 file changed, 8 insertions(+), 29 deletions(-)
>>
>>diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>index 231181c602ed..b2d310745487 100644
>>--- a/net/sched/cls_api.c
>>+++ b/net/sched/cls_api.c
>>@@ -280,8 +280,8 @@ static void tcf_block_put_final(struct work_struct *work)
>>       struct tcf_block *block = container_of(work, struct tcf_block, work);
>>       struct tcf_chain *chain, *tmp;
>>
>>-      /* At this point, all the chains should have refcnt == 1. */
>>       rtnl_lock();
>>+      /* Only chain 0 should be still here. */
>>       list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
>>               tcf_chain_put(chain);
>>       rtnl_unlock();
>>@@ -289,23 +289,17 @@ static void tcf_block_put_final(struct work_struct *work)
>> }
>>
>> /* XXX: Standalone actions are not allowed to jump to any chain, and bound
>>- * actions should be all removed after flushing. However, filters are destroyed
>>- * in RCU callbacks, we have to hold the chains first, otherwise we would
>>- * always race with RCU callbacks on this list without proper locking.
>>+ * actions should be all removed after flushing. However, filters are now
>>+ * destroyed in tc filter workqueue with RTNL lock, they can not race here.
>>  */
>>-static void tcf_block_put_deferred(struct work_struct *work)
>>+void tcf_block_put(struct tcf_block *block)
>> {
>>-      struct tcf_block *block = container_of(work, struct tcf_block, work);
>>-      struct tcf_chain *chain;
>>+      struct tcf_chain *chain, *tmp;
>>
>>-      rtnl_lock();
>>-      /* Hold a refcnt for all chains, except 0, in case they are gone. */
>>-      list_for_each_entry(chain, &block->chain_list, list)
>>-              if (chain->index)
>>-                      tcf_chain_hold(chain);
>>+      if (!block)
>>+              return;
>>
>>-      /* No race on the list, because no chain could be destroyed. */
>>-      list_for_each_entry(chain, &block->chain_list, list)
>>+      list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
>>               tcf_chain_flush(chain);
>
> The reason for the hold above was to avoid use after free in this loop.
> Consider following example:
>
> chain1
>   1 filter with action goto_chain 2
> chain2
>   empty
>
> Now in your list_for_each_entry_safe loop, chain1 is flushed, action is
> removed and chain is put:
> tcf_action_goto_chain_fini->tcf_chain_put(2)
>
> Given the fact chain2 is empty, this put would lead to tcf_chain_destroy(2)
>
> Then in another iteration of list_for_each_entry_safe you are using
> already freed chain.
>
> Am I missing something that prevents this?

Actions are destroyed in filter's ->destroy(), and all filters now
use RCU callback+workqueue to defer the tcf_action_goto_chain_fini()
and with RTNL, so this function won't be executed until we release
RTNL. This is what I mean by "fully serialized".

Or if there any other case I still miss?