[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180907023529.GV19965@ZenIV.linux.org.uk>
Date: Fri, 7 Sep 2018 03:35:29 +0100
From: Al Viro <viro@...IV.linux.org.uk>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: netdev@...r.kernel.org, Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>, stable@...r.kernel.org
Subject: Re: [PATCH 1/7] fix hnode refcounting
On Thu, Sep 06, 2018 at 06:21:09AM -0400, Jamal Hadi Salim wrote:
> For networking patches, subject should be reflective of tree and
> subsystem. Example for this one:
> "[PATCH net 1/7]:net: sched: cls_u32: fix hnode refcounting"
> Also useful to have a cover letter summarizing the patchset
> in 0/7. Otherwise
>
> Acked-by: Jamal Hadi Salim <jhs@...atatu.com>
Argh... Unfortunately, there's this: in u32_delete() we have
if (root_ht) {
if (root_ht->refcnt > 1) {
*last = false;
goto ret;
}
if (root_ht->refcnt == 1) {
if (!ht_empty(root_ht)) {
*last = false;
goto ret;
}
}
}
and that would need to be updated. However, that logics is bloody odd
to start with. First of all, root_ht has come from
struct tc_u_hnode *root_ht = rtnl_dereference(tp->root);
and the only place where it's ever modified is
rcu_assign_pointer(tp->root, root_ht);
in u32_init(), where we'd bloody well checked that root_ht is non-NULL
(see
if (root_ht == NULL)
return -ENOBUFS;
upstream of that place) and where that assignment is inevitable on the
way to returning 0. No matter what, if tp has passed u32_init() it
will have non-NULL ->root, forever. And there is no way for tcf_proto
to be seen outside of tcf_proto_create() without ->init() having returned
0 - it gets freed before anyone sees it.
So this 'if (root_ht)' can't be false. What's more, what the hell is the
whole thing checking? We are in u32_delete(). It's called (as ->delete())
from tfilter_del_notify(), which is called from tc_del_tfilter(). If we
return 0 with *last true, we follow up calling tcf_proto_destroy().
OK, let's look at the logics in there:
* if there are links to root hnode => false
* if there's no links to root hnode and it has knodes => false
(BTW, if we ever get there with root_ht->refcnt < 1, we are obviously screwed)
* if there is a tcf_proto sharing tp->data => false (i.e. any filters
with different prio - don't bother)
* if tp is the only one with reference to tp->data and there are *any*
knodes => false.
Any extra links can come only from knodes in a non-empty hnode. And it's not
a common case. Shouldn't thIe whole thing be
* shared tp->data => false
* any non-empty hnode => false
instead? Perhaps even with the knode counter in tp->data, avoiding any loops
in there, as well as the entire ht_empty()...
Now, in the very beginning of u32_delete() we have this:
struct tc_u_hnode *ht = arg;
if (ht == NULL)
goto out;
OK, but the call of ->delete() is
err = tp->ops->delete(tp, fh, last, extack);
and arg == NULL seen in u32_delete() means fh == NULL in tfilter_del_notify().
Which is called in
if (!fh) {
...
} else {
bool last;
err = tfilter_del_notify(net, skb, n, tp, block,
q, parent, fh, false, &last,
extack);
How can we ever get there with NULL fh?
The whole thing makes very little sense; looks like it used to live in
u32_destroy() prior to commit 763dbf6328e41 ("net_sched: move the empty tp
check from ->destroy() to ->delete()"), but looking at the rationale in
that commit... I don't see how it fixes anything - sure, now we remove
tcf_proto from the list before calling ->destroy(). Without any RCU delays
in between. How could it possibly solve any issues with ->classify()
called in parallel with ->destroy()? cls_u32 (at least these days)
does try to survive u32_destroy() in parallel with u32_classify();
if any other classifiers do not, they are still broken and that commit
has not done anything for them.
Anyway, adjusting 1/7 for that is trivial, but I would really like to
understand what that code is doing... Comments?
Powered by blists - more mailing lists