[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58378309.2090101@iogearbox.net>
Date: Fri, 25 Nov 2016 01:17:13 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: David Miller <davem@...emloft.net>, xiyou.wangcong@...il.com
CC: jiri@...nulli.us, roid@...lanox.com, netdev@...r.kernel.org,
jiri@...lanox.com, ogerlitz@...lanox.com, cwang@...pensource.com,
john.fastabend@...il.com, alexei.starovoitov@...il.com
Subject: Re: [PATCH net-next] net/sched: cls_flower: verify root pointer before
dereferncing it
On 11/24/2016 09:25 PM, David Miller wrote:
> From: Cong Wang <xiyou.wangcong@...il.com>
> Date: Tue, 22 Nov 2016 11:28:37 -0800
>
>> On Tue, Nov 22, 2016 at 8:11 AM, Jiri Pirko <jiri@...nulli.us> wrote:
>>> Tue, Nov 22, 2016 at 05:04:11PM CET, daniel@...earbox.net wrote:
>>>> Hmm, I don't think we want to have such an additional test in fast
>>>> path for each and every classifier. Can we think of ways to avoid that?
>>>>
>>>> My question is, since we unlink individual instances from such tp-internal
>>>> lists through RCU and release the instance through call_rcu() as well as
>>>> the head (tp->root) via kfree_rcu() eventually, against what are we protecting
>>>> setting RCU_INIT_POINTER(tp->root, NULL) in ->destroy() callback? Something
>>>> not respecting grace period?
>>>
>>> If you call tp->ops->destroy in call_rcu, you don't have to set tp->root
>>> to null.
>>
>> We do need to respect the grace period if we touch the globally visible
>> data structure tp in tcf_destroy(). Therefore Roi's patch is not fixing the
>> right place.
>
> Another idea is to assign tp->root to a dummy static cls_fl_head object,
> instead of NULL, which we just make sure has an ht.elems value of zero.
>
> This avoids having to touch the fast path and also avoids all of these
> complicated changes being discussed wrt. doing things in call_rcu_bh()
> or whatever.
I'm not sure if setting a dummy object for each affected classifier is
making things better. Are you having this in mind as a target for -net?
We do kfree_rcu() the head (tp->root) and likewise do we kfree_rcu() the
tp immediately after the callback. The head object's lifetime for such
classifiers is thus always bound to the tp lifetime. And besides that,
nothing apart from get() checks whether it's actually really NULL (and
that check in get() is odd anyway; some cls do it, some don't).
I took a deeper look into the history, and found that this was not always
the case. For example, d3fa76ee6b4a ("[NET_SCHED]: cls_basic: fix NULL
pointer dereference") moved tp->root initialization into init() routine,
where before it was part of change(), so get() had to deal with head being
NULL back then, so that was indeed a valid case. Also some classify()
callbacks were checking for head being NULL, so destroy would set it to
NULL, f.e. 47a1a1d4be29 ("pkt_sched: remove unnecessary xchg() in packet
classifiers").
For basic, bpf, flow, flower, matchall, I cooked the below diff which
should be usable and small enough for -net.
The remaining pieces from ->destroy() to ->delete() conversion patch from
Cong, we could later on do in -net-next.
Roi, could you check the below for flower with your setup?
(Btw, matchall is still broken besides this fix. mall_delete() sets the
RCU_INIT_POINTER(head->filter, NULL), so that a mall_delete() plus
mall_destroy() combo doesn't free head->filter twice, but doing that is
racy with mall_classify() where head->filter is dereferenced unchecked.
Requires additional fix.)
net/sched/cls_basic.c | 4 ----
net/sched/cls_bpf.c | 4 ----
net/sched/cls_flow.c | 1 -
net/sched/cls_flower.c | 15 +++++++++++----
net/sched/cls_matchall.c | 1 -
5 files changed, 11 insertions(+), 14 deletions(-)
diff --git a/net/sched/cls_basic.c b/net/sched/cls_basic.c
index eb219b7..5877f60 100644
--- a/net/sched/cls_basic.c
+++ b/net/sched/cls_basic.c
@@ -62,9 +62,6 @@ static unsigned long basic_get(struct tcf_proto *tp, u32 handle)
struct basic_head *head = rtnl_dereference(tp->root);
struct basic_filter *f;
- if (head == NULL)
- return 0UL;
-
list_for_each_entry(f, &head->flist, link) {
if (f->handle == handle) {
l = (unsigned long) f;
@@ -109,7 +106,6 @@ static bool basic_destroy(struct tcf_proto *tp, bool force)
tcf_unbind_filter(tp, &f->res);
call_rcu(&f->rcu, basic_delete_filter);
}
- RCU_INIT_POINTER(tp->root, NULL);
kfree_rcu(head, rcu);
return true;
}
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index bb1d5a4..0a47ba5 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -292,7 +292,6 @@ static bool cls_bpf_destroy(struct tcf_proto *tp, bool force)
call_rcu(&prog->rcu, __cls_bpf_delete_prog);
}
- RCU_INIT_POINTER(tp->root, NULL);
kfree_rcu(head, rcu);
return true;
}
@@ -303,9 +302,6 @@ static unsigned long cls_bpf_get(struct tcf_proto *tp, u32 handle)
struct cls_bpf_prog *prog;
unsigned long ret = 0UL;
- if (head == NULL)
- return 0UL;
-
list_for_each_entry(prog, &head->plist, link) {
if (prog->handle == handle) {
ret = (unsigned long) prog;
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index e396723..6575aba 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -596,7 +596,6 @@ static bool flow_destroy(struct tcf_proto *tp, bool force)
list_del_rcu(&f->list);
call_rcu(&f->rcu, flow_destroy_filter);
}
- RCU_INIT_POINTER(tp->root, NULL);
kfree_rcu(head, rcu);
return true;
}
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index f6f40fb..f6a7ae0 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -269,6 +269,15 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f)
dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle, tp->protocol, &tc);
}
+static void fl_destroy_rcu(struct rcu_head *rcu)
+{
+ struct cls_fl_head *head = container_of(rcu, struct cls_fl_head, rcu);
+
+ if (head->mask_assigned)
+ rhashtable_destroy(&head->ht);
+ kfree(head);
+}
+
static bool fl_destroy(struct tcf_proto *tp, bool force)
{
struct cls_fl_head *head = rtnl_dereference(tp->root);
@@ -282,10 +291,8 @@ static bool fl_destroy(struct tcf_proto *tp, bool force)
list_del_rcu(&f->list);
call_rcu(&f->rcu, fl_destroy_filter);
}
- RCU_INIT_POINTER(tp->root, NULL);
- if (head->mask_assigned)
- rhashtable_destroy(&head->ht);
- kfree_rcu(head, rcu);
+
+ call_rcu(&head->rcu, fl_destroy_rcu);
return true;
}
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index 25927b6..f935429 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -114,7 +114,6 @@ static bool mall_destroy(struct tcf_proto *tp, bool force)
call_rcu(&f->rcu, mall_destroy_filter);
}
- RCU_INIT_POINTER(tp->root, NULL);
kfree_rcu(head, rcu);
return true;
}
--
1.9.3
Powered by blists - more mailing lists