Message-ID: <20180828000310.GE6515@ZenIV.linux.org.uk>
Date:   Tue, 28 Aug 2018 01:03:10 +0100
From:   Al Viro <viro@...IV.linux.org.uk>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     Jamal Hadi Salim <jhs@...atatu.com>,
        Kees Cook <keescook@...omium.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Jiri Pirko <jiri@...nulli.us>,
        David Miller <davem@...emloft.net>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH] net: sched: Fix memory exposure from short TCA_U32_SEL

On Mon, Aug 27, 2018 at 02:31:41PM -0700, Cong Wang wrote:
> > I cant think of any challenges. Cong/Jiri? Would it require development
> > time classifiers/actions/qdiscs to sit in that directory (I suspect you
> > dont want them in include/net).
> > BTW, the idea of improving grep-ability of the code by prefixing the
> > ops appropriately makes sense. i.e we should have ops->cls_init,
> > ops->act_init etc.
> 
> Hmm? Isn't struct tcf_proto_ops used and must be provided
> by each tc filter module? How does it work if you move it into
> net/sched/* for out-of-tree modules? Are they supposed to
> include "..../net/sched/tcf_proto.h"?? Or something else?

If you care about out-of-tree modules, that could easily live in
include/net/tcf_proto.h, provided that it's not pulled by indirect
includes into hell knows how many places.  Try
	make allmodconfig
	make >/dev/null 2>&1
	find -name '.*.cmd'|xargs grep sch_generic.h

That finds 2977 files here, most of them having nothing to do with
net/sched.

> BTW, we need some grep tool that really understands C syntax,
> not making each variable friendly to plain grep.

This isn't a matter of C syntax; it needs to handle C types,
and you really can't do that anywhere near reliably without looking
at preprocessor output.  Which very much depends upon .config...
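
A minimal sketch of that point, with invented names: struct cls_ops,
struct act_ops, demo_ops and CONFIG_DEMO_ACT are all hypothetical and do
not exist in the kernel.  The call in call_init() is textually the same
either way, but which structure's ->init it reaches is only known after
preprocessing:

```c
/* Hypothetical stand-ins; these are not the kernel's structures. */
struct cls_ops { int (*init)(void); };
struct act_ops { int (*init)(void); };

static int cls_init(void) { return 1; }
static int act_init(void) { return 2; }

struct cls_ops cls = { .init = cls_init };
struct act_ops act = { .init = act_init };

/* CONFIG_DEMO_ACT is an invented switch standing in for a .config option. */
#ifdef CONFIG_DEMO_ACT
#define demo_ops (&act)
#else
#define demo_ops (&cls)
#endif

/*
 * A plain grep for "->init" matches this line in both configurations;
 * only the preprocessed, typed view says which structure is involved.
 */
int call_init(void)
{
	return demo_ops->init();
}
```

Built without -DCONFIG_DEMO_ACT, call_init() goes through cls_ops; with
it, through act_ops.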

BTW, something odd in cls_u32.c: what happens if we have the following
graph:
tcf_proto <tp>, its ->data being <c0> and ->root - <ht0>
tc_u_common <c0>, in its ->hlist
	<ht1>, in its ->ht[0]
		<knode>
	<ht0>
and set ->ht_down in <knode> to the <ht0>?  AFAICS,
there's nothing to prevent that - TCA_U32_LINK being
0x80000000 will do just that.  What happens upon u32_destroy()
in that case?  Unless I'm misreading that code, refcounts will be
	<c0>:	1
	<ht0>:	2
	<ht1>:	1
and in u32_destroy() we'll get this:
	root_ht = <ht0>
	tp_c = <c0>
        if (root_ht && --root_ht->refcnt == 0)
                u32_destroy_hnode(tp, root_ht, extack);
decrements refcnt to 1 and does nothing else.
        if (--tp_c->refcnt == 0) {
is satisfied
                hlist_del(&tp_c->hnode);
<c0> unhashed
                while ((ht = rtnl_dereference(tp_c->hlist)) != NULL) {
we take ht = <ht1>
                        u32_clear_hnode(tp, ht, extack);
which does
        for (h = 0; h <= ht->divisor; h++) {
                while ((n = rtnl_dereference(ht->ht[h])) != NULL) {
n = <knode>
                        RCU_INIT_POINTER(ht->ht[h],
                                         rtnl_dereference(n->next));
remove <knode> from <ht1>->ht[0]
                        tcf_unbind_filter(tp, &n->res);
                        u32_remove_hw_knode(tp, n, extack);
                        idr_remove(&ht->handle_idr, n->handle);
                        if (tcf_exts_get_net(&n->exts))
                                tcf_queue_work(&n->rwork, u32_delete_key_freepf_work);
                        else
                                u32_destroy_key(n->tp, n, true);
... and we hit u32_destroy_key(<tp>, <knode>, true), which does
        struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
ht = <ht0>
        tcf_exts_destroy(&n->exts);
        tcf_exts_put_net(&n->exts);
        if (ht && --ht->refcnt == 0)
                kfree(ht);
*NOW* <ht0>->refcnt is 0, and we free the damn thing.
	....
	kfree(n);
<knode> is freed and we return to u32_clear_hnode() where we
see that there's nothing else left in <ht1>->ht[...] and return
to u32_destroy().  Where
                        RCU_INIT_POINTER(tp_c->hlist, ht->next);
sets <c0>->hlist to <ht1>->next, aka <ht0>.  Which is already freed.

                        /* u32_destroy_key() will later free ht for us, if it's
                         * still referenced by some knode
                         */
                        if (--ht->refcnt == 0)
                                kfree_rcu(ht, rcu);
<ht1>->refcnt reaches 0 and we free it (RCU-delayed)
                }
... and we go for the next iteration, this time with ht = <ht0>.
Doing all kinds of unsanitary things to the memory it used to occupy...
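
The sequence above can be replayed in a toy userspace model.  This is
only a sketch under assumptions: the structures are cut down to the
fields the walkthrough touches, a "freed" flag stands in for kfree() so
that nothing is really released, and model_u32_destroy() is a made-up
name, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model; field names follow the diagram, not cls_u32.c itself. */
struct hnode {
	int refcnt;
	bool freed;		/* stands in for kfree()/kfree_rcu() */
	struct hnode *next;	/* chaining on tp_c->hlist */
};

struct knode {
	struct hnode *ht_down;
};

/* Models the tail of u32_destroy_key(): drop the ->ht_down reference. */
static void destroy_key(struct knode *n)
{
	struct hnode *ht = n->ht_down;

	if (ht && --ht->refcnt == 0)
		ht->freed = true;
}

/*
 * Replays u32_destroy() for the graph hlist -> ht1 -> ht0, where ht1
 * holds one knode with ->ht_down == ht0.  Returns how many times the
 * loop picks up an hnode that has already been freed.
 */
int model_u32_destroy(void)
{
	struct hnode ht0 = { .refcnt = 2, .freed = false, .next = NULL };
	struct hnode ht1 = { .refcnt = 1, .freed = false, .next = &ht0 };
	struct knode kn = { .ht_down = &ht0 };
	struct knode *bucket = &kn;	/* ht1->ht[0] */
	struct hnode *hlist = &ht1;	/* tp_c->hlist */
	struct hnode *root_ht = &ht0;
	struct hnode *ht;
	int stale_visits = 0;

	/* root_ht drop: 2 -> 1, so nothing is destroyed here */
	if (--root_ht->refcnt == 0)
		root_ht->freed = true;

	/* tp_c->refcnt has reached 0, so the list is walked */
	while ((ht = hlist) != NULL) {
		if (ht->freed) {
			stale_visits++;
			break;	/* the real loop would carry on regardless */
		}
		/* u32_clear_hnode(): only ht1 has a knode in this graph */
		if (ht == &ht1 && bucket) {
			struct knode *n = bucket;

			bucket = NULL;
			destroy_key(n);	/* ht0: 1 -> 0, "freed" */
		}
		hlist = ht->next;
		if (--ht->refcnt == 0)
			ht->freed = true;
	}
	return stale_visits;
}
```

In this model the second pass through the loop lands on <ht0> after it
has been marked freed, so model_u32_destroy() returns 1.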

Incidentally, if we hit
                                tcf_queue_work(&n->rwork, u32_delete_key_freepf_work);
instead of u32_destroy_key(), things don't seem to be any better - we
won't do anything to <knode> until rtnl is dropped, so u32_destroy() won't
break on the second pass through the loop - it'll free <ht0> there and
return.  Setting us up for trouble, since when u32_delete_key_freepf_work()
finally gets to u32_destroy_key() we'll have <knode>->ht_down pointing
to freed memory and decrementing its contents...
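
The deferred variant can be sketched the same way.  Again a toy model
under assumptions: the "freed" flag replaces kfree(), the work queue is
reduced to a single pointer, and model_deferred_destroy() is a
hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model; only the fields the scenario touches. */
struct hnode {
	int refcnt;
	bool freed;		/* stands in for kfree()/kfree_rcu() */
	struct hnode *next;
};

struct knode {
	struct hnode *ht_down;
};

/*
 * Replays u32_destroy() with the knode's destruction queued instead of
 * done inline.  Returns true if the deferred work ends up looking at an
 * hnode that was already freed.
 */
bool model_deferred_destroy(void)
{
	struct hnode ht0 = { .refcnt = 2, .freed = false, .next = NULL };
	struct hnode ht1 = { .refcnt = 1, .freed = false, .next = &ht0 };
	struct knode kn = { .ht_down = &ht0 };
	struct knode *queued = NULL;	/* stands in for the queued rwork */
	struct hnode *hlist = &ht1;
	struct hnode *ht;

	--ht0.refcnt;			/* root_ht drop: 2 -> 1 */

	while ((ht = hlist) != NULL) {
		if (ht == &ht1)
			queued = &kn;	/* queued; knode untouched for now */
		hlist = ht->next;
		if (--ht->refcnt == 0)
			ht->freed = true;	/* ht1 first, then ht0 */
	}

	/* rtnl is dropped and the work item finally runs */
	if (queued && queued->ht_down)
		return queued->ht_down->freed;
	return false;
}
```

Here the loop itself completes without touching freed memory, but the
queued work finds ->ht_down pointing at an hnode already marked freed,
so the function returns true.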

What am I missing in there?  Is it just "we should never have ->ht_down
pointing to anyone's ->root"?  If so, I'm not sure how to detect that;
if not... what should happen to the orphaned root_ht?  Should it
remain on the list?  We might have two tcf_proto sharing tp->data,
so tp_c and its list might very well survive the u32_destroy()...

Note, BTW, that if we do leave the orphan on the list and later
change the tc_u_knode so that ->ht_down doesn't point to that
thing anymore, we'll get its refcount incremented to 2 in
u32_init_knode(), then decremented to 1 by u32_set_parms() and
then arrange for u32_delete_key_work() to be run.  Which will 
drive the refcount to 0 and free the damn thing.  While it's
still in the middle of ->hlist...
