Message-ID: <55AE1939.105@mojatatu.com>
Date:	Tue, 21 Jul 2015 06:04:41 -0400
From:	Jamal Hadi Salim <jhs@...atatu.com>
To:	Alex Gartrell <agartrell@...com>, xiyou.wangcong@...il.com,
	davem@...emloft.net
CC:	netdev@...r.kernel.org, eric.dumazet@...il.com, kernel-team@...com,
	stable@...r.kernel.org
Subject: Re: [PATCH,v2 net] net: sched: validate that class is found in qdisc_tree_decrease_qlen

On 07/20/15 15:40, Alex Gartrell wrote:
> We have an application that invokes tc to delete the root every time the
> config changes. As a result we stress the cleanup code and were seeing the
> following panic:
>
>    crash> bt
>    PID: 630839  TASK: ffff8823c990d280  CPU: 14  COMMAND: "tc"
>     [... snip ...]
>     #8 [ffff8820ceec17a0] page_fault at ffffffff8160a8c2
>        [exception RIP: htb_qlen_notify+24]
>        RIP: ffffffffa0841718  RSP: ffff8820ceec1858  RFLAGS: 00010282
>        RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffff88241747b400
>        RDX: ffff88241747b408  RSI: 0000000000000000  RDI: ffff8811fb27d000
>        RBP: ffff8820ceec1868   R8: ffff88120cdeff24   R9: ffff88120cdeff30
>        R10: 0000000000000bd4  R11: ffffffffa0840919  R12: ffffffffa0843340
>        R13: 0000000000000000  R14: 0000000000000001  R15: ffff8808dae5c2e8
>        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>     #9 [...] qdisc_tree_decrease_qlen at ffffffff81565375
>    #10 [...] fq_codel_dequeue at ffffffffa084e0a0 [sch_fq_codel]
>    #11 [...] fq_codel_reset at ffffffffa084e2f8 [sch_fq_codel]
>    #12 [...] qdisc_destroy at ffffffff81560d2d
>    #13 [...] htb_destroy_class at ffffffffa08408f8 [sch_htb]
>    #14 [...] htb_put at ffffffffa084095c [sch_htb]
>    #15 [...] tc_ctl_tclass at ffffffff815645a3
>    #16 [...] rtnetlink_rcv_msg at ffffffff81552cb0
>    [... snip ...]
>
> To my understanding, the following situation is taking place.
>
>    tc_ctl_tclass
>     -> htb_delete
>       -> class is deleted from clhash
>     -> htb_put
>       -> qdisc_destroy
>         -> fq_codel_reset

=========> This part looks suspicious. Why is reset invoking
a dequeue? Shouldn't a destroy just purge the queue?

>           -> fq_codel_dequeue
>             -> qdisc_tree_decrease_qlen
>               -> cl = htb_get # returns NULL, removed in htb_delete
>                 -> htb_qlen_notify(sch, NULL) # BOOM
>
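
For illustration, the kind of guard the patch subject describes can be
modelled in userspace. This is only a sketch: every toy_* name below is a
hypothetical stand-in and none of it is the kernel's actual code. The
lookup returns NULL once the class has left the hash, and the notify step
is skipped in that case instead of dereferencing NULL.

```c
#include <stddef.h>

/* Toy stand-in for a class; not the kernel's struct. */
struct toy_class {
    unsigned int handle;
    int active;   /* 1 while the class is considered to hold packets */
    int in_hash;  /* 1 while the class can still be found by lookup */
};

#define TOY_NCLASSES 4
static struct toy_class toy_classes[TOY_NCLASSES];

/* Models the class lookup: NULL once the class was removed from the hash. */
static struct toy_class *toy_class_get(unsigned int handle)
{
    for (size_t i = 0; i < TOY_NCLASSES; i++)
        if (toy_classes[i].in_hash && toy_classes[i].handle == handle)
            return &toy_classes[i];
    return NULL;
}

/* Models the notify callback: dereferences cl, so cl must not be NULL. */
static void toy_qlen_notify(struct toy_class *cl)
{
    cl->active = 0;
}

/* Notification step with the lookup result validated.
 * Returns 1 if the class was notified, 0 if it was not found. */
static int toy_tree_decrease_qlen(unsigned int handle)
{
    struct toy_class *cl = toy_class_get(handle);

    if (cl == NULL)
        return 0;
    toy_qlen_notify(cl);
    return 1;
}

/* Replays the sequence above: delete from the hash first, notify later. */
static int toy_demo_deleted_class(void)
{
    toy_classes[0] = (struct toy_class){ .handle = 0x10, .active = 1, .in_hash = 1 };
    toy_classes[0].in_hash = 0;          /* models the removal in htb_delete */
    return toy_tree_decrease_qlen(0x10); /* lookup fails, notify is skipped */
}
```

Calling toy_demo_deleted_class() returns 0 here, meaning the notification
was skipped rather than run on a NULL class.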

Fixing the core code for this is worrisome. The root cause seems to
be fq_codel. I don't have time to look closely, but in general, reset
would be something like:

struct fq_codel_sched_data *q = qdisc_priv(sch);
qdisc_reset(q);

or something along those lines...
But dequeue semantics certainly don't seem right there.
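
To make the distinction concrete, here is a small userspace model, not
kernel code. All toyq_* names are hypothetical, and the model assumes for
simplicity that every dequeue reports upward to the parent, which the real
dequeue path would do only in some cases. Resetting by draining through
dequeue triggers parent notifications; resetting by purging does not.

```c
#include <stddef.h>

#define TOYQ_CAP 8

/* Toy queue; parent_notifications counts upward qlen reports. */
struct toy_queue {
    int pkts[TOYQ_CAP];
    size_t qlen;
    int parent_notifications;
};

static void toyq_enqueue(struct toy_queue *q, int pkt)
{
    if (q->qlen < TOYQ_CAP)
        q->pkts[q->qlen++] = pkt;
}

/* Dequeue path: removes the head and, in this model, always notifies
 * the parent. Returns 1 if a packet was removed, 0 if the queue was empty. */
static int toyq_dequeue(struct toy_queue *q, int *pkt)
{
    if (q->qlen == 0)
        return 0;
    *pkt = q->pkts[0];
    for (size_t i = 1; i < q->qlen; i++)
        q->pkts[i - 1] = q->pkts[i];
    q->qlen--;
    q->parent_notifications++;
    return 1;
}

/* Reset implemented by draining through the dequeue path. */
static void toyq_reset_by_dequeue(struct toy_queue *q)
{
    int pkt;

    while (toyq_dequeue(q, &pkt))
        ;
}

/* Reset implemented as a plain purge: no dequeue side effects. */
static void toyq_reset_by_purge(struct toy_queue *q)
{
    q->qlen = 0;
}

/* Returns how many parent notifications a reset of 3 packets caused. */
static int toyq_demo(int use_purge)
{
    struct toy_queue q = { 0 };

    toyq_enqueue(&q, 1);
    toyq_enqueue(&q, 2);
    toyq_enqueue(&q, 3);
    if (use_purge)
        toyq_reset_by_purge(&q);
    else
        toyq_reset_by_dequeue(&q);
    return q.parent_notifications;
}
```

In this model toyq_demo(0) reports three notifications to the parent,
while toyq_demo(1) reports none, which is the behaviour one would want
from a reset during destroy.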

cheers,
jamal


