netdev - Re: INFO: rcu detected stall in br_handle

Message-ID: <30e6a8c6-b857-00b8-24d8-076b92409636@gmail.com>
Date:   Sat, 28 Dec 2019 07:01:43 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Florian Westphal <fw@...len.de>,
        syzbot <syzbot+dc9071cc5a85950bdfce@...kaller.appspotmail.com>
Cc:     davem@...emloft.net, jhs@...atatu.com, jiri@...nulli.us,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        syzkaller-bugs@...glegroups.com, xiyou.wangcong@...il.com,
        eric.dumazet@...il.com
Subject: Re: INFO: rcu detected stall in br_handle_frame (2)



On 12/28/19 3:15 AM, Florian Westphal wrote:
> syzbot <syzbot+dc9071cc5a85950bdfce@...kaller.appspotmail.com> wrote:
> 
> [ CC Eric, fq related ]
> 
>> syzbot found the following crash on:
>>
>> HEAD commit:    7e0165b2 Merge branch 'akpm' (patches from Andrew)
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=116ec09ee00000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=1b59a3066828ac4c
>> dashboard link: https://syzkaller.appspot.com/bug?extid=dc9071cc5a85950bdfce
>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=159182c1e00000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1221218ee00000
>>
>> Bisection is inconclusive: the bug happens on the oldest tested release.
>>
>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=158224c1e00000
>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=178224c1e00000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=138224c1e00000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+dc9071cc5a85950bdfce@...kaller.appspotmail.com
>>
>> rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>> 	(detected by 0, t=10502 jiffies, g=10149, q=201)
>> rcu: All QSes seen, last rcu_preempt kthread activity 10502
>> (4294978441-4294967939), jiffies_till_next_fqs=1, root ->qsmask 0x0
>> sshd            R  running task    26584 10034   9965 0x00000008
>> Call Trace:
>>  <IRQ>
>>  sched_show_task kernel/sched/core.c:5954 [inline]
> [..]
> 
> The reproducer sets up 'fq' sched with TCA_FQ_QUANTUM == 0x80000000
> 
> This causes an infinite loop in fq_dequeue:
> 
> if (f->credit <= 0) {
>   f->credit += q->quantum;
>   goto begin;
> }
> 
> ... because f->credit is either 0 or -2147483648.
> 
> Eric, what is a 'sane' ->quantum value?
> 
> One could simply add a 'quantum > 0 && quantum < INT_MAX'
> constraint afaics.
> 
> If you don't have a better idea/suggestion for an upper limit, INT_MAX
> would be enough to prevent a perpetual <= 0 condition.
> 

Thanks, Florian, for the analysis.

I guess we could use a conservative upper bound of (1 << 20) bytes
(sixteen 64KB packets).
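
To illustrate why the quoted refill step never terminates, here is a minimal
user-space sketch of it. This is not kernel code: the function name
refills_until_positive and the max_rounds budget are hypothetical, and the
model assumes a signed 32-bit credit, an unsigned 32-bit quantum, and
addition that wraps modulo 2^32.

```c
#include <stdint.h>

/*
 * User-space model of the quoted step (hypothetical helper, not kernel code):
 *
 *     if (f->credit <= 0) { f->credit += q->quantum; goto begin; }
 *
 * The addition is done on uint32_t so that it wraps modulo 2^32; the
 * conversion back to int32_t is implementation-defined in C and wraps on
 * the usual two's-complement compilers (gcc, clang).
 *
 * Returns how many refills were needed before credit became positive,
 * or -1 if credit is still <= 0 after max_rounds refills.
 */
static int refills_until_positive(int32_t credit, uint32_t quantum,
                                  int max_rounds)
{
    for (int i = 0; i < max_rounds; i++) {
        if (credit > 0)
            return i;
        credit = (int32_t)((uint32_t)credit + quantum);
    }
    return credit > 0 ? max_rounds : -1;
}
```

With quantum == 0x80000000 the credit only alternates between 0 and
-2147483648, so the function returns -1 whatever the budget. With
quantum == (1 << 20) a credit of 0 becomes positive after a single refill.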

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index ff4c5e9d0d7778d86f20f4bd67cc627eed0713d9..12f1d1c6044fac9db987f7ce3a50a7e2c711358b 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -786,15 +786,20 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt,
        if (tb[TCA_FQ_QUANTUM]) {
                u32 quantum = nla_get_u32(tb[TCA_FQ_QUANTUM]);
 
-               if (quantum > 0)
+               if (quantum > 0 && quantum <= (1 << 20))
                        q->quantum = quantum;
                else
                        err = -EINVAL;
        }
 
-       if (tb[TCA_FQ_INITIAL_QUANTUM])
-               q->initial_quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
+       if (tb[TCA_FQ_INITIAL_QUANTUM]) {
+               u32 quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
 
+               if (quantum > 0 && quantum <= (1 << 20))
+                       q->initial_quantum = quantum;
+               else
+                       err = -EINVAL;
+       }
        if (tb[TCA_FQ_FLOW_DEFAULT_RATE])
                pr_warn_ratelimited("sch_fq: defrate %u ignored.\n",
                                    nla_get_u32(tb[TCA_FQ_FLOW_DEFAULT_RATE]));
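
The same range check appears twice in the patch above, once for
TCA_FQ_QUANTUM and once for TCA_FQ_INITIAL_QUANTUM. As a rough user-space
sketch only, it could be factored into one helper. The names FQ_QUANTUM_MAX
and fq_check_quantum are hypothetical and are not part of the patch or of
sch_fq.c.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical name for the (1 << 20) bound used in the patch. */
#define FQ_QUANTUM_MAX (1u << 20)

/*
 * Hypothetical helper mirroring the patch's check: accept a quantum only
 * if 0 < quantum <= (1 << 20). On success the value is stored in *out and
 * 0 is returned; otherwise *out is left untouched and -EINVAL is returned.
 */
static int fq_check_quantum(uint32_t quantum, uint32_t *out)
{
    if (quantum == 0 || quantum > FQ_QUANTUM_MAX)
        return -EINVAL;
    *out = quantum;
    return 0;
}
```

The reproducer's value 0x80000000 is rejected by this check, while the
boundary value (1 << 20) itself is accepted.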