Date:   Sun, 26 Apr 2020 14:46:05 -0600
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     syzbot <syzbot+0251e883fe39e7a0cb0a@...kaller.appspotmail.com>,
        David Miller <davem@...emloft.net>,
        Florian Fainelli <f.fainelli@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        jhs@...atatu.com,
        Jiří Pírko <jiri@...nulli.us>,
        Krzysztof Kozlowski <krzk@...nel.org>, kuba@...nel.org,
        kvalo@...eaurora.org, leon@...nel.org,
        LKML <linux-kernel@...r.kernel.org>,
        linux-kselftest@...r.kernel.org, Netdev <netdev@...r.kernel.org>,
        Shuah Khan <shuah@...nel.org>, syzkaller-bugs@...glegroups.com,
        Thomas Gleixner <tglx@...utronix.de>, vivien.didelot@...il.com,
        Cong Wang <xiyou.wangcong@...il.com>
Subject: Re: INFO: rcu detected stall in wg_packet_tx_worker

On Sun, Apr 26, 2020 at 2:38 PM Eric Dumazet <eric.dumazet@...il.com> wrote:
>
>
>
> On 4/26/20 1:26 PM, Eric Dumazet wrote:
> >
> >
> > On 4/26/20 12:42 PM, Jason A. Donenfeld wrote:
> >> On Sun, Apr 26, 2020 at 1:40 PM Eric Dumazet <eric.dumazet@...il.com> wrote:
> >>>
> >>>
> >>>
> >>> On 4/26/20 10:57 AM, syzbot wrote:
> >>>> syzbot has bisected this bug to:
> >>>>
> >>>> commit e7096c131e5161fa3b8e52a650d7719d2857adfd
> >>>> Author: Jason A. Donenfeld <Jason@...c4.com>
> >>>> Date:   Sun Dec 8 23:27:34 2019 +0000
> >>>>
> >>>>     net: WireGuard secure network tunnel
> >>>>
> >>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15258fcfe00000
> >>>> start commit:   b2768df2 Merge branch 'for-linus' of git://git.kernel.org/..
> >>>> git tree:       upstream
> >>>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=17258fcfe00000
> >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=13258fcfe00000
> >>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68
> >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=0251e883fe39e7a0cb0a
> >>>> userspace arch: i386
> >>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15f5f47fe00000
> >>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11e8efb4100000
> >>>>
> >>>> Reported-by: syzbot+0251e883fe39e7a0cb0a@...kaller.appspotmail.com
> >>>> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
> >>>>
> >>>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> >>>>
> >>>
> >>> I have not looked at the repro closely, but WireGuard has some workers
> >>> that might loop forever, cond_resched() might help a bit.
> >>
> >> I'm working on this right now. Having a bit of a difficult time
> >> getting it to reproduce locally...
> >>
> >> The reports show the stall happening always at:
> >>
> >> static struct sk_buff *
> >> sfq_dequeue(struct Qdisc *sch)
> >> {
> >>        struct sfq_sched_data *q = qdisc_priv(sch);
> >>        struct sk_buff *skb;
> >>        sfq_index a, next_a;
> >>        struct sfq_slot *slot;
> >>
> >>        /* No active slots */
> >>        if (q->tail == NULL)
> >>                return NULL;
> >>
> >> next_slot:
> >>        a = q->tail->next;
> >>        slot = &q->slots[a];
> >>
> >> Which is kind of interesting, because it's not like that should block
> >> or anything, unless there's some kasan faulting happening.
> >>
> >
> > I am not really sure WireGuard is involved, the repro does not rely on it anyway.
> >
>
> Yes, do not spend too much time on this.
>
> syzbot found its way into crazy qdisc settings over the last few days.
>
> ( I sent a patch yesterday for choke qdisc, it seems similar checks are needed in sfq )

Ah, whew, okay. I had just begun instrumenting sfq (the highly
technical term for "adding printks everywhere") to figure out what's
going on. Looks like you've got a handle on it, so I'll let you have
at it.
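
For anyone following along, here is a rough user-space sketch of how a
round-robin loop shaped like the next_slot: one quoted above could spin
without ever blocking. It is not the kernel's sfq code, and the thread
does not confirm the actual cause. It assumes, purely for illustration,
that a per-round credit (called "quantum" here) can be configured as
zero. Every name in it (toy_sched, toy_init, toy_dequeue,
quantum_is_sane, max_iters) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NSLOTS 3

/* Toy stand-ins for a scheduler's slots; NOT the kernel structures. */
struct toy_slot {
	int allot;	/* remaining credit for this slot */
	int next;	/* index of the next slot in the ring */
};

struct toy_sched {
	struct toy_slot slots[NSLOTS];
	int tail;	/* index of the last served slot, -1 if no active slots */
	int quantum;	/* credit added each time a slot is skipped */
};

/* Hypothetical configuration-time check: reject a non-positive quantum. */
static bool quantum_is_sane(int quantum)
{
	return quantum > 0;
}

/* Link the slots into a ring, each starting with zero credit. */
static void toy_init(struct toy_sched *q, int quantum)
{
	for (int i = 0; i < NSLOTS; i++) {
		q->slots[i].allot = 0;
		q->slots[i].next = (i + 1) % NSLOTS;
	}
	q->tail = NSLOTS - 1;
	q->quantum = quantum;
}

/*
 * Returns the index of the slot that was served, -1 if there are no
 * active slots, or -2 if max_iters passes went by without finding a
 * slot with positive credit. Without that cap (an assumption added
 * for this demo) the loop would simply never return.
 */
static int toy_dequeue(struct toy_sched *q, int max_iters)
{
	struct toy_slot *slot;
	int a, iters = 0;

	if (q->tail < 0)
		return -1;

next_slot:
	if (iters++ >= max_iters)
		return -2;

	a = q->slots[q->tail].next;
	assert(a >= 0 && a < NSLOTS);
	slot = &q->slots[a];

	if (slot->allot <= 0) {
		/* Out of credit: top it up and move on to the next slot. */
		q->tail = a;
		slot->allot += q->quantum;
		goto next_slot;
	}

	slot->allot--;	/* charge one unit for the packet we "send" */
	return a;
}
```

With a quantum of 0, topping up never makes any allot positive, so only
the iteration cap ends the loop. With a positive quantum, a slot is
served after at most one trip around the ring. A check along the lines
of quantum_is_sane() at configuration time would be one way to keep the
loop from ever seeing the bad value.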

On the brighter side, it seems like Dmitry's and my effort to get full
coverage of WireGuard has paid off in the sense that tons of packets
wind up being shoveled through it in one way or another, which is
good.
