Message-ID: <CAHk-=wgZneAegyitz7f+JLjB6=28ewtvT7M4xy_a-wqsTjOX_w@mail.gmail.com>
Date:   Sun, 8 Sep 2019 10:18:42 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     syzbot <syzbot+d5870a903591faaca4ae@...kaller.appspotmail.com>
Cc:     akinobu.mita@...il.com, Andrew Morton <akpm@...ux-foundation.org>,
        David Miller <davem@...emloft.net>,
        Dmitry Vyukov <dvyukov@...gle.com>, jhs@...atatu.com,
        jiri@...nulli.us,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Michal Hocko <mhocko@...nel.org>,
        Netdev <netdev@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Cong Wang <xiyou.wangcong@...il.com>
Subject: Re: general protection fault in qdisc_put

On Sat, Sep 7, 2019 at 11:08 PM syzbot
<syzbot+d5870a903591faaca4ae@...kaller.appspotmail.com> wrote:
>
> The bug was bisected to:
>
> commit e41d58185f1444368873d4d7422f7664a68be61d
> Author: Dmitry Vyukov <dvyukov@...gle.com>
> Date:   Wed Jul 12 21:34:35 2017 +0000
>
>      fault-inject: support systematic fault injection

That commit does seem a bit questionable, but it's not the cause of this
problem (just the trigger).

I think the questionable part is that the new code doesn't honor the
task filtering, and will fail even for protected tasks. Dmitry?

> kasan: GPF could be caused by NULL-ptr deref or user memory access
> general protection fault: 0000 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 9699 Comm: syz-executor169 Not tainted 5.3.0-rc7+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:qdisc_put+0x25/0x90 net/sched/sch_generic.c:983

Yes, looks like 'qdisc' is NULL.

This is the

        qdisc_put(q->qdisc);

in sfb_destroy(), called from qdisc_create().

I think what is happening is this (in qdisc_create()):

        if (ops->init) {
                err = ops->init(sch, tca[TCA_OPTIONS], extack);
                if (err != 0)
                        goto err_out5;
        }
        ...
err_out5:
        /* ops->init() failed, we call ->destroy() like qdisc_create_dflt() */
        if (ops->destroy)
                ops->destroy(sch);

and "ops->init" is sfb_init(), which will not initialize q->qdisc if
tcf_block_get() fails.
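
To make that sequence concrete, here is a small userspace model of the
error path. It is only a sketch: toy_sched, toy_init(), toy_destroy() and
toy_create() are hypothetical stand-ins, not the kernel functions, and the
destroy hook records the NULL child instead of dereferencing it so that
the example stays runnable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not real kernel code. */
struct toy_child { int refcnt; };

struct toy_sched {
        struct toy_child *child;   /* plays the role of q->qdisc */
        int destroy_saw_null;      /* set when destroy ran with child == NULL */
};

static struct toy_child toy_noop = { .refcnt = 1 };

/* Models an init routine: an early step can fail before child is assigned. */
static int toy_init(struct toy_sched *s, int fail_early)
{
        if (fail_early)
                return -1;         /* early failure: child is still NULL */
        s->child = &toy_noop;
        return 0;
}

/* Models the destroy hook, which would hand child to a put routine. */
static void toy_destroy(struct toy_sched *s)
{
        if (s->child == NULL)
                s->destroy_saw_null = 1;   /* a put that dereferences would fault here */
}

/* Models the create path: destroy is called even when init failed. */
static int toy_create(struct toy_sched *s, int fail_early)
{
        int err;

        assert(s != NULL);
        s->child = NULL;           /* assumption: the object starts out zeroed */
        s->destroy_saw_null = 0;
        err = toy_init(s, fail_early);
        if (err != 0)
                toy_destroy(s);
        return err;
}
```

Calling toy_create() with fail_early set leaves destroy_saw_null at 1,
which is the situation the trace above points at.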

I see three possible solutions:

 (a) move the

        q->qdisc = &noop_qdisc;

     up earlier in sfb_init(), so that q->qdisc is always initialized
     after sfb_init(), even on failure.

 (b) make qdisc_put(NULL) silently work as a no-op.

 (c) change all the semantics to not call ->destroy if ->init failed.
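
As a rough illustration of what (a) means, the userspace sketch below
assigns the placeholder child before any step that can fail. The names
(toy_sched, toy_noop, toy_init_early_assign) are hypothetical and only
model the ordering; this is not the actual sfb_init() code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not real kernel code. */
struct toy_child { int refcnt; };
struct toy_sched { struct toy_child *child; };

static struct toy_child toy_noop = { .refcnt = 1 };

/* The placeholder is assigned first, so child is valid even on failure. */
static int toy_init_early_assign(struct toy_sched *s, int fail_early)
{
        assert(s != NULL);
        s->child = &toy_noop;      /* done before anything that can fail */
        if (fail_early)
                return -1;         /* models a failing setup step */
        return 0;
}
```

With this ordering a later destroy call always sees a valid child, but
only for init routines that follow the same convention.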

Honestly, (a) seems very fragile - do all the other init routines do
this? And (c) sounds like a big change, and very fragile too.

So I'd suggest that qdisc_put() be made to just ignore a NULL pointer
(and maybe an error pointer too?).
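
A minimal userspace sketch of that idea follows. It is not the real
qdisc_put(): toy_put(), toy_err_ptr() and toy_is_err_or_null() are
hypothetical helpers, written in the style of the kernel's ERR_PTR() and
IS_ERR_OR_NULL() convention, where the top 4095 addresses encode errno
values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095   /* mirrors the kernel's MAX_ERRNO convention */

struct toy_child { int refcnt; };

/* Encode a negative errno in a pointer, in the style of ERR_PTR(). */
static void *toy_err_ptr(long err)
{
        return (void *)(intptr_t)err;
}

/* True for NULL and for pointers in the top TOY_MAX_ERRNO addresses. */
static int toy_is_err_or_null(const void *p)
{
        return p == NULL || (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Drop one reference; returns 1 if one was dropped, 0 if the call was ignored. */
static int toy_put(struct toy_child *c)
{
        if (toy_is_err_or_null(c))
                return 0;          /* NULL or error pointer: silently a no-op */
        assert(c->refcnt > 0);
        c->refcnt--;
        return 1;
}
```

The guard lives in one place, so callers on error paths do not each need
their own check.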

But I'll leave it to the maintainers to sort out the proper fix.
Maybe people prefer (a)?

                   Linus