linux-kernel - Re: [syzbot] general protection fault in __device

Message-ID: <CACT4Y+bBWrLRwiowaWk8o4+XAtCHxxJiEQfiSkgM3BDut9atAw@mail.gmail.com>
Date:   Sat, 4 Jun 2022 10:32:46 +0200
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Greg KH <gregkh@...uxfoundation.org>
Cc:     Alan Stern <stern@...land.harvard.edu>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        syzbot <syzbot+dd3c97de244683533381@...kaller.appspotmail.com>,
        hdanton@...a.com, lenb@...nel.org, linux-acpi@...r.kernel.org,
        linux-kernel@...r.kernel.org, rafael.j.wysocki@...el.com,
        rafael@...nel.org, rjw@...ysocki.net,
        syzkaller-bugs@...glegroups.com, linux-usb@...r.kernel.org
Subject: Re: [syzbot] general protection fault in __device_attach

On Fri, 3 Jun 2022 at 18:12, Greg KH <gregkh@...uxfoundation.org> wrote:
> > > > > > syzbot has bisected this issue to:
> > > > > >
> > > > > > commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> > > > > > Author: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
> > > > > > Date:   Fri Jun 18 13:41:27 2021 +0000
> > > > > >
> > > > > >     ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
> > > > >
> > > > > > Hmm... It's not obvious at all how this change can alter the behaviour so
> > > > > > drastically. device_add() is called from USB core with intf->dev.name == NULL
> > > > > > for some reason. A-ha, seems like the fault injector; the call, which looks like
> > > > >
> > > > >         dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
> > > > >                      dev->devpath, configuration, ifnum);
> > > > >
> > > > > missed the return code check.
> > > > >
> > > > > But I'm not familiar with that code at all, adding Linux USB ML and Alan.
> > > >
> > > > I can't see any connection between this bug and acpi/sysfs.c.  Is it a
> > > > bad bisection?
> > > >
> > > > It looks like you're right about dev_set_name() failing.  In fact, the
> > > > kernel appears to be littered with calls to that routine which do not
> > > > check the return code (the entire subtree below drivers/usb/ contains
> > > > only _one_ call that does check the return code!).  The function doesn't
> > > > have any __must_check annotation, and its kerneldoc doesn't mention the
> > > > return code or the possibility of a failure.
> > > >
> > > > Apparently the assumption is that if dev_set_name() fails then
> > > > device_add() later on will also fail, and the problem will be detected
> > > > then.
> > > >
> > > > So now what should happen when device_add() for an interface fails in
> > > > usb_set_configuration()?
> > >
> > > But how can that really fail on a real system?
> > >
> > > Is this just due to error-injection stuff?  If so, I'm really loath to
> > > rework the world for something that can never happen in real life.
> > >
> > > Or is this a real syzbot-found-with-reproducer issue?
> >
> > Aren't there quite a few reasons why device_add() might fail?  (Although
> > most of them probably are memory allocation errors...)
>
> I was thinking of the dev_set_name() issue further back in the call
> chain.
>
> > Basically, you have to make up your mind.  If a function can fail, you
> > should be prepared to handle the failure.  If it can't fail, there's no
> > point in even checking the return code.
>
> True, ok, we should unwind the mess.  I'll try to look at it after the
> merge window...
>
> But again, is this a "real and able to be triggered from userspace"
> problem, or just fault-injection-induced?

If this can only be triggered by fault injection, then it is something
to fix in the fault injection subsystem. Testing systems shouldn't be
reporting false positives.
Which allocations cannot fail in real life? Is it those <= PAGE_SIZE?
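
For reference, here is a rough userspace sketch of the pattern discussed
in the quoted text: a name-setting call that can fail under injected
allocation failure, and a caller that checks the return code before
going on to add the device. Everything prefixed with mock_ is a
hypothetical stand-in and not the kernel's actual code; mock_fail_alloc
only imitates what the fault injector does.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct device; only the name matters here. */
struct mock_device {
	char *name;
};

/* Set to nonzero to imitate an injected allocation failure. */
int mock_fail_alloc;

/* Stand-in for dev_set_name(): returns 0 on success, -ENOMEM on failure. */
int mock_dev_set_name(struct mock_device *dev, const char *fmt, ...)
{
	char buf[64];
	char *copy;
	va_list ap;

	if (mock_fail_alloc)
		return -ENOMEM;	/* name is left untouched (NULL on first use) */

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);

	copy = malloc(strlen(buf) + 1);
	if (!copy)
		return -ENOMEM;
	strcpy(copy, buf);

	free(dev->name);
	dev->name = copy;
	return 0;
}

/* Stand-in for device_add(): refuses a device that has no name. */
int mock_device_add(struct mock_device *dev)
{
	if (!dev->name)
		return -EINVAL;
	return 0;
}

/* Caller that checks the return code instead of relying on a later failure. */
int mock_register_checked(struct mock_device *dev, int busnum,
			  const char *devpath, int configuration, int ifnum)
{
	int ret;

	ret = mock_dev_set_name(dev, "%d-%s:%d.%d", busnum, devpath,
				configuration, ifnum);
	if (ret)
		return ret;	/* bail out before the device is ever added */

	return mock_device_add(dev);
}
```

With the check in place the caller sees -ENOMEM directly from the naming
step; without it, the failure would only surface later, when the add
step runs into a device that has no name.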