lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y4FgR3EmYNVKItO2@sol>
Date:   Sat, 26 Nov 2022 08:39:35 +0800
From:   Kent Gibson <warthog618@...il.com>
To:     Bartosz Golaszewski <brgl@...ev.pl>
Cc:     Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Linus Walleij <linus.walleij@...aro.org>,
        linux-gpio@...r.kernel.org, linux-kernel@...r.kernel.org,
        Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
Subject: Re: [PATCH] gpiolib: cdev: fix NULL-pointer dereferences

On Fri, Nov 25, 2022 at 10:03:06PM +0100, Bartosz Golaszewski wrote:
> On Fri, Nov 25, 2022 at 6:56 PM Andy Shevchenko
> <andriy.shevchenko@...ux.intel.com> wrote:
> >
> > On Fri, Nov 25, 2022 at 05:48:02PM +0100, Bartosz Golaszewski wrote:
> > > On Fri, Nov 25, 2022 at 5:24 PM Kent Gibson <warthog618@...il.com> wrote:
> >
> > ...
> >
> > > Then at the subsystem level, the GPIO device struct would need a lock
> > > that would be taken by every user-space operation AND the code
> > > unregistering the device so that we don't do what you described (i.e.
> > > if there's a thread doing a read(), then let's wait until it returns
> > > before we drop the device).
> >
> > It's called a reference counting, basically you need to get device and then
> > put when it makes sense.
> >
> 
> Andy: I am aware of struct device reference counting but this isn't
> it. You can count references all you want, but when I disconnect my
> CP2112, the USB bus calls gpiochip_remove(), struct gpio_chip * inside
> struct gpio_device is set to NULL and while the underlying struct
> device itself is still alive, the GPIO chip is no longer usable.
> 
> Reference counting won't help because the device is no longer there,
> so this behavior is correct but there's an issue with user-space still
> being able to hold certain resources and we need to make sure that
> when it tries to use them, we return an error instead of crashing.
> 
> I think that a good solution is to make sure, we cannot set gdev->gc
> to NULL as long as there are user-space operations in progress. After
> all, it's better to try to send a USB request to an unplugged device
> than to dereference a NULL pointer. To that end, we could have a
> user-space lock that would also be taken by gpiochip_remove().
> 

This is basically the answer I was hoping for - that there is some
barrier in place to prevent chip removal while an ioctl is active.
Then the check makes total sense - it is ensuring that the chip wasn't
removed before the ioctl began and the barrier went up.

On the other end, the caller of gpiochip_remove() needs to be prepared
to gracefully fail calls on the chip until gpiochip_remove() returns.
You would hope that is already the case...

> But this is still a per-subsystem solution. Most other subsystems
> suffer from the same issue.
> 

Does that prevent us addressing the problem in gpio until a more general
solution comes along?

Anyway, I'm basically ok with your patch as a first step, as it greatly
reduces the chances of triggering the fault, but it is only a band-aid
over a larger issue and a more complete solution would be preferable.
Without that, highlight in the checkin comment that it is not a complete
fix.

Cheers,
Kent.

Powered by blists - more mailing lists