lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMRc=MdtwLRHG67khVgLeH3JXjqqSr8Pgg74oZPPwZhnYp=yXA@mail.gmail.com>
Date: Mon, 12 Feb 2024 11:07:34 +0100
From: Bartosz Golaszewski <brgl@...ev.pl>
To: Linus Walleij <linus.walleij@...aro.org>, Kent Gibson <warthog618@...il.com>, 
	Alex Elder <elder@...aro.org>, Geert Uytterhoeven <geert+renesas@...der.be>, 
	"Paul E . McKenney" <paulmck@...nel.org>, Andy Shevchenko <andriy.shevchenko@...ux.intel.com>, 
	Wolfram Sang <wsa@...-dreams.de>
Cc: linux-gpio@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
Subject: Re: [PATCH v3 00/24] gpio: rework locking and object life-time control

On Thu, Feb 8, 2024 at 10:59 AM Bartosz Golaszewski <brgl@...ev.pl> wrote:
>
> From: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
>
> This is a big rework of locking in GPIOLIB. The current serialization is
> pretty much useless. There is one big spinlock (gpio_lock) that "protects"
> both the GPIO device list, GPIO descriptor access and who knows what else.
>
> I'm putting "protects" in quotes as in several places the lock is
> taken, released whenever a sleeping function is called and re-taken
> without regards for the "protected" state that may have changed.
>
> First a little background on what we're dealing with in GPIOLIB. We have
> consumer API functions that can be called from any context explicitly
> (get/set value, set direction) as well as many others which will get
> called in atomic context implicitly (e.g. set config called in certain
> situations from gpiod_direction_output()).
>
> On the other side: we have GPIO provider drivers whose callbacks may or
> may not sleep depending on the underlying protocol.
>
> This makes any attempts at serialization quite complex. We typically
> cannot use sleeping locks - we may be called from atomic - but we also
> often cannot use spinlocks - provider callbacks may sleep. Moreover: we
> have close ties with the interrupt and pinctrl subsystems, often either
> calling into them or getting called from them. They use their own locking
> schemes which are at odds with ours (pinctrl uses mutexes, the interrupt
> subsystem can call GPIO helpers with spinlock taken).
>
> There is also another significant issue: the GPIO device object contains
> a pointer to gpio_chip which is the implementation of the GPIO provider.
> This object can be removed at any point - as GPIOLIB officially supports
> hotplugging with all the dynamic expanders that we provide drivers for -
> and leave the GPIO API callbacks with a suddenly NULL pointer. This is
> a problem that allowed user-space processes to easily crash the kernel
> until we patched it with a read-write semaphore in the user-space facing
> code (but the problem still exists for in-kernel users). This was
> recognized before as evidenced by the implementation of validate_desc()
> but without proper serialization, simple checking for a NULL pointer is
> pointless and we do need a generic solution for that issue as well.
>
> If we want to get it right - the more lockless we go, the better. This is
> why SRCU seems to be the right candidate for the mechanism to use. In fact
> it's the only mechanism we can use our read-only critical sections to be
> called from atomic and protecc contexts as well as call driver callbacks
> that may sleep (for the latter case).
>
> We're going to use it in three places: to protect the global list of GPIO
> devices, to ensure consistency when dereferencing the chip pointer in GPIO
> device struct and finally to ensure that users can access GPIO descriptors
> and always see a consistent state.
>
> We do NOT serialize all API callbacks. This means that provider callbacks
> may be called simultaneously and GPIO drivers need to provide their own
> locking if needed. This is on purpose. First: we only support exclusive
> GPIO usage* so there's no risk of two drivers getting in each other's way
> over the same GPIO. Second: with this series, we ensure enough consistency
> to limit the chance of drivers or user-space users crashing the kernel.
> With additional improvements in handling the flags field in GPIO
> descriptors there's very little to gain, while bitbanging drivers may care
> about the increased performance of going lockless.
>
> This series brings in one somewhat significant functional change for
> in-kernel users, namely: GPIO API calls, for which the underlying GPIO
> chip is gone, will no longer return 0 and emit a log message but instead
> will return -ENODEV.
>
> I know this is a lot of code to go through but the more eyes we get on it
> the better.
>
> Thanks,
> Bartosz
>
> * - This is not technically true. We do provide the
> GPIOD_FLAGS_BIT_NONEXCLUSIVE flag. However this is just another piece of
> technical debt. This is a hack provided for a single use-case in the
> regulator framework which got out of control and is now used in many
> places that should have never touched it. It's utterly broken and doesn't
> even provide any contract as to what a "shared GPIO" is. I would argue
> that it's the next thing we should address by providing "reference counted
> GPIO enable", not just a flag allowing to request the same GPIO twice
> and then allow two drivers to fight over who toggles it as is the case
> now. For now, let's just treat users of GPIOD_FLAGS_BIT_NONEXCLUSIVE like
> they're consciously and deliberately choosing to risk undefined behavior.
>
> v2 -> v3:
> - fix SRCU cleanup in error path
> - add a comment explaining the use of gpio_device_find() in sysfs code
> - don't needlessly dereference gdev->chip in sysfs code
> - don't return 1 (INPUT) for NULL descriptors from gpiod_get_direction(),
>   rather: return -EINVAL
> - fix debugfs code: take the SRCU read lock in .start() and release it in
>   .stop() callbacks for seq file instead of taking it locally which
>   doesn't protect the entire seq printout
> - move the removal of the GPIO device from the list before setting the
>   chip pointer to NULL
>
> v1 -> v2:
> - fix jumping over variable initialization in sysfs code
> - fix RCU-related sparse warnings
> - fix a smatch complaint about uninitialized variables (even though it's
>   a false positive coming from the fact that scoped_guard() is implemented
>   as a for loop
> - fix a potential NULL-pointer dereference in debugfs callbacks
> - improve commit messages
>
> Bartosz Golaszewski (24):
>   gpio: protect the list of GPIO devices with SRCU
>   gpio: of: assign and read the hog pointer atomically
>   gpio: remove unused logging helpers
>   gpio: provide and use gpiod_get_label()
>   gpio: don't set label from irq helpers
>   gpio: add SRCU infrastructure to struct gpio_desc
>   gpio: protect the descriptor label with SRCU
>   gpio: sysfs: use gpio_device_find() to iterate over existing devices
>   gpio: remove gpio_lock
>   gpio: reinforce desc->flags handling
>   gpio: remove unneeded code from gpio_device_get_desc()
>   gpio: sysfs: extend the critical section for unregistering sysfs
>     devices
>   gpio: sysfs: pass the GPIO device - not chip - to sysfs callbacks
>   gpio: cdev: replace gpiochip_get_desc() with gpio_device_get_desc()
>   gpio: cdev: don't access gdev->chip if it's not needed
>   gpio: sysfs: don't access gdev->chip if it's not needed
>   gpio: don't dereference gdev->chip in gpiochip_setup_dev()
>   gpio: reduce the functionality of validate_desc()
>   gpio: remove unnecessary checks from gpiod_to_chip()
>   gpio: add the can_sleep flag to struct gpio_device
>   gpio: add SRCU infrastructure to struct gpio_device
>   gpio: protect the pointer to gpio_chip in gpio_device with SRCU
>   gpio: remove the RW semaphore from the GPIO device
>   gpio: mark unsafe gpio_chip manipulators as deprecated
>
>  drivers/gpio/gpiolib-cdev.c  |  92 ++--
>  drivers/gpio/gpiolib-of.c    |   4 +-
>  drivers/gpio/gpiolib-sysfs.c | 151 ++++---
>  drivers/gpio/gpiolib.c       | 791 +++++++++++++++++++----------------
>  drivers/gpio/gpiolib.h       |  86 ++--
>  5 files changed, 635 insertions(+), 489 deletions(-)
>
> --
> 2.40.1
>

Applied the series. I'll let it spend some time in next and let's fix
any remaining issues in tree.

Bart

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ