lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Thu, 22 Sep 2022 07:37:16 -0700 From: Guenter Roeck <linux@...ck-us.net> To: Alexey Klimov <klimov.linux@...il.com>, wim@...ux-watchdog.org Cc: linux-watchdog@...r.kernel.org, gregkh@...uxfoundation.org, oneukum@...e.com, linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org, atishp@...osinc.com, atishp@...shpatra.org, yury.norov@...il.com, aklimov@...hat.com, atomlin@...hat.com, stern@...land.harvard.edu Subject: Re: [PATCH v6] watchdog: add driver for StreamLabs USB watchdog device On 9/16/22 20:15, Alexey Klimov wrote: > Hi Wim/Guenter, > > For me it seems that there could be a potential race condition. I have to rely > on watchdog_active(&streamlabs_wdt->wdt_dev) function which tests the WDOG_ACTIVE > bit in struct watchdog_device->status member. > The watchdog_dev changes the state of the device with ->start() or ->ping() and > ->stop() methods and updates the WDOG_ACTIVE accordingly. > In {pre,post}_reset methods here I have to change the state of the device from > running to stopped and back to running conditionally, however WDOG_ACTIVE bit > could be updated in between these callbacks execution or starting/stopping > the device can race. > For instance, I see the potential dangerous race like this: > > CPUX CPUY > > .. watchdog_stop() { > .. if (wdd->ops->stop) { > ... > err = wdd->ops->stop(wdd) > } > usb_streamlabs_wdt_pre_reset() { > if (watchdog_active()) > stop_command(); /* WDOG_ACTIVE bit is still set > ... here indicating that watchdog is > } started, but ->stop() has already > finished */ > ... > usb_streamlabs_wdt_post_reset() { > if (watchdog_active()) > start_command(); > } > ... /* WDOG_ACTIVE is updated here */ > clear_bit(WDOG_ACTIVE, &wdd->status); > } > > As a result, the watchdog subsystem "thinks" that watchdog is not active and should > not be pinged. However, the driver observed using watchdog_active() that watchdog > was active during {pre,post}_reset and restarted the device which will lead to > unexpected reset. It is very unlikely race to happen but consequence is fatal. > In other words, there are two independent paths leading to driver changing > the state of the watchdog device and one path relies on status that can be changed > by another path. > > Thinking about that I see the following approaches: > > 1. Introduce a variable in struct streamlabs_wdt that tracks the state of the > watchdog device itself and checking/updating the state of a device happens under > semaphore lock. > Obviously, this "internal" to the driver state variable should be used in > {pre,post}_reset. In case there will be other drivers (say, USB ones) they also > need to implement this. > > or > > 2. The updates to wdd->status should happen under wd_data->lock. > Currently, it is mutex-based. The acquiring and releasing the lock could be > exported for the drivers to use. The mutex lock probably should be switched > to a binary semaphore for that. > > In such case, in pre_reset() for example, I would need to do: > static int pre_reset() > { > lock_wdd(); > acquire_internal_driver_lock(); > > if (watchdog_active()) > stop_command(); > } > > static int post_reset() > { > > if (watchdog_active()) > start_command(); > > release_internal_driver_lock(); > unlock_wdd(); > } > > There should be an order that we have to acquire subsystem wdd lock first, then > internal driver lock. Otherwise there could be deadlocks. > > This could be done if you think it's more wiser move. > > or > > 3. The {pre,post}_reset callbacks should execute watchdog_dev.c subsystem functions > (not sure which functions exactly). Eventually, it will look similar to what is > described in the previous point with respect to locks order. > I meant something like this: > > static int pre_reset() > { > watchdog_dev_pre_reset_prepare(); > } > > static int post_reset() > { > watchdog_dev_post_reset_done(); > } > > In watchdog_dev.c: > void watchdog_dev_pre_reset_prepare() > { > mutex_lock(&wd_data->lock); <-- should be changed to semaphore too? > > watchdog_stop(wdd); <-- without updating WDOG_ACTIVE bit? > or there should be a way to indicate > to watchdog_dev_post_reset_done() if > watchdog should be started or not > } > > void watchdog_dev_post_reset_done() > { > if (watchdog_active()) > watchdog_start(wdd); > > mutex_unlock(&wd_data->lock); > } > > I didn't really thought about point 3 yet. For me personally the point 2 seems > the like right way to go but you have more experience with that. The exported > locks could be re-used by other drivers if needed in future. > In case of point 1 each USB driver should deal with {pre,post}_reset by themselves. > > Any thoughts? > Please go with 1). pre_reset/post_reset functionality is a first in the watchdog subsystem and the first to require locking outside the scope of a function or set of functions. I'd rather avoid having to deal with the potential consequences in the watchdog core. We can do that if/when it becomes more common and after we have a good understanding of the potential consequences. Thanks, Guenter > Thanks, > Alexey
Powered by blists - more mailing lists