lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Mon, 20 Sep 2021 13:52:21 -0700 From: Dan Williams <dan.j.williams@...el.com> To: Luis Chamberlain <mcgrof@...nel.org> Cc: Tejun Heo <tj@...nel.org>, Greg KH <gregkh@...uxfoundation.org>, Andrew Morton <akpm@...ux-foundation.org>, Minchan Kim <minchan@...nel.org>, jeyu@...nel.org, shuah <shuah@...nel.org>, Randy Dunlap <rdunlap@...radead.org>, "Rafael J. Wysocki" <rafael@...nel.org>, Masahiro Yamada <masahiroy@...nel.org>, Nick Desaulniers <ndesaulniers@...gle.com>, yzaikin@...gle.com, Nathan Chancellor <nathan@...nel.org>, ojeda@...nel.org, Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>, vitor@...saru.org, elver@...gle.com, Jarkko Sakkinen <jarkko@...nel.org>, Alexander Potapenko <glider@...gle.com>, rf@...nsource.cirrus.com, Stephen Hemminger <stephen@...workplumber.org>, David Laight <David.Laight@...lab.com>, bvanassche@....org, jolsa@...nel.org, Andy Shevchenko <andriy.shevchenko@...ux.intel.com>, trishalfonso@...gle.com, andreyknvl@...il.com, Jiri Kosina <jikos@...nel.org>, mbenes@...e.com, Nitin Gupta <ngupta@...are.org>, Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>, Reinette Chatre <reinette.chatre@...el.com>, Fenghua Yu <fenghua.yu@...el.com>, Borislav Petkov <bp@...en8.de>, X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>, lizefan.x@...edance.com, Johannes Weiner <hannes@...xchg.org>, Daniel Vetter <daniel.vetter@...ll.ch>, Bjorn Helgaas <bhelgaas@...gle.com>, Krzysztof WilczyĆski <kw@...ux.com>, senozhatsky@...omium.org, Christoph Hellwig <hch@....de>, Joe Perches <joe@...ches.com>, hkallweit1@...il.com, Jens Axboe <axboe@...nel.dk>, Josh Poimboeuf <jpoimboe@...hat.com>, Thomas Gleixner <tglx@...utronix.de>, Kees Cook <keescook@...omium.org>, Steven Rostedt <rostedt@...dmis.org>, Peter Zijlstra <peterz@...radead.org>, linux-spdx@...r.kernel.org, Linux Doc Mailing List <linux-doc@...r.kernel.org>, linux-block@...r.kernel.org, linux-fsdevel <linux-fsdevel@...r.kernel.org>, linux-kselftest@...r.kernel.org, cgroups@...r.kernel.org, Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, copyleft-next@...ts.fedorahosted.org Subject: Re: [PATCH v7 09/12] sysfs: fix deadlock race with module removal On Fri, Sep 17, 2021 at 10:05 PM Luis Chamberlain <mcgrof@...nel.org> wrote: > > When sysfs attributes use a lock also used on module removal we can > race to deadlock. This happens when for instance a sysfs file on > a driver is used, then at the same time we have module removal call > trigger. The module removal call code holds a lock, and then the sysfs > file entry waits for the same lock. While holding the lock the module > removal tries to remove the sysfs entries, but these cannot be removed > yet as one is waiting for a lock. This won't complete as the lock is > already held. Likewise module removal cannot complete, and so we deadlock. > > This can now be easily reproducible with our sysfs selftest as follows: > > ./tools/testing/selftests/sysfs/sysfs.sh -t 0027 > > To fix this we extend the struct kernfs_node with a module reference and > use the try_module_get() after kernfs_get_active() is called which > protects integrity and the existence of the kernfs node during the > operation. > > So long as the kernfs node is protected with kernfs_get_active() we know > we can rely on its contents. And, as now just documented in the previous > patch, we also now know that once kernfs_get_active() is called the module > is also guarded to exist and cannot be removed. > > If try_module_get() fails we fail the operation on the kernfs node. > > We use a try method as a full lock means we'd then make our sysfs > attributes busy us out from possible module removal, and so userspace > could force denying module removal, a silly form of "DOS" against module > removal. A try lock on the module removal ensures we give priority to > module removal and interacting with sysfs attributes only comes second. > Using a full lock could mean for instance that if you don't stop poking > at sysfs files you cannot remove a module. > > Races between removal of sysfs files and the module are not possible > given sysfs files are created by the same module, and when a sysfs file > is being used kernfs prevents removal of the sysfs file. So if module > removal is actually happening the removal would have to wait until > the sysfs file operation is complete. > > This deadlock was first reported with the zram driver, however the live > patching folks have acknowledged they have observed this as well with > live patching, when a live patch is removed. I was then able to > reproduce easily by creating a dedicated selftests. > > A sketch of how this can happen follows: > > CPU A CPU B > whatever_store() > module_unload > mutex_lock(foo) > mutex_lock(foo) > del_gendisk(zram->disk); > device_del() > device_remove_groups() This flow seems possible to trigger with: echo $dev > /sys/bus/$bus/drivers/$driver/unbind I am missing why module pinning is part of the solution when it's the device_del() path that is racing? Module removal is just a more coarse grained way to trigger unbind => device_del(). Isn't the above a bug in the driver, not missing synchronization in kernfs? Forgive me if the unbind question was asked and answered elsewhere, this is my first time taking a look at this series.
Powered by blists - more mailing lists