lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231109212251.213873-7-johannes@sipsolutions.net>
Date:   Thu,  9 Nov 2023 22:22:52 +0100
From:   Johannes Berg <johannes@...solutions.net>
To:     linux-wireless@...r.kernel.org, linux-kernel@...r.kernel.org
Cc:     linux-fsdevel@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Nicolai Stange <nicstange@...il.com>,
        Ben Greear <greearb@...delatech.com>
Subject: [RFC PATCH 0/6] debugfs/wifi: locking fixes

Hi,

So ... this is a bit complex.

Ben found [1] that since the wireless locking rework [2] we have
a deadlock in the debugfs use in the wireless stack, since some
objects (netdevs, links, stations) can be removed while holding
the wiphy->mtx, where as the files (all netdev/link and agg_status
for stations) also acquire the wiphy->mtx. This of course leads
to deadlocks, since Nicolai's proxy_fops work [3] we wait for the
users of the file to finish before removing them, which they
cannot in this case:

thread 1		thread 2
lock(wiphy->mutex)
			read()/write()
			 -> lock(wiphy->mutex) // waits for mutex
debugfs_remove()
 -> wait_for_users() // cannot finish


Unfortunately, there's no good way to remove the debugfs files
*before* locking in wireless, since we may not even know which
object needs to get removed, etc. Also, some files may need to
be removed based on other actions, and we want/need to free the
objects.


I went back and forth on how to fix it, and Ben had some hacks
in the threads, but in the end I decided on the approach taken
in this patchset.

So I have
 * debugfs: fix automount d_fsdata usage

   This patch fixes a bug in the existing automount case in
   debugfs, if that dentry were ever removed, we'd try to
   kfree() the function pointer. I previously tried to fix
   this differently [4] but that doesn't work, so here I
   just allocate a debugfs fsdata for automount, there's a
   single instance of this in the tree ...

 * debugfs: annotate debugfs handlers vs. removal with lockdep

   This just makes the problem obvious, whether in wireless
   or elsewhere, by annotating it with lockdep so that we get
   complaints about the deadlock described above. I've checked
   that the deadlock in wireless actually gets reported.

 * debugfs: add API to allow debugfs operations cancellation

   This adds a bit of infrastructure in debugfs to allow for
   cancellation of read/write/... handlers when remove() is
   called for a file. The API is written more generically,
   so that it could also be used to e.g. cancel operations in
   hardware/firmware, for example, if debugfs does accesses
   to that.

 * wifi: cfg80211: add locked debugfs wrappers

   I went back and forth on this, but in the end this seemed
   the easiest approach. Using these new helpers from debugfs
   files that are removed under the wiphy lock is safe, at
   the expense of pushing the read/write functions into a new
   wiphy work, which is called with wiphy->mutex held. This
   then uses the debugfs API introduced in the previous patch
   to cancel operations that are pending while the file is
   removed.

 * wifi: mac80211: use wiphy locked debugfs helpers for agg_status
 * wifi: mac80211: use wiphy locked debugfs for sdata/link

   These convert the files that actually have the problem in
   mac80211 to use the new helpers.

Any comments would be appreciated :-)

[1] https://lore.kernel.org/r/56d0b043-0585-5380-5703-f25d9a42f39d@candelatech.com
[2] in particular commit 0ab6cba0696d ("wifi: mac80211: hold wiphy lock in netdev/link debugfs")
    but there's a lot more work that went into it
[3] commit e9117a5a4bf6 ("debugfs: implement per-file removal protection")
[4] https://lore.kernel.org/lkml/20231109160639.514a2568f1e7.I64fe5615568e87f9ae2d7fb2ac4e5fa96924cb50@changeid/

johannes


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ