Message-ID: <CAFqZXNt0Xp1j7+hTrV9XZ936Yz+H8Le0pqazhLr3drO0tEzB2w@mail.gmail.com>
Date:   Mon, 7 Feb 2022 16:15:27 +0100
From:   Ondrej Mosnacek <omosnace@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Waiman Long <longman@...hat.com>
Cc:     Linux kernel mailing list <linux-kernel@...r.kernel.org>,
        SElinux list <selinux@...r.kernel.org>
Subject: Semantics vs. usage of mutex_is_locked()

Hello,

(This is addressed mainly to the kernel/locking/ maintainers.)

In security/selinux/ima.c, we have two functions for which we want to
assert the expected locking status of a mutex. In the first function
we expect the caller to obtain the lock, so we have
`WARN_ON(!mutex_is_locked(&state->policy_mutex));` there. The second
one, on the contrary, takes the lock on its own, so there is an
inverse assert (that the caller hasn't already taken the lock) -
`WARN_ON(mutex_is_locked(&state->policy_mutex));`.

Recently, I got a report that the second WARN_ON() got triggered,
while there was no function in the call chain that could have taken
the lock. Looking into it, I realized that mutex_is_locked() actually
doesn't check what we assumed ("Are we holding the lock?"), but
instead answers the question "Is any task holding the lock?". So in
theory the second WARN_ON() can fire at random in otherwise correct
code, simply because some other task happens to be
holding the mutex. Similarly, the first assert might not catch all
cases where taking the mutex was forgotten, because another task may
be holding it, making the assert pass.
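
To make the two failure modes concrete, here is a minimal single-threaded
userspace model. It is only a sketch and not kernel code: `struct task`,
`struct model_mutex` and all `model_*` names are hypothetical, and "tasks"
are modelled as plain objects rather than real threads. Task B holds the
mutex while task A, which holds nothing, evaluates both questions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a "task" is just a distinct object. */
struct task { int id; };

struct model_mutex {
    const struct task *owner;   /* NULL while unlocked */
};

static void model_lock(struct model_mutex *m, const struct task *t)
{
    assert(m->owner == NULL);   /* the model never blocks */
    m->owner = t;
}

/* The question mutex_is_locked() answers: is ANY task holding it? */
static bool model_is_locked(const struct model_mutex *m)
{
    return m->owner != NULL;
}

/* The question the asserts meant to ask: is THIS task holding it? */
static bool model_is_held_by(const struct model_mutex *m,
                             const struct task *t)
{
    return m->owner == t;
}

struct outcome {
    bool locked_by_anyone;  /* what task A sees via the "any task" check */
    bool held_by_a;         /* what task A sees via the "this task" check */
};

/* Task B holds the mutex; task A holds nothing and runs both checks. */
static struct outcome check_from_task_a(void)
{
    struct task a = { 1 }, b = { 2 };
    struct model_mutex m = { NULL };
    struct outcome o;

    model_lock(&m, &b);
    o.locked_by_anyone = model_is_locked(&m);
    o.held_by_a = model_is_held_by(&m, &a);
    return o;
}
```

In this model `locked_by_anyone` is true while `held_by_a` is false. A
`WARN_ON(is_locked)` in task A would therefore fire although A did nothing
wrong, and a `WARN_ON(!is_locked)` would stay silent although A never took
the lock.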

Grepping the whole tree for mutex_is_locked finds about 300 uses, the
vast majority of which are variations of the
warn-if-mutex-not-locked-by-us pattern. Then there are a handful of
cases where the usage of mutex_is_locked() seems correct and a few
cases of the inverse warn-if-mutex-already-locked-by-us pattern.

Introducing a new helper with the "is the mutex locked by the current
task?" semantics seems fairly straightforward. However, fixing all the
mutex_is_locked() misuses would be a rather big and noisy patch(set).
That said, would it be okay if I send patches that introduce the new
helper, update the documentation, and fix only those misuses that can
lead to wrong behavior when the code is correct (e.g. can yield a
false positive WARNING/BUG)? That should be a reasonably small set
of changes, yet should take care of the most important issues. If
anyone cares enough about the rest, they can always send further
patches.

Also, any opinions on the name of the new helper? Perhaps
mutex_is_held()? Or mutex_is_locked_by_current()?
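
As a rough illustration of the intended semantics, here is a userspace
sketch built on pthreads and C11 atomics. It is an assumption-laden analogue,
not a proposal for the kernel implementation: `struct owned_mutex`, the
`owned_mutex_*` functions and `current_tag` are hypothetical names, and the
address of a thread-local variable stands in for the kernel's notion of the
current task.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Each thread gets its own copy; its address identifies the thread. */
static _Thread_local char current_tag;

/* Hypothetical wrapper that records which thread holds the lock. */
struct owned_mutex {
    pthread_mutex_t lock;
    _Atomic(void *) owner;      /* NULL while unlocked */
};

static void owned_mutex_init(struct owned_mutex *m)
{
    pthread_mutex_init(&m->lock, NULL);
    atomic_init(&m->owner, NULL);
}

static void owned_mutex_lock(struct owned_mutex *m)
{
    pthread_mutex_lock(&m->lock);
    atomic_store(&m->owner, (void *)&current_tag);
}

static void owned_mutex_unlock(struct owned_mutex *m)
{
    atomic_store(&m->owner, NULL);
    pthread_mutex_unlock(&m->lock);
}

/* "Is the mutex locked by the calling thread?" */
static bool owned_mutex_is_held(struct owned_mutex *m)
{
    return atomic_load(&m->owner) == (void *)&current_tag;
}

/* Runs in a second thread while the first thread holds the lock. */
static void *other_task(void *arg)
{
    struct owned_mutex *m = arg;
    return owned_mutex_is_held(m) ? (void *)m : NULL;
}

/* True if only the locking thread ever sees the mutex as held. */
static bool helper_demo(void)
{
    struct owned_mutex m;
    pthread_t t;
    void *other_result = NULL;
    bool self_while_locked, self_after_unlock;
    int rc;

    owned_mutex_init(&m);
    owned_mutex_lock(&m);
    self_while_locked = owned_mutex_is_held(&m);

    rc = pthread_create(&t, NULL, other_task, &m);
    assert(rc == 0);
    (void)rc;
    pthread_join(t, &other_result);

    owned_mutex_unlock(&m);
    self_after_unlock = owned_mutex_is_held(&m);
    pthread_mutex_destroy(&m.lock);

    return self_while_locked && other_result == NULL && !self_after_unlock;
}
```

The second thread sees the mutex as not held even though it is locked, which
is the behavior both asserts in security/selinux/ima.c were written to
expect.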

Thanks,

--
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.
