linux-kernel - [PATCH v3] pstore: Revert pmsg

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230308204043.2061631-1-jstultz@google.com>
Date:   Wed,  8 Mar 2023 20:40:43 +0000
From:   John Stultz <jstultz@...gle.com>
To:     LKML <linux-kernel@...r.kernel.org>
Cc:     John Stultz <jstultz@...gle.com>, Wei Wang <wvw@...gle.com>,
        Midas Chien <midaschieh@...gle.com>,
        "Chunhui Li (李春辉)" 
        <chunhui.li@...iatek.com>, Steven Rostedt <rostedt@...dmis.org>,
        Kees Cook <keescook@...omium.org>,
        Anton Vorontsov <anton@...msg.org>,
        "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
        Tony Luck <tony.luck@...el.com>, kernel-team@...roid.com
Subject: [PATCH v3] pstore: Revert pmsg_lock back to a normal mutex

This reverts commit 76d62f24db07f22ccf9bc18ca793c27d4ebef721.

So while priority inversion on the pmsg_lock is an occasional
problem that an rt_mutex would help with, in uses where logging
is writing to pmsg heavily from multiple threads, the pmsg_lock
can be heavily contended.

After this change landed, it was reported that cases where the
mutex locking overhead was commonly adding on the order of 10s
of usecs delay had suddenly jumped to ~msec delay with rtmutex.

It seems the slight differences in the locks under this level
of contention causes the normal mutexes to utilize the spinning
optimizations, while the rtmutexes end up in the sleeping
slowpath (which allows additional threads to pile on trying
to take the lock).

In this case, it devolves to a worse case senerio where the lock
acquisition and scheduling overhead dominates, and each thread
is waiting on the order of ~ms to do ~us of work.

Obviously, having tons of threads all contending on a single
lock for logging is non-optimal, so the proper fix is probably
reworking pstore pmsg to have per-cpu buffers so we don't have
contention.

Additionally, Steven Rostedt has provided some furhter
optimizations for rtmutexes that improves the rtmutex spinning
path, but at least in my testing, I still see the test tripping
into the sleeping path on rtmutexes while utilizing the spinning
path with mutexes.

But in the short term, lets revert the change to the rt_mutex
and go back to normal mutexes to avoid a potentially major
performance regression. And we can work on optimizations to both
rtmutexes and finer-grained locking for pstore pmsg in the
future.

Cc: Wei Wang <wvw@...gle.com>
Cc: Midas Chien<midaschieh@...gle.com>
Cc: "Chunhui Li (李春辉)" <chunhui.li@...iatek.com>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Kees Cook <keescook@...omium.org>
Cc: Anton Vorontsov <anton@...msg.org>
Cc: "Guilherme G. Piccoli" <gpiccoli@...lia.com>
Cc: Tony Luck <tony.luck@...el.com>
Cc: kernel-team@...roid.com
Fixes: 76d62f24db07 ("pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion")
Reported-by: "Chunhui Li (李春辉)" <chunhui.li@...iatek.com>
Signed-off-by: John Stultz <jstultz@...gle.com>
---
v2:
* Fix quoting around Chunhui Li's email name (so they are actually
  cc'ed)
* Added tested by tag
v3:
* Reworded commit message per Steven's feedback and provided more
  context for reverting in the short term.
---
 fs/pstore/pmsg.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/pstore/pmsg.c b/fs/pstore/pmsg.c
index ab82e5f05346..b31c9c72d90b 100644
--- a/fs/pstore/pmsg.c
+++ b/fs/pstore/pmsg.c
@@ -7,10 +7,9 @@
 #include <linux/device.h>
 #include <linux/fs.h>
 #include <linux/uaccess.h>
-#include <linux/rtmutex.h>
 #include "internal.h"

-static DEFINE_RT_MUTEX(pmsg_lock);
+static DEFINE_MUTEX(pmsg_lock);

 static ssize_t write_pmsg(struct file *file, const char __user *buf,
 			  size_t count, loff_t *ppos)
@@ -29,9 +28,9 @@ static ssize_t write_pmsg(struct file *file, const char __user *buf,
 	if (!access_ok(buf, count))
 		return -EFAULT;

-	rt_mutex_lock(&pmsg_lock);
+	mutex_lock(&pmsg_lock);
 	ret = psinfo->write_user(&record, buf);
-	rt_mutex_unlock(&pmsg_lock);
+	mutex_unlock(&pmsg_lock);
 	return ret ? ret : count;
 }

-- 
2.40.0.rc1.284.g88254d51c5-goog