Message-ID: <20250507014728.6094-1-changwoo@igalia.com>
Date: Wed,  7 May 2025 10:47:28 +0900
From: Changwoo Min <changwoo@...lia.com>
To: lukasz.luba@....com,
	rafael@...nel.org,
	len.brown@...el.com,
	pavel@...nel.org
Cc: christian.loehle@....com,
	tj@...nel.org,
	kernel-dev@...lia.com,
	linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Changwoo Min <changwoo@...lia.com>
Subject: [PATCH v2] PM: EM: Add inotify support when the energy model is updated.

sched_ext schedulers [1] currently access the energy model through debugfs
to make energy-aware scheduling decisions [2]. The userspace part of a
sched_ext scheduler feeds the necessary (post-processed) energy-model
information to the BPF part of the scheduler.

However, the current debugfs support of the energy model has a limitation.
When the energy model is updated (em_dev_update_perf_domain()), the
userspace part has no way to learn of the change other than polling the
debugfs files.

Therefore, generate an inotify event (IN_MODIFY) when the energy model is
updated. The event is reported for the directory of the updated performance
domain (e.g., /sys/kernel/debug/energy_model/cpu0) and for its parent
directory (e.g., /sys/kernel/debug/energy_model). A sched_ext scheduler (or
any other userspace application) can then monitor energy model changes
from userspace using the regular inotify interface.
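
As an illustration of the userspace side, here is a minimal sketch of a
watcher built on the regular inotify API. It is not part of this patch:
the helper names em_watch_open() and em_watch_wait() are hypothetical, and
the caller is assumed to pass a directory it is allowed to read (debugfs is
normally accessible to root only). EINTR handling is left out for brevity.

```c
#include <poll.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical helper: start watching one directory for IN_MODIFY.
 * Returns an inotify file descriptor, or -1 on error.
 */
static int em_watch_open(const char *dir)
{
	int fd = inotify_init1(IN_CLOEXEC);

	if (fd < 0)
		return -1;

	if (inotify_add_watch(fd, dir, IN_MODIFY) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: wait up to timeout_ms for events on fd.
 * Returns the number of IN_MODIFY events read, 0 on timeout and
 * -1 on error.
 */
static int em_watch_wait(int fd, int timeout_ms)
{
	/* Buffer aligned for struct inotify_event, as in inotify(7). */
	char buf[4096]
		__attribute__((aligned(__alignof__(struct inotify_event))));
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	ssize_t len;
	int count = 0;
	int n;

	n = poll(&pfd, 1, timeout_ms);
	if (n <= 0)
		return n;

	len = read(fd, buf, sizeof(buf));
	if (len < 0)
		return -1;

	/* One read() may return several variable-length events. */
	for (char *p = buf; p < buf + len; ) {
		const struct inotify_event *ev =
			(const struct inotify_event *)p;

		if (ev->mask & IN_MODIFY)
			count++;
		p += sizeof(*ev) + ev->len;
	}
	return count;
}
```

A scheduler's userspace part could call em_watch_open() on
/sys/kernel/debug/energy_model and re-read the energy model whenever
em_watch_wait() returns a positive count, instead of polling the files.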

Note that accessing the energy model information from userspace has many
advantages over the alternatives, in particular over adding new BPF kfuncs.
Userspace has much more freedom than BPF code: it can, for example, use
external libraries and floating-point arithmetic, which may be infeasible
(if not impossible) in BPF/kernel code.

[1] https://lwn.net/Articles/922405/
[2] https://github.com/sched-ext/scx/pull/1624

Signed-off-by: Changwoo Min <changwoo@...lia.com>
---

ChangeLog v1 -> v2:
  - Change em_debug_update() to notify only the directory of the updated
    performance domain (and its parent directory).
  - Move the em_debug_update() call outside of the mutex lock.
  - Update the commit message to clarify the motivation and what is
    notified on an update.

 kernel/power/energy_model.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
index d9b7e2b38c7a..590e90e8cb66 100644
--- a/kernel/power/energy_model.c
+++ b/kernel/power/energy_model.c
@@ -14,6 +14,7 @@
 #include <linux/cpumask.h>
 #include <linux/debugfs.h>
 #include <linux/energy_model.h>
+#include <linux/fsnotify.h>
 #include <linux/sched/topology.h>
 #include <linux/slab.h>
 
@@ -156,9 +157,22 @@ static int __init em_debug_init(void)
 	return 0;
 }
 fs_initcall(em_debug_init);
+
+static void em_debug_update(struct device *dev)
+{
+	struct dentry *d;
+
+	d = debugfs_lookup(dev_name(dev), rootdir);
+	if (!d)
+		return;
+
+	fsnotify_dentry(d, FS_MODIFY);
+	dput(d); /* drop the reference taken by debugfs_lookup() */
+}
 #else /* CONFIG_DEBUG_FS */
 static void em_debug_create_pd(struct device *dev) {}
 static void em_debug_remove_pd(struct device *dev) {}
+static void em_debug_update(struct device *dev) {}
 #endif
 
 static void em_release_table_kref(struct kref *kref)
@@ -324,6 +338,8 @@ int em_dev_update_perf_domain(struct device *dev,
 	em_table_free(old_table);
 
 	mutex_unlock(&em_pd_mutex);
+
+	em_debug_update(dev);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(em_dev_update_perf_domain);
-- 
2.49.0