lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZcN8zA4WaLx7c3qy@tpad>
Date: Wed, 7 Feb 2024 09:51:24 -0300
From: Marcelo Tosatti <mtosatti@...hat.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org,
	Daniel Bristot de Oliveira <bristot@...nel.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Valentin Schneider <vschneid@...hat.com>,
	Frederic Weisbecker <frederic@...nel.org>,
	Leonardo Bras <leobras@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [patch 04/12] clockevent unbind: use smp_call_func_single_fail

On Wed, Feb 07, 2024 at 12:55:59PM +0100, Thomas Gleixner wrote:
> On Tue, Feb 06 2024 at 15:49, Marcelo Tosatti wrote:
> > Convert clockevents_unbind from smp_call_function_single
> > to smp_call_func_single_fail, which will fail in case
> > the target CPU is tagged as block interference CPU.
> 
> Seriously? This is a root only operation. So if root wants to interfere
> then so be it.

Hi Thomas!

OK, so the problem is the following: due to software complexity, one is
often not aware of all operations that might take place.

For example:

https://lore.kernel.org/lkml/Y6mXDUZkII5OnuE8@tpad/T/

[PATCH] hwmon: coretemp: avoid RDMSR interruptions to isolated CPUs

The coretemp driver uses rdmsr_on_cpu calls to read
MSR_IA32_PACKAGE_THERM_STATUS/MSR_IA32_THERM_STATUS registers,
which contain information about current core temperature.

For certain low latency applications, the RDMSR interruption exceeds    
the applications requirements.

So disable reading of crit_alarm and temp files via /sys, in case
CPU isolation is enabled.

Temperature information from the housekeeping cores should be
sufficient to infer die temperature.

Signed-off-by: Marcelo Tosatti <mtosatti@...hat.com>

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 9bee4d33fbdf..30a35f4130d5 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -27,6 +27,7 @@
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/cpu_device_id.h>
+#include <linux/sched/isolation.h>
 
 #define DRVNAME	"coretemp"
 
@@ -121,6 +122,10 @@ static ssize_t show_crit_alarm(struct device *dev,
 	struct platform_data *pdata = dev_get_drvdata(dev);
 	struct temp_data *tdata = pdata->core_data[attr->index];
 
+
+	if (!housekeeping_cpu(tdata->cpu, HK_TYPE_MISC))
+		return -EINVAL;
+
 	mutex_lock(&tdata->update_lock);
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 	mutex_unlock(&tdata->update_lock);
@@ -158,6 +163,8 @@ static ssize_t show_temp(struct device *dev,
 
 	/* Check whether the time interval has elapsed */
 	if (!tdata->valid || time_after(jiffies, tdata->last_updated + HZ)) {
+		if (!housekeeping_cpu(tdata->cpu, HK_TYPE_MISC))
+			return -EINVAL;
 		rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 		/*
 		 * Ignore the valid bit. In all observed cases the register


In this case, a userspace application (collecting telemetry data), was
responsible for reading the sysfs files.

Now think of all possible paths, from userspace, that lead to kernel
code that ends up in smp_call_function_* variants (or other functions
that cause IPIs to isolated CPUs).

The alternative, from blocking this in the kernel, would be to validate all 
userspace software involved in your application, to ensure it won't end
up in the kernel sending IPIs. Which is impractical, isnt it ?
(or rather, with such option in the kernel, it would be possible to run 
applications which have not been validated, since the kernel would fail
the operation that results in IPI to isolated CPU).

So the idea would be an additional "isolation mode", which when enabled, 
would disallow the IPIs. Its still possible for root user to disable
this mode, and retry the operation.

So lets say i want to read MSRs on a given CPU, as root.

You'd have to: 

1) readmsr on given CPU (returns -EPERM or whatever), since the
"block interference" mode is enabled for that CPU.

2) Disable that CPU in the block interference cpumask.

3) readmsr on the given CPU (success).

4) Re-enable CPU in block interference cpumask, if desired.


(BTW, better ideas than the cpumask are welcome, but it seems the
realibility aspect of something similar to this is necessary).

Thanks!


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ