Date:   Mon, 24 Feb 2020 20:25:36 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Borislav Petkov <bp@...en8.de>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
Cc:     tony.luck@...el.com, mingo@...hat.com, hpa@...or.com,
        x86@...nel.org, linux-edac@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Chris Wilson <chris@...is-wilson.co.uk>
Subject: Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU

Thomas Gleixner <tglx@...utronix.de> writes:
> Which is wrong as well. Trying to "fix" it in the work queue callback is
> papering over the root cause.
>
> Why is any work scheduled on an outgoing CPU after this CPU executed
> thermal_throttle_offline()?
>
> When thermal_throttle_offline() is invoked the cpu bound work queues are
> still functional and thermal_throttle_offline() cancels outstanding
> work.
>
> So no, please fix the root cause not the symptom.

And if you look at thermal_throttle_online() then you'll notice that it
is asymmetric vs. thermal_throttle_offline().

Also you want to do cancel_delayed_work_sync() and not just
cancel_delayed_work() because only the former guarantees that the work
is not enqueued anymore, while the latter does not take running or
self-requeueing work into account.

Something like the untested patch below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/cpu/mce/therm_throt.c
+++ b/arch/x86/kernel/cpu/mce/therm_throt.c
@@ -487,8 +487,13 @@ static int thermal_throttle_offline(unsi
 	struct thermal_state *state = &per_cpu(thermal_state, cpu);
 	struct device *dev = get_cpu_device(cpu);
+	u32 l;
 
-	cancel_delayed_work(&state->package_throttle.therm_work);
-	cancel_delayed_work(&state->core_throttle.therm_work);
+	/* Mask the thermal vector before draining possibly pending work */
+	l = apic_read(APIC_LVTTHMR);
+	apic_write(APIC_LVTTHMR, l | APIC_LVT_MASKED);
+
+	cancel_delayed_work_sync(&state->package_throttle.therm_work);
+	cancel_delayed_work_sync(&state->core_throttle.therm_work);
 
 	state->package_throttle.rate_control_active = false;
 	state->core_throttle.rate_control_active = false;
