Message-ID: <158239041049.15220.2836895127344585201@skylake-alporthouse-com>
Date: Sat, 22 Feb 2020 16:53:30 +0000
From: Chris Wilson <chris@...is-wilson.co.uk>
To: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
bp@...en8.de, hpa@...or.com, mingo@...hat.com, tglx@...utronix.de,
tony.luck@...el.com
Cc: x86@...nel.org, linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
Subject: Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is
called on behalf of an offline CPU
Quoting Srinivas Pandruvada (2020-02-22 16:24:32)
> During cpu-hotplug test with CONFIG_PREEMPTION and CONFIG_DEBUG_PREEMPT
> enabled, Chris reported error:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: kworker/1:0/17
> caller is throttle_active_work+0x12/0x280
>
> Here throttle_active_work() is a work queue callback scheduled with
> schedule_delayed_work_on(). This will not cause this error for the use
> of smp_processor_id() under normal conditions as there is a check for
> "current->nr_cpus_allowed == 1".
> But when the target CPU is offline the workqueue becomes unbound.
> Then the work queue callback can be scheduled on another CPU and the
> error is printed for the use of smp_processor_id() in preemptible context.
>
> When the workqueue is not getting called on the target CPU, simply return.
> This is done by adding a cpu field in the _thermal_state struct and match
> the current CPU id.
>
> Once workqueue is scheduled, prevent CPU offline. In this way, the log
> bits are checked and cleared on the correct CPU. Also use get_cpu() to
> get current CPU id and prevent preemption before we finish processing.
>
> Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
> Reported-by: Chris Wilson <chris@...is-wilson.co.uk>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@...el.com>
I've pushed the patch to our CI, but it's not a frequent occurrence, so
it may be some time before I can state a t-b with any confidence.
-Chris
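
For readers following the thread, the approach described in the quoted commit message (record the target CPU in the per-CPU state, then have the work callback return early when it runs elsewhere, with preemption disabled via get_cpu() while it checks) might look roughly like the sketch below. This is a userspace illustration only, not the actual patch: get_cpu() and put_cpu() here are stand-ins for the kernel helpers of the same name, and fake_current_cpu, preempt_depth, struct thermal_state_sketch, log_pending and throttle_active_work_sketch() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins (assumptions) for the kernel's get_cpu()/put_cpu().
 * In the kernel, get_cpu() disables preemption and returns the current
 * CPU id; put_cpu() re-enables preemption. */
static int fake_current_cpu;   /* hypothetical: CPU the "worker" runs on */
static int preempt_depth;      /* hypothetical: models the preempt count */

static int get_cpu(void)
{
    preempt_depth++;
    return fake_current_cpu;
}

static void put_cpu(void)
{
    assert(preempt_depth > 0);
    preempt_depth--;
}

/* Hypothetical, trimmed-down analogue of the per-CPU thermal state. */
struct thermal_state_sketch {
    int cpu;           /* CPU this work item was queued for */
    bool log_pending;  /* stands in for the log bits to check and clear */
};

/* Returns true if the state was processed on its own CPU, false if the
 * callback ran on a different CPU (for example because the target CPU
 * went offline and the work became unbound) and simply returned. */
static bool throttle_active_work_sketch(struct thermal_state_sketch *state)
{
    int this_cpu = get_cpu();  /* no preemption until put_cpu() */
    bool handled = false;

    if (this_cpu == state->cpu) {
        state->log_pending = false;  /* clear on the correct CPU only */
        handled = true;
    }

    put_cpu();
    return handled;
}
```

The early return leaves the state untouched when the callback runs on the wrong CPU, so the log bits are only ever cleared on the CPU they belong to.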