lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 22 Feb 2020 16:53:30 +0000
From:   Chris Wilson <chris@...is-wilson.co.uk>
To:     Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        bp@...en8.de, hpa@...or.com, mingo@...hat.com, tglx@...utronix.de,
        tony.luck@...el.com
Cc:     x86@...nel.org, linux-edac@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
Subject: Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is
 called on behalf of an offline CPU

Quoting Srinivas Pandruvada (2020-02-22 16:24:32)
> During cpu-hotplug test with CONFIG_PREEMPTION and CONFIG_DEBUG_PREEMPT
> enabled, Chris reported error:
> 
> BUG: using smp_processor_id() in preemptible [00000000] code: kworker/1:0/17
> caller is throttle_active_work+0x12/0x280
> 
> Here throttle_active_work() is a work queue callback scheduled with
> schedule_delayed_work_on(). This will not cause this error for the use
> of smp_processor_id() under normal conditions as there is a check for
> "current->nr_cpus_allowed == 1".
> But when the target CPU is offline the workqueue becomes unbound.
> Then the work queue callback can be scheduled on another CPU and the
> error is printed for the use of smp_processor_id() in preemptible context.
> 
> When the workqueue is not getting called on the target CPU, simply return.
> This is done by adding a cpu field in the _thermal_state struct and match
> the current CPU id.
> 
> Once workqueue is scheduled, prevent CPU offline. In this way, the log
> bits are checked and cleared on the correct CPU. Also use get_cpu() to
> get current CPU id and prevent preemption before we finish processing.
> 
> Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
> Reported-by: Chris Wilson <chris@...is-wilson.co.uk>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@...el.com>

I've pushed the patch to our CI, but it's not a frequent occurrence, so
it may be some time before I can state a t-b with any confidence.
-Chris

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ