Date:   Fri, 03 Sep 2021 07:38:41 -0700
From:   Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
To:     Jens Axboe <axboe@...nel.dk>, LKML <linux-kernel@...r.kernel.org>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Len Brown <lenb@...nel.org>, inux-pm@...r.kernel.org
Subject: Re: Bug: d0e936adbd22 crashes at boot

On Fri, 2021-09-03 at 08:15 -0600, Jens Axboe wrote:
> On 9/3/21 8:13 AM, Srinivas Pandruvada wrote:
> > Hi Axboe,
> > 
> > Thanks for reporting.
> > On Fri, 2021-09-03 at 07:36 -0600, Jens Axboe wrote:
> > > Hi,
> > > 
> > > > Booting Linus's tree causes a crash on my laptop, an x1 gen9. This
> > > > was a bit difficult to pin down as it crashes before the display is
> > > > up, but I managed to narrow it down to:
> > > 
> > > commit d0e936adbd2250cb03f2e840c6651d18edc22ace
> > > Author: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
> > > Date:   Thu Aug 19 19:40:06 2021 -0700
> > > 
> > > >     cpufreq: intel_pstate: Process HWP Guaranteed change notification
> > > 
> > > > which crashes with a NULL pointer deref in notify_hwp_interrupt() ->
> > > > queue_delayed_work_on().
> > > 
> > > Reverting this change makes the laptop boot fine again.
> > > 
> > > Does this change fix your issue?
> 
> I would assume so, as it's crashing on cpudata == NULL :-)
> 
> But why is it NULL? Happy to test patches, but the below doesn't look
> like a real fix and more of a work-around.

This platform is sending an HWP interrupt on a CPU that we have not yet
brought up for pstate control. So the firmware somehow decided to send it
very early during boot; previously we would have ignored it.

Actually, try this instead, with more protection:

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index b4ffe6c8a0d0..6ee88d7640ea 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1645,12 +1645,24 @@ void notify_hwp_interrupt(void)
        if (!hwp_active || !boot_cpu_has(X86_FEATURE_HWP_NOTIFY))
                return;
 
-       rdmsrl(MSR_HWP_STATUS, value);
+       rdmsrl_safe(MSR_HWP_STATUS, &value);
        if (!(value & 0x01))
                return;
 
+       /*
+        * After hwp_active is set and all_cpu_data is allocated, there
+        * is a small window.
+        */
+       if (!all_cpu_data) {
+               wrmsrl_safe(MSR_HWP_STATUS, 0);
+               return;
+       }
+
        cpudata = all_cpu_data[this_cpu];
-       schedule_delayed_work_on(this_cpu, &cpudata->hwp_notify_work, msecs_to_jiffies(10));
+       if (cpudata)
+               schedule_delayed_work_on(this_cpu, &cpudata->hwp_notify_work, msecs_to_jiffies(10));
+       else
+               wrmsrl_safe(MSR_HWP_STATUS, 0);
 }
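
For anyone following along, the control flow the patch above is going for can be modeled in plain userspace C. This is only a rough sketch and not kernel code: the MSR is replaced by an array, the delayed work by a counter, and model_notify_hwp_interrupt(), struct cpudata's pending_work field, the hwp_status[] array and the hwp_result enum are all made-up names for illustration. Only hwp_active, all_cpu_data and the bit 0 status check come from the patch. The X86_FEATURE_HWP_NOTIFY check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the per-CPU driver state. */
struct cpudata {
	int pending_work;	/* counts notifications handed to deferred work */
};

/* Stand-ins for the driver globals named in the patch. */
static bool hwp_active;
static struct cpudata **all_cpu_data;

/* Hypothetical model of MSR_HWP_STATUS, one value per CPU. */
static unsigned long hwp_status[NR_CPUS];

enum hwp_result { HWP_IGNORED, HWP_ACKED_EARLY, HWP_QUEUED };

/*
 * Model of the guarded handler: an interrupt that arrives before the
 * per-CPU data exists is acknowledged by clearing the status, and
 * nothing is dereferenced.
 */
static enum hwp_result model_notify_hwp_interrupt(int cpu)
{
	if (!hwp_active)
		return HWP_IGNORED;

	/* Bit 0 set means a guaranteed performance change is pending. */
	if (!(hwp_status[cpu] & 0x01))
		return HWP_IGNORED;

	/* Table not allocated yet, or this CPU not yet set up. */
	if (!all_cpu_data || !all_cpu_data[cpu]) {
		hwp_status[cpu] = 0;
		return HWP_ACKED_EARLY;
	}

	/* Normal path: hand the notification to deferred work. */
	all_cpu_data[cpu]->pending_work++;
	return HWP_QUEUED;
}
```

Calling model_notify_hwp_interrupt() with the status bit set while all_cpu_data is still NULL takes the early acknowledge path, which corresponds to the boot-time case reported here. Once the table entry for that CPU is populated, the same call reaches the queueing path.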
 

