Message-ID: <20240924024551.GA13538@ranerica-svr.sc.intel.com>
Date: Mon, 23 Sep 2024 19:45:51 -0700
From: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
To: "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
Cc: "Zhang, Rui" <rui.zhang@...el.com>,
	"regressions@...mhuis.info" <regressions@...mhuis.info>,
	"Neri, Ricardo" <ricardo.neri@...el.com>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
	"bp@...en8.de" <bp@...en8.de>,
	"Gupta, Pawan Kumar" <pawan.kumar.gupta@...el.com>,
	"regressions@...ts.linux.dev" <regressions@...ts.linux.dev>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Luck, Tony" <tony.luck@...el.com>,
	"thomas.lindroth@...il.com" <thomas.lindroth@...il.com>,
	"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu()
 change in v6.1.96

On Thu, Sep 19, 2024 at 01:19:27PM +0200, gregkh@...uxfoundation.org wrote:
> On Wed, Sep 18, 2024 at 06:54:33AM +0000, Zhang, Rui wrote:
> > On Mon, 2024-08-12 at 14:11 +0200, Greg KH wrote:
> > > On Wed, Aug 07, 2024 at 10:15:23AM +0200, Thorsten Leemhuis wrote:
> > > > [CCing the x86 folks, Greg, and the regressions list]
> > > > 
> > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > 
> > > > On 30.07.24 18:41, Thomas Lindroth wrote:
> > > > > I upgraded from kernel 6.1.94 to 6.1.99 on one of my machines and
> > > > > noticed that
> > > > > the dmesg line "Incomplete global flushes, disabling PCID" had
> > > > > disappeared from
> > > > > the log.
> > > > 
> > > > Thomas, thx for the report. FWIW, mainline developers like the x86
> > > > folks
> > > > or Tony are free to focus on mainline and leave stable/longterm
> > > > series
> > > > to other people -- some nevertheless help out regularly or
> > > > occasionally.
> > > > So with a bit of luck this mail will make one of them care enough
> > > > to
> > > > provide a 6.1 version of what you afaics called the "existing fix"
> > > > in
> > > > mainline (2eda374e883ad2 ("x86/mm: Switch to new Intel CPU model
> > > > defines") [v6.10-rc1]) that seems to be missing in 6.1.y. But if
> > > > not I
> > > > suspect it might be up to you to prepare and submit a 6.1.y variant
> > > > of
> > > > that fix, as you seem to care and are able to test the patch.
> > > 
> > > Needs to go to 6.6.y first, right?  But even then, it does not apply
> > > to
> > > 6.1.y cleanly, so someone needs to send a backported (and tested)
> > > series
> > > to us at stable@...r.kernel.org and we will be glad to queue them up
> > > then.
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > 
> > There are three commits involved.
> > 
> > commit A:
> >    4db64279bc2b ("x86/cpu: Switch to new Intel CPU model defines")
> >    This commit replaces
> >       X86_MATCH_INTEL_FAM6_MODEL(ANY, 1),             /* SNC */
> >    with
> >       X86_MATCH_VFM(INTEL_ANY,         1),    /* SNC */
> >    This is a functional change because the family info is replaced with
> > 0. And this exposes a x86_match_cpu() problem that it breaks when the
> > vendor/family/model/stepping/feature fields are all zeros.
> > 
> > commit B:
> >    93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
> > X86_VENDOR_INTEL")
> >    It addresses the x86_match_cpu() problem by introducing a valid flag
> > and set the flag in the Intel CPU model defines.
> >    This fixes commit A, but it actually breaks the x86_cpu_id
> > structures that are constructed without using the Intel CPU model
> > defines, like arch/x86/mm/init.c.
> > 
> > commit C:
> >    2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
> >    arch/x86/mm/init.c: broke by commit B but fixed by using the new
> > Intel CPU model defines
> > 
> > In 6.1.99,
> > commit A is missing
> > commit B is there
> > commit C is missing
> > 
> > In 6.6.50,
> > commit A is missing
> > commit B is there
> > commit C is missing
> > 
> > Now we can fix the problem in stable kernel, by converting
> > arch/x86/mm/init.c to use the CPU model defines (even the old style
> > ones). But before that, I'm wondering if we need to backport commit B
> > in 6.1 and 6.6 stable kernel because only commit A can expose this
> > problem.
> 
> If so, can you submit the needed backports for us to apply?  That's the
> easiest way for us to take them, thanks.

I audited all the uses of x86_match_cpu(match). All callers that construct
the `match` argument using the X86_MATCH_* family of macros from
arch/x86/include/asm/cpu_device_id.h work correctly, because commit B,
93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
X86_VENDOR_INTEL"), has been backported to v6.1.99 and to v6.6.50.

Only the callers that build the `match` argument by hand, without those
macros, are buggy:
    * arch/x86/mm/init.c
    * drivers/powercap/intel_rapl_msr.c (only in 6.1.99)
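
For illustration, here is a minimal user-space sketch of why hand-built
entries stop matching once the table walk depends on a valid flag. It is a
simplified model, not the kernel code: struct cpu_id, FLAG_ENTRY_VALID,
MATCH_ENTRY(), match_cpu() and the model number 0x9e are all hypothetical
stand-ins chosen for the example.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
#define VENDOR_INTEL     0
#define FAMILY_ANY       0
#define FLAG_ENTRY_VALID 0x1	/* models the "valid" flag added by commit B */

struct cpu_id {
	unsigned int vendor;
	unsigned int family;
	unsigned int model;
	unsigned int flags;
};

/* Entries built through the macro always carry the valid flag. */
#define MATCH_ENTRY(v, f, m) \
	{ .vendor = (v), .family = (f), .model = (m), .flags = FLAG_ENTRY_VALID }

/* Walk the table until the first entry that lacks the valid flag. */
static const struct cpu_id *match_cpu(const struct cpu_id *table,
				      unsigned int vendor,
				      unsigned int family,
				      unsigned int model)
{
	for (const struct cpu_id *m = table; m->flags & FLAG_ENTRY_VALID; m++) {
		if (m->vendor != vendor)
			continue;
		if (m->family != FAMILY_ANY && m->family != family)
			continue;
		if (m->model != model)
			continue;
		return m;
	}
	return NULL;
}

/* Hand-built entry: the flag is never set, so it looks like a terminator. */
static const struct cpu_id hand_built[] = {
	{ .vendor = VENDOR_INTEL, .family = 6, .model = 0x9e },
	{ 0 }
};

/* Same entry built through the macro: the walk reaches and matches it. */
static const struct cpu_id macro_built[] = {
	MATCH_ENTRY(VENDOR_INTEL, 6, 0x9e),
	{ 0 }
};

/* Returns 1 if the model shows the behavior described above. */
static int model_reproduces_bug(void)
{
	return match_cpu(hand_built, VENDOR_INTEL, 6, 0x9e) == NULL &&
	       match_cpu(macro_built, VENDOR_INTEL, 6, 0x9e) == &macro_built[0];
}
```

In this model the hand-built table never matches, even though its vendor,
family and model are correct, while the macro-built table does. Under that
assumption, converting a caller to the macros is enough to restore matching.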

In summary, v6.1.99 needs these two commits from mainline:
    * d05b5e0baf42 ("powercap: RAPL: fix invalid initialization for
      pl4_supported field")
    * 2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")

v6.6.50 only needs the second commit.

I will submit these backports.

Thanks and BR,
Ricardo
