Message-ID: <ZgFY24QT7470ZGnV@gmail.com>
Date: Mon, 25 Mar 2024 11:58:35 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Russ Anderson <rja@....com>
Cc: Steve Wahl <steve.wahl@....com>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Andy Lutomirski <luto@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
	x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
	linux-kernel@...r.kernel.org,
	Linux regressions mailing list <regressions@...ts.linux.dev>,
	Pavin Joseph <me@...injoseph.com>, stable@...r.kernel.org,
	Eric Hagberg <ehagberg@...il.com>,
	Simon Horman <horms@...ge.net.au>,
	Eric Biederman <ebiederm@...ssion.com>,
	Dave Young <dyoung@...hat.com>, Sarah Brofeldt <srhb@....dk>,
	Dimitri Sivanich <sivanich@....com>
Subject: Re: [PATCH] x86/mm/ident_map: Use full gbpages in identity maps
 except on UV platform.


* Russ Anderson <rja@....com> wrote:

> On Sun, Mar 24, 2024 at 11:31:39AM +0100, Ingo Molnar wrote:
> > 
> > * Steve Wahl <steve.wahl@....com> wrote:
> > 
> > > Some systems have ACPI tables that don't include everything that needs
> > > to be mapped for a successful kexec.  These systems rely on identity
> > > maps that include the full gigabyte surrounding any smaller region
> > > requested for kexec success.  Without this, they fail to kexec and end
> > > up doing a full firmware reboot.
> > > 
> > > So, reduce the use of GB pages only on systems where this is known to
> > > be necessary (specifically, UV systems).
> > > 
> > > Signed-off-by: Steve Wahl <steve.wahl@....com>
> > > Fixes: d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
> > > Reported-by: Pavin Joseph <me@...injoseph.com>
> > 
> > Sigh, why was d794734c9bbf marked for a -stable backport? The commit 
> > never explains ...
> 
> I will try to explain, since Steve is offline.  That commit fixes a
> legitimate bug where more address range is mapped (1G) than the
> requested address range.

If a change regresses on certain machines then it's not a bug fix anymore, 
it's a regression. End of story.

>  The fix avoids the issue of the CPU speculatively
> loading beyond the requested range, which includes speculative loads
> from reserved memory.  That is why it was marked for -stable.

And this regression is why more complicated fixes in this area should not 
be forwarded to -stable before they've been merged upstream and exposed a bit 
more. Please keep that in mind for future iterations.

> > If it's broken, it should be reverted - instead of trying to partially 
> > revert and then maybe break some other systems.
> 
> Three people reported that mapping only the correct address range
> caused problems on their platforms.  https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjoseph.com/
> Steve and several people helped debug the issue.  The commit itself
> looks correct but the correct behavior causes some side effect on
> a few platforms.

That's all fine and the effort is much appreciated - but we should not try 
to whitewash a regression: if there's a couple of reports in such a short 
time already, then the regression is significant.

Anyway, I've reverted this in tip:x86/urgent:

  c567f2948f57 Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped."

we can iterate from there again. Please post future patches against that 
tree.

Note that this is just the regular development process: regressions happen, 
and this is how we handle them a lot of the time in this area - we back out 
the breakage, then try again.

> Some memory ends up not being mapped, but it is not clear if it is due to 
> some other bug, such as the BIOS not accurately providing the right memory 
> map, or some other kernel code path not mapping what it should.  The 1G 
> mapping covers up that type of issue.
> 
> Steve's second patch was to not break those platforms while leaving the 
> fix on the platform where the original mapping problem was detected (UV platform).
> 
> > When there's boot breakage with new patches, we back out the bad patch 
> > and re-try in 99.9% of the cases.
> 
> Steve can certainly merge his two patches and resubmit, to replace the 
> reverted original patch.  He should be on in the morning to speak for 
> himself.

Thank you!

	Ingo
