linux-kernel - Re: [RFT PATCH v3 00/21] x86: strict separation of startup code

Message-ID: <20250513141633.GDaCNUQdRl6ci2zK5T@fat_crate.local>
Date: Tue, 13 May 2025 16:16:33 +0200
From: Borislav Petkov <bp@...en8.de>
To: Ingo Molnar <mingo@...nel.org>
Cc: Ard Biesheuvel <ardb+git@...gle.com>, linux-kernel@...r.kernel.org,
	linux-efi@...r.kernel.org, x86@...nel.org,
	Ard Biesheuvel <ardb@...nel.org>,
	Dionna Amalie Glaze <dionnaglaze@...gle.com>,
	Kevin Loughlin <kevinloughlin@...gle.com>,
	Tom Lendacky <thomas.lendacky@....com>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFT PATCH v3 00/21] x86: strict separation of startup code

On Tue, May 13, 2025 at 01:22:16PM +0200, Ingo Molnar wrote:
> Yeah, so the problem is that SEV* is hardware that basically no active 
> tester outside of the vendor (AMD) owns and is testing against 
> development trees AFAICS.

I don't think you even know what you're talking about. Hell, I only recently
gave you the 101 on SEV because you asked me on IRC.

So no, not even close. If you had bothered to ask a search engine of your
choosing, you would've known better.

All the cloud vendors have it and we are getting bug reports and fixes from
them and everyone else that's using it. You could've seen that by doing a git
log on the SEV files in the kernel even.

> I did a quick Git search, and here are a few examples:
> 
> For example, this commit from last summer:
> 
>   6c3211796326 ("x86/sev: Add SNP-specific unaccepted memory support")
> 
> ... was only fixed recently:
> 
>   d54d610243a4 ("x86/boot/sev: Avoid shared GHCB page for early memory acceptance")

Because we pretty much test with huge-page aligned memory sizes. Memory
acceptance tracks pages at 2MB granularity. It will accept memory if there is
an unaccepted memory EFI entry that isn't 2MB aligned at the start or end.

So when you have a 4G guest or 16G guest you don't have that. If you specify
4095M for the guest memory, then it will trigger. And since that was done
before SEV was initialized (at least in the EFI stub path, not the
decompressor path) things just didn't work.
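
To illustrate the alignment condition described above, here is a minimal
sketch in C. It is not the kernel's implementation: the helper names are
hypothetical, and it assumes a single unaccepted region starting at address 0,
whereas a real guest memory map has holes and multiple EFI entries.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2MB tracking granularity, as described in the text above. */
#define UNIT_SIZE (2ULL * 1024 * 1024)

/*
 * Hypothetical helper: returns true if the region [start, end) has a
 * partial 2MB unit at either edge, i.e. its start or end is not 2MB
 * aligned.
 */
static bool has_unaligned_edge(uint64_t start, uint64_t end)
{
	return (start % UNIT_SIZE) != 0 || (end % UNIT_SIZE) != 0;
}

/*
 * Hypothetical helper: models guest memory as one region starting at
 * address 0 (an assumption made for illustration only) and reports
 * whether a guest of size_mb megabytes would hit the unaligned case.
 */
static bool guest_size_hits_unaligned_case(uint64_t size_mb)
{
	return has_unaligned_edge(0, size_mb * 1024 * 1024);
}
```

Under these assumptions, a 4096M or 16384M guest ends on a 2MB boundary and
does not hit the unaligned case, while a 4095M guest ends 1MB short of one
and does.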
 
> Or this commit from June 2024:
> 
>   34ff65901735 ("x86/sev: Use kernel provided SVSM Calling Areas")
> 
> ... was only fixed a few days ago:
> 
>   f7387eff4bad ("x86/sev: Fix operator precedence in GHCB_MSR_VMPL_REQ_LEVEL
>   macro")

That was "fixed" because we never had to run a multi-VMPL level setup in Linux
yet, as we run Linux guests differently in that respect. So we couldn't have
hit it even if we wanted to.

And even in the SVSM testing, Linux never requests a non-zero VMPL, and so it
wasn't caught during testing. Linux will always request VMPL0.

There is a lot of testing of the guest code with Coconut-SVSM, and a non-zero
VMPL request is a scenario that doesn't exist there.
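
As a sketch of why an operator precedence bug of this kind stays hidden when
the requested level is always 0, consider the following C example. The macro
names, the request code and the field layout are hypothetical and chosen only
for illustration; this is not the kernel's GHCB_MSR_VMPL_REQ_LEVEL definition.

```c
#include <stdint.h>

/* Hypothetical values, for illustration only. */
#define EX_REQ_CODE	0x016ULL	/* request code in the low bits */
#define EX_LEVEL_MASK	0xffULL		/* 8-bit level field */
#define EX_LEVEL_SHIFT	32		/* level field starts at bit 32 */

/*
 * Buggy variant: '<<' binds tighter than '&', so the mask is shifted
 * first and then ANDed with the unshifted level. For any level that
 * fits in 8 bits the AND yields 0 and the level is silently dropped.
 */
#define EX_REQ_LEVEL_BUGGY(v) \
	(((uint64_t)(v) & EX_LEVEL_MASK << EX_LEVEL_SHIFT) | EX_REQ_CODE)

/*
 * Fixed variant: mask the level first, then shift it into place.
 */
#define EX_REQ_LEVEL_FIXED(v) \
	((((uint64_t)(v) & EX_LEVEL_MASK) << EX_LEVEL_SHIFT) | EX_REQ_CODE)
```

With level 0 both variants produce the same value, so a caller that only ever
requests level 0 cannot observe the difference. With level 1 the buggy variant
still produces the bare request code, while the fixed one sets bit 32.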

> Or this commit from June 2024:
> 
>   fcd042e86422 ("x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0")
> 
> ... was fixed a few weeks ago:
> 
>   8ed12ab1319b ("x86/boot/sev: Support memory acceptance in the EFI stub under SVSM")

That's a fix for the above fix d54d610243a4 which relates to the multiple VMPL
thing which we don't do (yet) in Linux.

> Ie. bugfix latencies here were 10+ months.

While doing your git search, did you check my reaction time when fixes are
sent too?

Or did you decide to find some random patches with Fixes: tags pointing to SEV
code?

Or are you saying our crystal ball of what's broken is not working fast
enough?

Lemme see:

https://lore.kernel.org/r/c0af2efa-aea4-43aa-b1da-46ac4c50314b@amd.com

This is only the latest test report I requested.

So no, we test the hell out of it and as much as possible. What you claim here
is unfounded and completely false.

> Note that two of those fixes were from Ard who is working on further 
> robustifying the startup code - a much needed change.

Really? Much needed huh?

Please do explain why it is much needed.

Because the reason Ard is doing it is a different one but maybe
I misunderstood him...

> Ie. when Ard is asking for SEV-SNP testing for WIP series, which he did 
> 10+ days ago, you should not ignore it

More unfounded claims. Here's Tom and me ignoring it:

https://lore.kernel.org/r/836eb6be-926b-dfb4-2c67-f55cba4a072b@amd.com
https://lore.kernel.org/r/20250507095801.GNaBsuqd7m15z0kHji@fat_crate.local
https://lore.kernel.org/r/20250508110800.GBaByQkJwmZlihk6Xp@fat_crate.local
https://lore.kernel.org/r/f4750413-a2e6-15c4-7fa5-2595b509500b@amd.com
https://lore.kernel.org/r/20250505160346.GJaBjhYp09sLZ5AyyJ@fat_crate.local
https://lore.kernel.org/r/20250505164759.GKaBjrv5SI4MX_NiX-@fat_crate.local

Nah, this is not ignoring - this is Tom and me rushing to see whether
something broke because *you* applied stuff on the same day without waiting
for any review!

This is basically you doing whatever the hell you like and not even asking
people working on that code.

And you completely ignored my requests on IRC to wait with that code a bit so
that we can take a look.

> ... or if you do ignore his request for testing, you should not complain
> about the changes being merged eventually, once they pass review & testing
> on non-SEV platforms.

What review?

Show me.

Commit-ID:     bd4a58beaaf1f4aff025282c6e8b130bdb4a29e4
Gitweb:        https://git.kernel.org/tip/bd4a58beaaf1f4aff025282c6e8b130bdb4a29e4
Author:        Ard Biesheuvel <ardb@...nel.org>
AuthorDate:    Sun, 04 May 2025 11:52:31 +02:00
Committer:     Ingo Molnar <mingo@...nel.org>
CommitterDate: Sun, 04 May 2025 15:27:23 +02:00

What review do you expect to see in 3 hours on a Sunday?!?!

> If you didn't have time to personally test Ard's -v2 series since May 
> 2, that's OK

It was May 4th, it was a Sunday. And you can see my replies on the following
days.

> I can merge these proposed changes in an RFT branch so that it gets tested

How about you wait first for those patches to be reviewed like every other
patchset on lkml and then take them?

I mean, normal review process. Remember?

The thing we all are supposed to do...

> In other words: please no "gatekeeping".

Sorry, zero gatekeeping here.

> Please don't force Ard into a catch-22 situation where he cannot test the
> patches on SEV-SNP, but you are blocking these x86 startup code changes on
> the grounds that they weren't tested on SEV-SNP ...

I'm helping Ard to get and set up a SEV machine. And I'm testing too.

If you had asked, you would've learned all that but you haven't.

> This request for testing was ignored AFAICS.

Review, then test. As always. You know that.

I keep telling you that but I don't think you're hearing me - you just do
whatever the hell you like.

> Sure: -v2 was sent more than 10 days ago, and the testing request was 
> ignored AFAICS. Do 10 days count as 'ample time'?

I am reviewing. I can't just drop everything and concentrate only on this.

Hell, that set is not even ready yet:

https://lore.kernel.org/r/CAMj1kXH5C6FzMyrki_23TTk_Yma5NJdHTo-nv4DmZoz_qaGbVQ@mail.gmail.com

Now you tell me: why are *YOU* in such a hurry with that set?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette