Message-ID: <18729cf6-bf3a-4a11-a9fc-a35792cd1736@linux.intel.com>
Date: Sun, 28 Jul 2024 19:16:44 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>, Yan Zhao <yan.y.zhao@...el.com>
Cc: Ackerley Tng <ackerleytng@...gle.com>, sagis@...gle.com,
 linux-kselftest@...r.kernel.org, afranji@...gle.com, erdemaktas@...gle.com,
 isaku.yamahata@...el.com, pbonzini@...hat.com, shuah@...nel.org,
 pgonda@...gle.com, haibo1.xu@...el.com, chao.p.peng@...ux.intel.com,
 vannapurve@...gle.com, runanwang@...gle.com, vipinsh@...gle.com,
 jmattson@...gle.com, dmatlack@...gle.com, linux-kernel@...r.kernel.org,
 kvm@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH v5 09/29] KVM: selftests: TDX: Add report_fatal_error
 test



On 4/23/2024 5:23 AM, Sean Christopherson wrote:
> On Thu, Apr 18, 2024, Yan Zhao wrote:
>> On Tue, Apr 16, 2024 at 11:50:19AM -0700, Sean Christopherson wrote:
>>> On Mon, Apr 15, 2024, Yan Zhao wrote:
>>>> On Mon, Apr 15, 2024 at 08:05:49AM +0000, Ackerley Tng wrote:
>>>>>>> The Intel GHCI Spec says in R12, bit 63 is set if the GPA is valid. As a
>>>>>> But above "__LINE__" is obviously not a valid GPA.
>>>>>>
>>>>>> Do you think it's better to check "data_gpa" is with shared bit on and
>>>>>> aligned in 4K before setting bit 63?
>>>>>>
>>>>> I read "valid" in the spec to mean that the value in R13 "should be
>>>>> considered as useful" or "should be passed on to the host VMM via the
>>>>> TDX module", and not so much as in "validated".
>>>>>
>>>>> We could validate the data_gpa as you suggested to check alignment and
>>>>> shared bit, but perhaps that could be a higher-level function that calls
>>>>> tdg_vp_vmcall_report_fatal_error()?
>>>>>
>>>>> If it helps, shall we rename "data_gpa" to "data" for this lower-level,
>>>>> generic helper function and remove these two lines
>>>>>
>>>>> if (data_gpa)
>>>>> 	error_code |= 0x8000000000000000;
>>>>>
>>>>> A higher-level function could perhaps do the validation as you suggested
>>>>> and then set bit 63.
>>>> This could be all right. But I'm not sure if it would be a burden for
>>>> higher-level function to set bit 63 which is of GHCI details.
>>>>
>>>> What about adding another "data_gpa_valid" parameter and then test
>>>> "data_gpa_valid" rather than test "data_gpa" to set bit 63?
>>> Who cares what the GHCI says about validity?  The GHCI is a spec for getting
>>> random guests to play nice with random hosts.  Selftests own both, and the goal
>>> of selftests is to test that KVM (and KVM's dependencies) adhere to their relevant
>>> specs.  And more importantly, KVM is NOT inheriting the GHCI ABI verbatim[*].
>>>
>>> So except for the bits and bobs that *KVM* (or the TDX module) gets involved in,
>>> just ignore the GHCI (or even deliberately abuse it).  To put it differently, use
>>> selftests to verify *KVM's* ABI and functionality.
>>>
>>> As it pertains to this thread, while I haven't looked at any of this in detail,
>>> I'm guessing that whether or not bit 63 is set is a complete "don't care", i.e.
>>> KVM and the TDX Module should pass it through as-is.
>>>
>>> [*] https://lore.kernel.org/all/Zg18ul8Q4PGQMWam@google.com
>> Ok. It makes sense to KVM_EXIT_TDX.
>> But what if the TDVMCALL is handled in TDX specific code in kernel in future?
>> (not possible?)
> KVM will "handle" ReportFatalError, and will do so before this code lands[*], but
> I *highly* doubt KVM will ever do anything but forward the information to userspace,
> e.g. as KVM_SYSTEM_EVENT_CRASH with data[] filled in with the raw register information.
>
>> Should guest set bits correctly according to GHCI?
> No.  Selftests exist first and foremost to verify KVM behavior, not to verify
> firmware behavior.  We can and should use selftests to verify that *KVM* doesn't
> *violate* the GHCI, but that doesn't mean that selftests themselves can't ignore
> and/or abuse the GHCI, especially since the GHCI definition for ReportFatalError
> is frankly awful.
>
> E.g. the GHCI prescribes actual behavior for R13, but then doesn't say *anything*
> about what's in the data page.  Why!?!?!  If the format in the data page is
> completely undefined, what's the point of restricting R13 to only be allowed to
> hold a GPA?

The description of R13 in GHCI:
   4KB-aligned GPA where additional error data is shared by the TD. The
   VMM must validate that this GPA has the Shared bit set. In other words,
   that a shared-mapping is used, and that this is a valid mapping for the
   TD. This shared memory region is expected to hold a zero-terminated
   string.

IIUC, according to the GHCI, R13 holds the GPA of a 4K-aligned shared
buffer that the TDX guest provides to pass an additional error message to
the VMM, i.e., it needs to be a shared GPA.  The buffer is expected to
hold a zero-terminated string.

Do you think "a zero-terminated string" describes the format in the data
page?
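
To make that reading concrete, here is a rough sketch of the checks the
GHCI text seems to imply for R13 and the data page.  The helper names are
hypothetical, and the position of the Shared bit is passed in as a
parameter because it depends on the guest's GPA width; none of this is
taken from the patch series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* R13 is described as a 4KB-aligned GPA in the GHCI text quoted above. */
#define DATA_GPA_ALIGN 4096ULL

/*
 * Hypothetical helper: true if gpa is 4KB-aligned and has the Shared bit set.
 * Assumption: the caller knows the Shared bit position (shared_bit < 64).
 */
static bool data_gpa_looks_valid(uint64_t gpa, unsigned int shared_bit)
{
	if (gpa & (DATA_GPA_ALIGN - 1))
		return false;

	return (gpa >> shared_bit) & 1;
}

/*
 * Hypothetical helper: true if the 4KB data page contains a terminating
 * NUL, i.e. it can be read as a zero-terminated string.
 */
static bool data_page_has_string(const char *page)
{
	return memchr(page, '\0', DATA_GPA_ALIGN) != NULL;
}
```

Whether selftests should enforce any of this is a separate question; the
sketch only shows what "valid" would mean under this reading.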


>
> And the wording is just as awful:
>
>    The VMM must validate that this GPA has the Shared bit set. In other words,
>    that a shared-mapping is used, and that this is a valid mapping for the TD.
>
> I'm pretty sure it's just saying that the TDX module isn't going to verify the
> operand, i.e. that the VMM needs to protect itself, but it would be so much
> better to simply state "The TDX Module does not verify this GPA", because saying
> the VMM "must" do something leads to pointless discussions like this one, where
> we're debating over whether or not *our* VMM should inject an error into *our* guest.
>
> Anyways, we should do what makes sense for selftests and ignore the stupidity of
> the GHCI when doing so yields better code.  If that means abusing R13, go for it.
> If it's a sticking point for anyone, just use one of the "optional" registers.
>
> Whatever we do, bury the host and guest side of selftests behind #defines or helpers
> so that there are at most two pieces of code that care which register holds which
> piece of information.
>
> [*] https://lore.kernel.org/all/20240404230247.GU2444378@ls.amr.corp.intel.com
>
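
One possible shape for the "at most two pieces of code" suggestion above,
as a sketch only.  The struct, macro, and function names are hypothetical
and not from the series; the only layout detail assumed is the one
discussed in this thread, i.e. bit 63 of R12 flags that R13 carries data.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of R12, as discussed in this thread. */
#define FATAL_ERROR_DATA_VALID (1ULL << 63)

/* Hypothetical container for the registers the test cares about. */
struct fatal_error_regs {
	uint64_t r12;
	uint64_t r13;
};

/* Guest side: the only place that decides which register holds what. */
static struct fatal_error_regs fatal_error_encode(uint64_t error_code,
						  uint64_t data,
						  bool data_valid)
{
	struct fatal_error_regs regs = { .r12 = error_code, .r13 = data };

	/* Explicit flag instead of inferring validity from data != 0. */
	if (data_valid)
		regs.r12 |= FATAL_ERROR_DATA_VALID;

	return regs;
}

/* Host side: the matching decoders. */
static bool fatal_error_data_valid(const struct fatal_error_regs *regs)
{
	return regs->r12 & FATAL_ERROR_DATA_VALID;
}

static uint64_t fatal_error_code(const struct fatal_error_regs *regs)
{
	return regs->r12 & ~FATAL_ERROR_DATA_VALID;
}
```

With something like this, a test that wants to abuse R13 only changes what
it passes as data, and the register layout stays in one encode/decode pair.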

