linux-kernel - Re: [PATCH v7 06/20] x86/virt/tdx: Shut down TDX module in case of error

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b3938f3a-e4f8-675a-0c0e-4b4618019145@intel.com>
Date:   Tue, 22 Nov 2022 11:24:48 -0800
From:   Dave Hansen <dave.hansen@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Kai Huang <kai.huang@...el.com>, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, linux-mm@...ck.org, seanjc@...gle.com,
        pbonzini@...hat.com, dan.j.williams@...el.com,
        rafael.j.wysocki@...el.com, kirill.shutemov@...ux.intel.com,
        ying.huang@...el.com, reinette.chatre@...el.com,
        len.brown@...el.com, tony.luck@...el.com, ak@...ux.intel.com,
        isaku.yamahata@...el.com, chao.gao@...el.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com, bagasdotme@...il.com,
        sagis@...gle.com, imammedo@...hat.com
Subject: Re: [PATCH v7 06/20] x86/virt/tdx: Shut down TDX module in case of
 error

On 11/22/22 11:13, Peter Zijlstra wrote:
> On Tue, Nov 22, 2022 at 07:14:14AM -0800, Dave Hansen wrote:
>> On 11/22/22 01:13, Peter Zijlstra wrote:
>>> On Mon, Nov 21, 2022 at 01:26:28PM +1300, Kai Huang wrote:
>>>> +/*
>>>> + * Call the SEAMCALL on all online CPUs concurrently.  Caller to check
>>>> + * @sc->err to determine whether any SEAMCALL failed on any cpu.
>>>> + */
>>>> +static void seamcall_on_each_cpu(struct seamcall_ctx *sc)
>>>> +{
>>>> +	on_each_cpu(seamcall_smp_call_function, sc, true);
>>>> +}
>>>
>>> Suppose the user has NOHZ_FULL configured, and is already running
>>> userspace that will terminate on interrupt (this is desired feature for
>>> NOHZ_FULL), guess how happy they'll be if someone, on another parition,
>>> manages to tickle this TDX gunk?
>>
>> Yeah, they'll be none too happy.
>>
>> But, what do we do?
> 
> Not intialize TDX on busy NOHZ_FULL cpus and hard-limit the cpumask of
> all TDX using tasks.

I don't think that works.  As I mentioned to Thomas elsewhere, you don't
just need to initialize TDX on the CPUs where it is used.  Before the
module will start working you need to initialize it on *all* the CPUs it
knows about.  The module itself has a little counter where it tracks
this and will refuse to start being useful until it gets called
thoroughly enough.

>> There are technical solutions like detecting if NOHZ_FULL is in play and
>> refusing to initialize TDX.  There are also non-technical solutions like
>> telling folks in the documentation that they better modprobe kvm early
>> if they want to do TDX, or their NOHZ_FULL apps will pay.
> 
> Surely modprobe kvm isn't the point where TDX gets loaded? Because
> that's on boot for everybody due to all the auto-probing nonsense.
> 
> I was expecting TDX to not get initialized until the first TDX using KVM
> instance is created. Am I wrong?

I went looking for it in this series to prove you wrong.  I failed.  :)

tdx_enable() is buried in here somewhere:

> https://lore.kernel.org/lkml/CAAhR5DFrwP+5K8MOxz5YK7jYShhaK4A+2h1Pi31U_9+Z+cz-0A@mail.gmail.com/T/

I don't have the patience to dig it out today, so I guess we'll have Kai
tell us.

>> We could also force the TDX module to be loaded early in boot before
>> NOHZ_FULL is in play, but that would waste memory on TDX metadata even
>> if TDX is never used.
> 
> I'm thikning it makes sense to have a tdx={off,on-demand,force} toggle
> anyway.

Yep, that makes total sense.  Kai had one in an earlier version but I
made him throw it out because it wasn't *strictly* required and this set
is fat enough.

>> How do NOHZ_FULL folks deal with late microcode updates, for example?
>> Those are roughly equally disruptive to all CPUs.
> 
> I imagine they don't do that -- in fact I would recommend we make the
> whole late loading thing mutually exclusive with nohz_full; can't have
> both.

So, if we just use schedule_on_cpu() for now and have the TDX code wait,
will a NOHZ_FULL task just block the schedule_on_cpu() indefinitely?

That doesn't seem like _horrible_ behavior to start off with for a
minimal series.