Message-ID: <20230117214414.00003229@gmail.com>
Date: Tue, 17 Jan 2023 21:44:14 +0200
From: Zhi Wang <zhi.wang.linux@...il.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: isaku.yamahata@...el.com, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, isaku.yamahata@...il.com,
Paolo Bonzini <pbonzini@...hat.com>, erdemaktas@...gle.com,
Sagi Shahar <sagis@...gle.com>,
David Matlack <dmatlack@...gle.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Kai Huang <kai.huang@...el.com>
Subject: Re: [PATCH v11 018/113] KVM: TDX: create/destroy VM structure
On Tue, 17 Jan 2023 15:55:53 +0000
Sean Christopherson <seanjc@...gle.com> wrote:
> On Sat, Jan 14, 2023, Zhi Wang wrote:
> > On Fri, 13 Jan 2023 15:16:08 +0000
> > Sean Christopherson <seanjc@...gle.com> wrote:
> >
> > > On Fri, Jan 13, 2023, Zhi Wang wrote:
> > > > Better add a FIXME: here as this has to be fixed later.
> > >
> > > No, leaking the page is all KVM can reasonably do here. An improved
> > > comment would be helpful, but no code change is required.
> > > tdx_reclaim_page() returns an error if and only if there's an
> > > unexpected, fatal error, e.g. a SEAMCALL with bad params, incorrect
> > > concurrency in KVM, a TDX Module bug, etc. Retrying at a later point is
> > > highly unlikely to be successful.
> >
> > Hi:
> >
> > The word "leaking" sounds like a situation left unhandled temporarily.
> >
> > I checked the source code of the TDX module[1] for the possible reason to
> > fail when reviewing this patch:
> >
> > tdx-module-v1.0.01.01.zip\src\vmm_dispatcher\api_calls\tdh_phymem_page_reclaim.c
> > tdx-module-v1.0.01.01.zip\src\vmm_dispatcher\api_calls\tdh_phymem_page_wbinvd.c
> >
> > a. Invalid parameters. For example, page is not aligned, PA HKID is not zero...
> >
> > For invalid parameters, a WARN_ON_ONCE() + return value is good enough as
> > that is how kernel handles similar situations. The caller takes the
> > responsibility.
> >
> > b. Locks have been taken in the TDX module. The TDR page has been locked by
> > another SEAMCALL, another SEAMCALL is doing a PAMT walk and holding the PAMT lock...
> >
> > This needs to be improved later, either by retrying or by taking tdx_lock, so
> > that the TDX module doesn't fail on this.
>
> No, tdx_reclaim_page() already retries TDH.PHYMEM.PAGE.RECLAIM if the target page
> is contended (though I'd question the validity of even that), and TDH.PHYMEM.PAGE.WBINVD
> is performed only when reclaiming the TDR. If there's contention when reclaiming
> the TDR, then KVM effectively has a use-after-free bug, i.e. leaking the page is
> the least of our worries.
>
Hi:
Thanks for the reply. "Leaking" is the consequence when even the retry fails. I
agree with this. But I was questioning whether "retry" is really the correct and
only solution when encountering lock contention in the TDX module, as I saw that
quite a few magic numbers are going to be introduced because of "retry", and
there were discussions about whether the number of retries should be 3 or 1000
in the TDX guest on Hyper-V patches. It doesn't sound right.
Compare this to a typical *kernel lock* case, where an execution path can wait
on a waitqueue and be woken up later. We usually do contention-wait-and-retry,
and we rarely just do contention and retry X times. In the TDX case, I understand
that it is hard for the TDX module to provide a similar solution, as an execution
path can't stay long in the TDX module.
1) We can always take tdx_lock (a Linux kernel lock) when calling a SEAMCALL
that touches the TDX module's internal locks. The downside is that we might lose
some concurrency.
2) As the TDX module doesn't provide contention-and-wait, I guess the following
approach might have been discussed when designing this "retry".
KERNEL                              TDX MODULE

SEAMCALL A   ->                     PATH A: Taking locks

SEAMCALL B   ->                     PATH B: Contention on a lock

             <-                     Return "operand busy"

SEAMCALL B   -|
              |  <- Wait on a kernel waitqueue
SEAMCALL B  <-|

SEAMCALL A   <-                     PATH A: Return

SEAMCALL A   -|
              |  <- Wake up the waitqueue
SEAMCALL A  <-|

SEAMCALL B   ->                     PATH B: Taking the locks
...
Why wasn't this scheme chosen?
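
To make the question concrete, here is a rough userspace sketch of the
wait-and-retry idea, not kernel code and not a proposal for the patch itself.
Every name in it (sim_seamcall, seamcall_wait_and_retry, SIM_OPERAND_BUSY,
completions, run_demo) is hypothetical. The TDX module lock is modeled as a
try-lock that never blocks, and the kernel waitqueue as a pthread condition
variable; in the kernel this would presumably map to wait_event()/wake_up().
A completion counter is used as the wait condition so that a wakeup arriving
between the "busy" return and the sleep is not lost.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical status codes; the real TDX error codes are different. */
enum { SIM_SUCCESS = 0, SIM_OPERAND_BUSY = 1 };

/* Stand-in for a TDX module internal lock: try-lock only, never blocks. */
static atomic_flag module_lock = ATOMIC_FLAG_INIT;
static int protected_counter;	/* state guarded by module_lock */

/* Simulated SEAMCALL: fails fast with "operand busy" on contention. */
static int sim_seamcall(void)
{
	int v;

	if (atomic_flag_test_and_set(&module_lock))
		return SIM_OPERAND_BUSY;
	v = protected_counter;
	sched_yield();			/* widen the contention window */
	protected_counter = v + 1;
	atomic_flag_clear(&module_lock);
	return SIM_SUCCESS;
}

/* Kernel-side "waitqueue", modeled with a mutex and condition variable. */
static pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wq = PTHREAD_COND_INITIALIZER;
static unsigned long completions;	/* bumped when a SEAMCALL returns */

/* Contention-wait-and-retry: sleep until another SEAMCALL has returned. */
static int seamcall_wait_and_retry(void)
{
	for (;;) {
		unsigned long seen;
		int err;

		/* Snapshot before the call so a later wakeup is not missed. */
		pthread_mutex_lock(&wq_lock);
		seen = completions;
		pthread_mutex_unlock(&wq_lock);

		err = sim_seamcall();
		if (err != SIM_OPERAND_BUSY) {
			/* "SEAMCALL A" returned: wake up the waiters. */
			pthread_mutex_lock(&wq_lock);
			completions++;
			pthread_cond_broadcast(&wq);
			pthread_mutex_unlock(&wq_lock);
			return err;
		}

		/* "SEAMCALL B" saw operand busy: wait, then retry. */
		pthread_mutex_lock(&wq_lock);
		while (completions == seen)
			pthread_cond_wait(&wq, &wq_lock);
		pthread_mutex_unlock(&wq_lock);
	}
}

static void *worker(void *arg)
{
	int iters = *(int *)arg;

	for (int i = 0; i < iters; i++)
		if (seamcall_wait_and_retry() != SIM_SUCCESS)
			return (void *)1;
	return NULL;
}

/* Run nthreads workers (at most 16) and return the final counter value. */
static int run_demo(int nthreads, int iters)
{
	pthread_t tids[16];

	assert(nthreads > 0 && nthreads <= 16);
	protected_counter = 0;
	for (int i = 0; i < nthreads; i++) {
		int rc = pthread_create(&tids[i], NULL, worker, &iters);

		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);
	return protected_counter;
}
```

With this shape there is no retry count to pick: a caller that sees "operand
busy" sleeps until some other simulated SEAMCALL has returned, so
run_demo(4, 1000) should end with the counter at 4000. The cost is that every
SEAMCALL return has to touch the waitqueue, which may be one reason a plain
retry was preferred.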
>
> On Thu, Jan 12, 2023 at 8:34 AM <isaku.yamahata@...el.com> wrote:
> > +static int tdx_reclaim_page(hpa_t pa, bool do_wb, u16 hkid)
> > +{
> > +	struct tdx_module_output out;
> > +	u64 err;
> > +
> > +	do {
> > +		err = tdh_phymem_page_reclaim(pa, &out);
> > +		/*
> > +		 * TDH.PHYMEM.PAGE.RECLAIM is allowed only when TD is shutdown.
> > +		 * state. i.e. destructing TD.
> > +		 * TDH.PHYMEM.PAGE.RECLAIM requires TDR and target page.
> > +		 * Because we're destructing TD, it's rare to contend with TDR.
> > +		 */
> > +	} while (err == (TDX_OPERAND_BUSY | TDX_OPERAND_ID_RCX));
> > +	if (WARN_ON_ONCE(err)) {
> > +		pr_tdx_error(TDH_PHYMEM_PAGE_RECLAIM, err, &out);
> > +		return -EIO;
> > +	}
> > +
> > +	if (do_wb) {
> > +		/*
> > +		 * Only TDR page gets into this path. No contention is expected
> > +		 * because of the last page of TD.
> > +		 */
> > +		err = tdh_phymem_page_wbinvd(set_hkid_to_hpa(pa, hkid));
> > +		if (WARN_ON_ONCE(err)) {
> > +			pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err, NULL);
> > +			return -EIO;
> > +		}
> > +	}
> > +
> > +	tdx_clear_page(pa);
> > +	return 0;
> > +}