Message-ID: <20240819215738.GA2317872.vipinsh@google.com>
Date: Mon, 19 Aug 2024 14:57:38 -0700
From: Vipin Sharma <vipinsh@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: David Matlack <dmatlack@...gle.com>, pbonzini@...hat.com,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] KVM: x86/mmu: Split NX hugepage recovery flow into TDP and non-TDP flow
On 2024-08-19 11:31:22, Sean Christopherson wrote:
> On Mon, Aug 19, 2024, David Matlack wrote:
> > On Mon, Aug 19, 2024 at 10:20 AM Vipin Sharma <vipinsh@...gle.com> wrote:
> > >
> > > On 2024-08-16 16:29:11, Sean Christopherson wrote:
> > > > Why not just use separate lists?
> > >
> > > Before this patch, NX huge page recovery calculates "to_zap" and then
> > > zaps the first "to_zap" pages from the common list. This series is
> > > trying to maintain that invariant.
>
> I wouldn't try to maintain any specific behavior in the existing code, AFAIK it's
> 100% arbitrary and wasn't written with any meaningful sophistication. E.g. FIFO
> is little more than blindly zapping pages and hoping for the best.
>
> > > If we use two separate lists then we have to decide how many pages
> > > should be zapped from the TDP MMU and shadow MMU lists. A few options
> > > I can think of:
> > >
> > > 1. Zap "to_zap" pages from each of the TDP MMU and shadow MMU lists
> > > separately. Effectively, this might double the work for the
> > > recovery thread.
> > > 2. Try zapping "to_zap" pages from one list and, if there are not
> > > enough pages to zap, then zap from the other list. This can cause
> > > starvation.
> > > 3. Do half of "to_zap" from one list and the other half from the other
> > > list. This can lead to situations where only half the work is being
> > > done by the recovery worker thread.
> > >
> > > Option (1) above seems more reasonable to me.
> >
> > I vote each should zap 1/nx_huge_pages_recovery_ratio of their
> > respective list. i.e. Calculate to_zap separately for each list.
>
> Yeah, I don't have a better idea since this is effectively a quick and dirty
> solution to reduce guest jitter. We can at least add a counter so that the zap
> is proportional to the number of pages on each list, e.g. this, and then do the
> necessary math in the recovery paths.
>
Okay, I will work on v2, which creates two separate lists for NX huge
pages. It will use a dedicated counter for the TDP MMU and zap based on
that.
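
As a rough illustration of the per-list calculation suggested above, here
is a userspace C sketch, not code from the series. It assumes each list
carries its own page counter and that "to_zap" is 1/ratio of that list,
rounded up. The struct, field and function names are hypothetical, and
DIV_ROUND_UP() is redefined locally as a stand-in for the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical per-list bookkeeping. The struct and field names are
 * made up for this sketch and are not taken from the patch.
 */
struct nx_list {
	unsigned long nr_pages;	/* NX huge pages currently on this list */
};

/*
 * Number of pages to zap from one list in a single recovery pass:
 * 1/ratio of that list, rounded up. A ratio of 0 means "zap nothing".
 */
static unsigned long nx_list_to_zap(const struct nx_list *list,
				    unsigned int ratio)
{
	assert(list != NULL);
	return ratio ? DIV_ROUND_UP(list->nr_pages, ratio) : 0;
}
```

With 100 pages on one list, 7 on the other and a ratio of 60, this gives
2 and 1 pages respectively, so the work done on each list follows that
list's own size rather than a shared "to_zap".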