lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJhGHyAksXZHO_BC391ax4rHgQLE=xWeZu6dxN60gSgGij+ZLg@mail.gmail.com>
Date:   Wed, 14 Dec 2022 21:47:55 +0800
From:   Lai Jiangshan <jiangshanlai@...il.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     linux-kernel@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
        Lai Jiangshan <jiangshan.ljs@...group.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>, kvm@...r.kernel.org
Subject: Re: [PATCH 1/2] kvm: x86/mmu: Reduce the update to the spte in FNAME(sync_page)

Hello Sean,

On Wed, Dec 14, 2022 at 2:12 AM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Mon, Dec 12, 2022, Lai Jiangshan wrote:
> > From: Lai Jiangshan <jiangshan.ljs@...group.com>
> >
> > Sometimes when the guest updates its pagetable, it adds only new gptes
> > to it without changing any existed one, so there is no point to update
> > the sptes for these existed gptes.
> >
> > Also when the sptes for these unchanged gptes are updated, the AD
> > bits are also removed since make_spte() is called with prefetch=true
> > which might result unneeded TLB flushing.
>
> If either of the proposed changes is kept, please move this to a separate patch.
> Skipping updates for PTEs with the same protections is separate logical change
> from skipping updates when making the SPTE writable.
>
> Actually, can't we just pass @prefetch=false to make_spte()?  FNAME(prefetch_invalid_gpte)
> has already verified the Accessed bit is set in the GPTE, so at least for guest
> correctness there's no need to access-track the SPTE.  Host page aging is already
> fuzzy so I don't think there are problems there.

FNAME(prefetch_invalid_gpte) has already verified the Accessed bit is set
in the GPTE and FNAME(protect_clean_gpte) has already verified the Dirty
bit is set in the GPTE.  These are only for guest AD bits.

And I don't think it is a good idea to pass @prefetch=false to make_spte(),
since the host might have cleared AD bit in the spte for aging or dirty-log,
The AD bits in the spte are better to be kept as before.

Though passing @prefetch=false would not cause any correctness problem
in the view of maintaining guest AD bits.

>
> > Do nothing if the permissions are unchanged or only write-access is
> > being added.
>
> I'm pretty sure skipping the "make writable" case is architecturally wrong.  On a
> #PF, any TLB entries for the faulting virtual address are required to be removed.
> That means KVM _must_ refresh the SPTE if a vCPU takes a !WRITABLE fault on an
> unsync page.  E.g. see kvm_inject_emulated_page_fault().

I might misunderstand what you meant or I failed to connect it with
the SDM properly.

I think there is no #PF here.

And even if the guest is requesting writable, the hypervisor is allowed to
set it non-writable and prepared to handle it in the ensuing write-fault.

Skipping to make it writable is a kind of lazy operation and considered
to be "the hypervisor doesn't grant the writable permission for a period
before next write-fault".

Thanks
Lai

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ