Message-ID: <Ztl9NWCOupNfVaCA@yzhao56-desk.sh.intel.com>
Date: Thu, 5 Sep 2024 17:43:17 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Vitaly Kuznetsov <vkuznets@...hat.com>, Gerd Hoffmann <kraxel@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>, <kvm@...r.kernel.org>,
<rcu@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Kevin Tian
<kevin.tian@...el.com>, Yiwei Zhang <zzyiwei@...gle.com>, Lai Jiangshan
<jiangshanlai@...il.com>, "Paul E. McKenney" <paulmck@...nel.org>, "Josh
Triplett" <josh@...htriplett.org>
Subject: Re: [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that
support self-snoop
On Wed, Sep 04, 2024 at 05:41:06PM -0700, Sean Christopherson wrote:
> On Wed, Sep 04, 2024, Yan Zhao wrote:
> > On Wed, Sep 04, 2024 at 10:28:02AM +0800, Yan Zhao wrote:
> > > On Tue, Sep 03, 2024 at 06:20:27PM +0200, Vitaly Kuznetsov wrote:
> > > > Sean Christopherson <seanjc@...gle.com> writes:
> > > >
> > > > > On Mon, Sep 02, 2024, Vitaly Kuznetsov wrote:
> > > > >> FWIW, I use QEMU-9.0 from the same C10S (qemu-kvm-9.0.0-7.el10.x86_64)
> > > > >> but I don't think it matters in this case. My CPU is "Intel(R) Xeon(R)
> > > > >> Silver 4410Y".
> > > > >
> > > > > Has this been reproduced on any other hardware besides SPR? I.e. did we stumble
> > > > > on another hardware issue?
> > > >
> > > > Very possible, as according to Yan Zhao this doesn't reproduce on at
> > > > least "Coffee Lake-S". Let me try to grab some random hardware around
> > > > and I'll be back with my observations.
> > >
> > > Update some new findings from my side:
> > >
> > > BAR 0 of bochs VGA (fb_map) is used for frame buffer, covering phys range
> > > from 0xfd000000 to 0xfe000000.
> > >
> > > On "Sapphire Rapids XCC":
> > >
> > > 1. If KVM forces this fb_map range to be WC+IPAT, installer/gdm can launch
> > > correctly.
> > > i.e.
> > > if (gfn >= 0xfd000 && gfn < 0xfe000) {
> > > return (MTRR_TYPE_WRCOMB << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
> > > }
> > > return MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT;
> > >
> > > 2. If KVM forces this fb_map range to be UC+IPAT, installer fails to show / gdm
> > > restarts endlessly. (though on Coffee Lake-S, installer/gdm can launch
> > > correctly in this case).
> > >
> > > 3. On starting GDM, ttm_kmap_iter_linear_io_init() in guest is called to set
> > > this fb_map range as WC, with
> > > iosys_map_set_vaddr_iomem(&iter_io->dmap, ioremap_wc(mem->bus.offset, mem->size));
> > >
> > > However, during bochs_pci_probe()-->bochs_load()-->bochs_hw_init(), pfns for
> > > this fb_map has been reserved as uc- by ioremap().
> > > Then, the ioremap_wc() during starting GDM will only map guest PAT with UC-.
> > >
> > > So, with KVM setting WB (no IPAT) to this fb_map range, the effective
> > > memory type is UC- and installer/gdm restarts endlessly.
> > >
> > > 4. If KVM sets WB (no IPAT) to this fb_map range, and changes guest bochs driver
> > > to call ioremap_wc() instead in bochs_hw_init(), gdm can launch correctly.
> > > (didn't verify the installer's case as I can't update the driver in that case).
> > >
> > > The reason is that the ioremap_wc() called during starting GDM will no longer
> > > meet conflict and can map guest PAT as WC.
>
> Huh. The upside of this is that it sounds like there's nothing broken with WC
> or self-snoop.

Looking at it from a different angle: the fb_map range is used as the frame
buffer (VRAM), with the guest writing to this range and the host reading from
it. If the issue were related to self-snoop, we would expect the VNC window to
display distorted data. What we actually observe is that the GDM window shows
up correctly for a second and then restarts over and over.

So, do you think we can simply fix this issue by calling ioremap_wc() for the
frame buffer/VRAM range in the bochs driver, as is commonly done in other GPU
drivers?

--- a/drivers/gpu/drm/tiny/bochs.c
+++ b/drivers/gpu/drm/tiny/bochs.c
@@ -261,7 +261,9 @@ static int bochs_hw_init(struct drm_device *dev)
 	if (pci_request_region(pdev, 0, "bochs-drm") != 0)
 		DRM_WARN("Cannot request framebuffer, boot fb still active?\n");
-	bochs->fb_map = ioremap(addr, size);
+	bochs->fb_map = ioremap_wc(addr, size);
 	if (bochs->fb_map == NULL) {
 		DRM_ERROR("Cannot map framebuffer\n");
 		return -ENOMEM;

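To make the reasoning about the effective memory type concrete, here is a
rough standalone model (not KVM code) of the two cases discussed in this
thread: with IPAT set, the EPT memory type is used as-is; with EPT WB and no
IPAT, the guest PAT type decides. The enum and function names are hypothetical,
and the "guest PAT wins under EPT WB" rule is my reading of the SDM combining
rules, so treat it as an assumption. Other EPT types without IPAT are
deliberately not modeled.

```c
#include <stdbool.h>

/* Memory type encodings shared by MTRR/PAT and the EPT memtype field. */
enum memtype {
	MT_NOT_MODELED = -1,	/* hypothetical marker, not a hardware value */
	MT_UC = 0,
	MT_WC = 1,
	MT_WT = 4,
	MT_WP = 5,
	MT_WB = 6,
	MT_UC_MINUS = 7,	/* PAT-only encoding */
};

/*
 * Simplified model, covering only the cases discussed in the thread:
 *  - IPAT set: guest PAT is ignored, the EPT memtype is effective.
 *  - EPT WB, no IPAT: the guest PAT type is effective (UC- still
 *    behaves as uncacheable).
 * Anything else returns MT_NOT_MODELED.
 */
enum memtype effective_memtype(enum memtype ept_mt, bool ipat,
			       enum memtype guest_pat)
{
	if (ipat)
		return ept_mt;
	if (ept_mt == MT_WB)
		return guest_pat;
	return MT_NOT_MODELED;
}
```

With this model, effective_memtype(MT_WB, false, MT_UC_MINUS) yields
MT_UC_MINUS (the ioremap() case), while effective_memtype(MT_WB, false, MT_WC)
yields MT_WC (the ioremap_wc() case).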
>
> > > WIP to find out why effective UC in fb_map range will make gdm to restart
> > > endlessly.
> > Not sure whether it's simply because UC is too slow.
> >
> > T=Test execution time of a selftest in which guest writes to a GPA for
> > 0x1000000UL times
> >
> > | Sapphire Rapids XCC | Coffee Lake-S
> > --------------|----------------------|-----------------
> > KVM UC+IPAT | T=0m4.530s | T=0m0.622s
>
> Woah. Have you tried testing MOVDIR64 and/or WT? E.g. to see if the problem is
> with UC specifically, or if it occurs with any accesses that immediately write
> through to main memory.
>
> > --------------|----------------------|-----------------
> > KVM WC+IPAT | T=0m0.149s | T=0m0.176s
> > --------------|----------------------|-----------------
> > KVM WB+IPAT | T=0m0.148s | T=0m0.148s
> > ------------------------------------------------------

I re-ran all the tests and collected averaged data (10 runs each), shown
below (the previous data was just a single-run score):

T = execution time of a selftest in which the guest writes to a GPA
0x1000000UL times with WRITE_ONCE.

KVM memtype | Sapphire Rapids XCC | Coffee Lake-S
-------------|---------------------|----------------
WB+IPAT | T=0.1511s | T=0.1661s
-------------|---------------------|----------------
WC+IPAT | T=0.1411s | T=0.1656s
-------------|---------------------|----------------
WT+IPAT | T=3.7527s | T=0.6156s
-------------|---------------------|----------------
WP+IPAT | T=4.4663s | T=0.6203s
-------------|---------------------|----------------
UC+IPAT | T=3.4632s | T=0.5868s
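
For reference, the shape of the WRITE_ONCE loop measured above is roughly the
following. This is a userspace stand-in, not the actual selftest:
WRITE_ONCE_U64(), write_loop() and time_write_loop() are hypothetical names,
the memory type of the target cannot be controlled from userspace, and clock()
measures CPU time rather than the wall-clock time reported in the table.

```c
#include <stdint.h>
#include <time.h>

/* Userspace stand-in for the kernel's WRITE_ONCE(): a volatile store. */
#define WRITE_ONCE_U64(p, v) (*(volatile uint64_t *)(p) = (v))

/* Same iteration count as in the measurements above. */
#define NR_WRITES 0x1000000UL

/* Store 0..nr_writes-1 to *target; returns the last value stored. */
uint64_t write_loop(uint64_t *target, uint64_t nr_writes)
{
	uint64_t i;

	for (i = 0; i < nr_writes; i++)
		WRITE_ONCE_U64(target, i);

	return nr_writes ? nr_writes - 1 : 0;
}

/* Run the loop and return the CPU time it consumed, in seconds. */
double time_write_loop(uint64_t *target, uint64_t nr_writes)
{
	clock_t start, end;

	start = clock();
	write_loop(target, nr_writes);
	end = clock();

	return (double)(end - start) / CLOCKS_PER_SEC;
}
```

The volatile store keeps the compiler from collapsing the loop into a single
write, which is the property the selftest relies on WRITE_ONCE for.
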

T = execution time of a selftest in which the guest writes to a GPA
0x1000000UL times with movdir64b.
(Coffee Lake-S does not support movdir64b.)

KVM memtype | Sapphire Rapids XCC | Coffee Lake-S
-------------|---------------------|----------------
WB+IPAT | T=2.6142s | /
-------------|---------------------|----------------
WC+IPAT | T=2.8919s | /
-------------|---------------------|----------------
WT+IPAT | T=3.0966s | /
-------------|---------------------|----------------
WP+IPAT | T=2.4933s | /
-------------|---------------------|----------------
UC+IPAT | T=3.4606s | /
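
To put the averaged numbers above in relative terms, a small helper such as
the one below gives the slowdown of one memory type against a baseline. The
function name is hypothetical and the inputs are simply the table values; it
adds no new measurements.

```c
/* Ratio of a measured time to a baseline time; returns 0 if baseline <= 0. */
double slowdown(double t_seconds, double baseline_seconds)
{
	if (baseline_seconds <= 0.0)
		return 0.0;

	return t_seconds / baseline_seconds;
}
```

For example, slowdown(3.4632, 0.1511) (UC vs. WB with WRITE_ONCE on Sapphire
Rapids XCC) comes out at roughly 23x, slowdown(0.5868, 0.1661) (the same pair
on Coffee Lake-S) at about 3.5x, and slowdown(3.4606, 2.6142) (UC vs. WB with
movdir64b on Sapphire Rapids XCC) at about 1.3x.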