Message-ID: <87bl3mye23.fsf@vitty.brq.redhat.com>
Date: Mon, 18 Oct 2021 09:42:12 +0200
From: Vitaly Kuznetsov <vkuznets@...hat.com>
To: Maxim Levitsky <mlevitsk@...hat.com>,
Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Tom Lendacky <thomas.lendacky@....com>,
David Matlack <dmatlack@...gle.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] KVM: SVM: reduce guest MAXPHYADDR by one in case
C-bit is a physical bit
Maxim Levitsky <mlevitsk@...hat.com> writes:
> On Fri, 2021-10-15 at 15:24 +0000, Sean Christopherson wrote:
>> On Fri, Oct 15, 2021, Vitaly Kuznetsov wrote:
>> > Several selftests (memslot_modification_stress_test, kvm_page_table_test,
>> > dirty_log_perf_test, ...) which rely on vm_get_max_gfn() started to fail
>> > since commit ef4c9f4f65462 ("KVM: selftests: Fix 32-bit truncation of
>> > vm_get_max_gfn()") on AMD EPYC 7401P:
>> >
>> > ./tools/testing/selftests/kvm/demand_paging_test
>> > Testing guest mode: PA-bits:ANY, VA-bits:48, 4K pages
>> > guest physical test memory offset: 0xffffbffff000
>>
>> This looks a lot like the signature I remember from the original bug[1]. I assume
>> you're hitting the magic HyperTransport region[2]. I thought that was fixed, but
>> the hack-a-fix for selftests never got applied[3].
>
> Hi Vitaly and everyone!
>
> You are the 3rd person to suffer from this issue :-( Sean Christopherson was first, I was second.
>
> I reported this, then I think we found out that it is not the HyperTransport region after all,
> and I think that the whole thing got lost in 'trying to get answers from AMD'.
>
> https://lore.kernel.org/lkml/ac72b77c-f633-923b-8019-69347db706be@redhat.com/
>
>
> I'll say, a hack to reduce it by 1 bit is still better than failing tests,
> at least until AMD explains to us what is going on.
>
> Sorry that you had to debug this.
I didn't spend too much time on this, that's the reason for 'RFC' :-) I
agree we need at least a short-term solution, as permanently failing
tests may start masking newly introduced issues.
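
For illustration only, here is a minimal sketch of the arithmetic behind the
"reduce by one bit" idea. It is not the actual patch or the selftests code:
the helper names max_gfn() and max_gfn_reduced() are hypothetical, and the
48 PA bits used in the note below are an assumption, not something taken
from the failing host.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Highest guest frame number addressable with pa_bits physical address
 * bits and pages of size (1 << page_shift). Hypothetical helper.
 */
static uint64_t max_gfn(unsigned int pa_bits, unsigned int page_shift)
{
	assert(pa_bits > page_shift && pa_bits < 64);
	return ((1ULL << pa_bits) >> page_shift) - 1;
}

/*
 * Sketch of the workaround: pretend the guest has one fewer physical
 * address bit, which halves the usable guest physical address space.
 */
static uint64_t max_gfn_reduced(unsigned int pa_bits, unsigned int page_shift)
{
	return max_gfn(pa_bits - 1, page_shift);
}
```

Assuming 48 PA bits and 4K pages, max_gfn(48, 12) is 0xfffffffff and
max_gfn_reduced(48, 12) is 0x7ffffffff. The frame of the failing offset
above (0xffffbffff000 >> 12 = 0xffffbffff) lies beyond the reduced limit,
so a test that places its memory just below the maximum gfn would end up in
the lower half of the address space instead.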
--
Vitaly