linux-kernel - Re: [PATCH RFC] KVM: SVM: reduce guest MAXPHYADDR by one in case C-bit is a physical bit

Message-ID: <87bl3mye23.fsf@vitty.brq.redhat.com>
Date:   Mon, 18 Oct 2021 09:42:12 +0200
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     Maxim Levitsky <mlevitsk@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>
Cc:     kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Tom Lendacky <thomas.lendacky@....com>,
        David Matlack <dmatlack@...gle.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] KVM: SVM: reduce guest MAXPHYADDR by one in case
 C-bit is a physical bit

Maxim Levitsky <mlevitsk@...hat.com> writes:

> On Fri, 2021-10-15 at 15:24 +0000, Sean Christopherson wrote:
>> On Fri, Oct 15, 2021, Vitaly Kuznetsov wrote:
>> > Several selftests (memslot_modification_stress_test, kvm_page_table_test,
>> > dirty_log_perf_test, ...) which rely on vm_get_max_gfn() started to fail
>> > since commit ef4c9f4f65462 ("KVM: selftests: Fix 32-bit truncation of
>> > vm_get_max_gfn()") on AMD EPYC 7401P:
>> > 
>> >  ./tools/testing/selftests/kvm/demand_paging_test
>> >  Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
>> >  guest physical test memory offset: 0xffffbffff000
>> 
>> This looks a lot like the signature I remember from the original bug[1].  I assume
>> you're hitting the magic HyperTransport region[2].  I thought that was fixed, but
>> the hack-a-fix for selftests never got applied[3].
>
> Hi Vitaly and everyone!
>
> You are the 3rd person to suffer from this issue :-( Sean Christopherson was first, I was second.
>
> I reported this, then I think we found out that it is not the HyperTransport region after all,
> and I think that the whole thing got lost in 'trying to get answers from AMD'.
>
> https://lore.kernel.org/lkml/ac72b77c-f633-923b-8019-69347db706be@redhat.com/
>
>
> I'd say a hack to reduce it by 1 bit is still better than failing tests,
> at least until AMD explains to us what is going on.
>
> Sorry that you had to debug this.

I didn't spend too much time on this, that's the reason for 'RFC' :-) I
agree we need at least a short-term solution, as permanently failing
tests may start masking newly introduced issues.

-- 
Vitaly