linux-kernel - Re: [Question] int3 instruction generates a #UD in SEV VM


Message-ID: <7a4f3f59-1482-49c4-92b2-aa621e9b06b3@amd.com>
Date:   Wed, 2 Aug 2023 09:33:46 -0500
From:   Tom Lendacky <thomas.lendacky@....com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Wu Zongyo <wuzongyo@...l.ustc.edu.cn>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org, x86@...nel.org,
        linux-coco@...ts.linux.dev
Subject: Re: [Question] int3 instruction generates a #UD in SEV VM

On 8/2/23 09:25, Tom Lendacky wrote:
> On 8/2/23 09:01, Sean Christopherson wrote:
>> On Wed, Aug 02, 2023, Wu Zongyo wrote:
>>> On Mon, Jul 31, 2023 at 11:45:29PM +0800, wuzongyong wrote:
>>>>
>>>> On 2023/7/31 23:03, Tom Lendacky wrote:
>>>>> On 7/31/23 09:30, Sean Christopherson wrote:
>>>>>> On Sat, Jul 29, 2023, wuzongyong wrote:
>>>>>>> Hi,
>>>>>>> I am writing firmware in Rust to support SEV, based on the project
>>>>>>> td-shim[1].
>>>>>>> But when I create an SEV VM (just SEV, no SEV-ES and no SEV-SNP)
>>>>>>> with the firmware, the Linux kernel crashes because the int3
>>>>>>> instruction in int3_selftest() causes a #UD.
>>>>>>
>>>>>> ...
>>>>>>
>>>>>>> BTW, if I create a normal VM without SEV using qemu & OVMF, the int3
>>>>>>> instruction always generates a #BP.
>>>>>>> So I am confused about the behaviour of the int3 instruction. Could
>>>>>>> anyone help explain it?
>>>>>>> Any suggestion is appreciated!
>>>>>>
>>>>>> Have you tried my suggestions from the other thread[*]?
>>>> Firstly, I'm sorry for sending multiple mails with the same content. I
>>>> thought the mails I sent previously had not been delivered.
>>>> Let's discuss the problem here.
>>>>>>
>>>>>>     : > > I'm curious how this happened. I cannot find any condition
>>>>>>     : > > that would cause the int3 instruction to generate a #UD
>>>>>>     : > > according to AMD's spec.
>>>>>>     :
>>>>>>     : One possibility is that the value from memory that gets executed
>>>>>>     : diverges from the value that is read out by the #UD handler, e.g.
>>>>>>     : due to patching (doesn't seem to be the case in this test), stale
>>>>>>     : cache/tlb entries, etc.
>>>>>>     :
>>>>>>     : > > BTW, it worked normally with qemu and ovmf.
>>>>>>     : >
>>>>>>     : > Does this happen every time you boot the guest with your
>>>>>>     : > firmware? What processor are you running on?
>>>>>>     :
>>>> Yes, every time.
>>>> The processor I used is EPYC 7T83.
>>>>>>     : And have you ruled out KVM as the culprit?  I.e. verified that
>>>>>>     : KVM is NOT injecting a #UD.  That obviously shouldn't happen, but
>>>>>>     : it should be easy to check via KVM tracepoints.
>>>>>
>>>>> I have a feeling that KVM is injecting the #UD, but it will take 
>>>>> instrumenting KVM to see which path the #UD is being injected from.
>>>>>
>>>>> Wu Zongyo, can you add some instrumentation to figure that out if the 
>>>>> trace points towards KVM injecting the #UD?
>>>> Ok, I will try to do that.
>>> You're right. The #UD is injected by KVM.
>>>
>>> The path I found is:
>>>      svm_vcpu_run
>>>          svm_complete_interrupts
>>>              kvm_requeue_exception // vector = 3
>>>                  kvm_make_request
>>>
>>>      vcpu_enter_guest
>>>          kvm_check_and_inject_events
>>>              svm_inject_exception
>>>                  svm_update_soft_interrupt_rip
>>>                  __svm_skip_emulated_instruction
>>>                      x86_emulate_instruction
>>>                      svm_can_emulate_instruction
>>>                          kvm_queue_exception(vcpu, UD_VECTOR)
>>>
>>> Does this mean a #PF intercept occurs when the guest tries to deliver a
>>> #BP through the IDT? But why?
>>
>> I doubt it's a #PF.  A #NPF is much more likely, though it could be
>> something else entirely, but I'm pretty sure that would require bugs in
>> both the host and guest.
>>
>> What is the last exit recorded by trace_kvm_exit() before the #UD is 
>> injected?
> 
> I'm guessing it was a #NPF, too. Could it be related to the changes that
> went in around svm_update_soft_interrupt_rip()?
> 
> 6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the 
> instruction")

Sorry, that should have been:

7e5b5ef8dca3 ("KVM: SVM: Re-inject INTn instead of retrying the insn on "failure"")

> 
> Before this the !nrips check would prevent the call into
> svm_skip_emulated_instruction(). But now, there is a call to:
> 
>    svm_update_soft_interrupt_rip()
>      __svm_skip_emulated_instruction()
>        kvm_emulate_instruction()
>          x86_emulate_instruction() (passed a NULL insn pointer)
>            kvm_can_emulate_insn() (passed a NULL insn pointer)
>              svm_can_emulate_instruction() (passed NULL insn pointer)
> 
> Because it is an SEV guest, it ends up in the "if (unlikely(!insn))" path
> and injects the #UD.
> 
> Thanks,
> Tom
>
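
The failure mode described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not the kernel code:
struct model_vcpu and the model_*() functions are hypothetical stand-ins for
the KVM functions named in the call chain, and the model keeps only the two
facts taken from the discussion: the skip path passes a NULL insn pointer,
and for an SEV guest a NULL insn pointer results in a queued #UD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BP_VECTOR 3   /* #BP, raised by int3 */
#define UD_VECTOR 6   /* #UD, invalid opcode */

/* Hypothetical stand-in for per-vCPU state; not the kernel's struct kvm_vcpu. */
struct model_vcpu {
    bool sev_guest;      /* plain SEV: guest memory is encrypted */
    int  queued_vector;  /* -1 while no exception is queued */
};

/*
 * Models the check attributed to svm_can_emulate_instruction(): without a
 * prefilled instruction buffer, an SEV guest cannot be emulated, so #UD is
 * queued and emulation is refused.
 */
static bool model_can_emulate(struct model_vcpu *vcpu, const void *insn)
{
    assert(vcpu != NULL);

    if (!vcpu->sev_guest)
        return true;             /* host can read guest memory itself */

    if (insn == NULL) {          /* the "if (unlikely(!insn))" path */
        vcpu->queued_vector = UD_VECTOR;
        return false;
    }
    return true;
}

/* Models the skip path: the emulator is entered with a NULL insn pointer. */
static bool model_skip_insn(struct model_vcpu *vcpu)
{
    return model_can_emulate(vcpu, NULL);
}

/*
 * Models re-injection of the soft exception: the instruction is skipped
 * first to learn the next RIP, and only then is #BP queued. Returns the
 * vector the guest would end up receiving.
 */
static int model_reinject_int3(struct model_vcpu *vcpu)
{
    if (!model_skip_insn(vcpu))
        return vcpu->queued_vector;   /* skip failed, #UD already queued */

    vcpu->queued_vector = BP_VECTOR;
    return vcpu->queued_vector;
}
```

In this model, calling model_reinject_int3() on a vCPU with sev_guest set
yields UD_VECTOR, while the same call on a non-SEV vCPU yields BP_VECTOR,
which matches the difference reported between the SEV guest and the normal
qemu & OVMF guest.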