Message-ID: <53B02395.8030505@web.de>
Date: Sun, 29 Jun 2014 16:32:53 +0200
From: Jan Kiszka <jan.kiszka@....de>
To: Gleb Natapov <gleb@...nel.org>, Borislav Petkov <bp@...en8.de>
CC: Paolo Bonzini <pbonzini@...hat.com>,
lkml <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>, x86-ml <x86@...nel.org>,
kvm@...r.kernel.org,
Jörg Rödel <joro@...tes.org>
Subject: Re: __schedule #DF splat
On 2014-06-29 16:27, Gleb Natapov wrote:
> On Sun, Jun 29, 2014 at 04:01:04PM +0200, Borislav Petkov wrote:
>> On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote:
>>> Please do so and let us know.
>>
>> Yep, just did. Reverting ae9fedc793 fixes the issue.
>>
>>> reinj:1 means that the previous injection failed due to another #PF
>>> that happened during the event injection itself. This may happen if
>>> the GDT or the first instruction of a fault handler is not mapped by
>>> shadow pages, but here it says that the new page fault is at the same
>>> address as the previous one, as if the GDT or the #PF handler were
>>> mapped there. Strange. Especially since the #DF is injected
>>> successfully, so the GDT should be fine. Maybe a wrong CPL makes SVM
>>> crazy?
>>
>> Well, I'm not going to even pretend to know kvm well enough to know
>> *when* we're saving VMCB state, but if we're saving the wrong CPL and
>> then doing the pagetable walk, I can very well imagine the walker
>> getting confused. One possible issue could be the U/S bit (bit 2) in
>> the PTE, which permits access to supervisor pages only when CPL < 3.
>> I.e., CPL has an effect on the pagetable walk, and a wrong CPL level
>> could break it.
>>
>> All conjecture, though...
>>
> Looks plausible, though it is still strange that the second #PF is at
> the same address as the first one.
> Anyway, now we have the commit to blame.
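
To make the U/S-bit concern above concrete, a minimal standalone sketch
(simplified, not KVM's actual walker code) of how a wrong CPL flips the
outcome of that permission check:

#include <stdbool.h>
#include <stdint.h>

#define PTE_USER (1ULL << 2)	/* U/S bit: set = user-accessible page */

/* Simplified stand-in for the walker's permission check: a supervisor
 * page (U/S clear) must fault for a CPL 3 access, so reporting CPL 0
 * for a guest that is really at CPL 3 silently legitimizes the access,
 * and the reverse can fault accesses that should have succeeded. */
static bool walk_access_allowed(uint64_t pte, unsigned int cpl)
{
	if (cpl == 3 && !(pte & PTE_USER))
		return false;
	return true;
}
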
I suspect there is a gap between cause and effect. I'm tracing CPL
changes currently, and my first impression is that QEMU triggers an
unwanted switch from CPL 3 to 0 on vmport access:
qemu-system-x86-11883 [001] 7493.378630: kvm_entry: vcpu 0
qemu-system-x86-11883 [001] 7493.378631: bprint: svm_vcpu_run: entry cpl 0
qemu-system-x86-11883 [001] 7493.378636: bprint: svm_vcpu_run: exit cpl 3
qemu-system-x86-11883 [001] 7493.378637: kvm_exit: reason io rip 0x400854 info 56580241 400855
qemu-system-x86-11883 [001] 7493.378640: kvm_emulate_insn: 0:400854:ed (prot64)
qemu-system-x86-11883 [001] 7493.378642: kvm_userspace_exit: reason KVM_EXIT_IO (2)
qemu-system-x86-11883 [001] 7493.378655: bprint: kvm_arch_vcpu_ioctl_get_sregs: ss.dpl 0
qemu-system-x86-11883 [001] 7493.378684: bprint: kvm_arch_vcpu_ioctl_set_sregs: ss.dpl 0
qemu-system-x86-11883 [001] 7493.378685: bprint: svm_set_segment: cpl = 0
qemu-system-x86-11883 [001] 7493.378711: kvm_pio: pio_read at 0x5658 size 4 count 1 val 0x3442554a
Yeah... do we have to manually sync save.cpl into ss.dpl on get_sregs
on AMD?
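
Roughly the kind of thing I have in mind, with simplified stand-in types
(the real KVM structures are more involved): mirror the VMCB's save.cpl
into ss.dpl before handing the segment to userspace, so that a
GET_SREGS/SET_SREGS round trip like the vmport path in the trace above
cannot drop the guest back to CPL 0.

#include <stdint.h>

struct vmcb_save_area { uint8_t cpl; };	/* simplified */
struct kvm_segment { uint8_t dpl; };	/* simplified */

/* On SVM the current privilege level lives in the VMCB save area, not
 * in the cached SS attributes, so GET_SREGS can report ss.dpl = 0 while
 * the guest actually runs at CPL 3.  Copy it over before returning the
 * segment to userspace. */
static void sync_ss_dpl_from_cpl(const struct vmcb_save_area *save,
				 struct kvm_segment *ss)
{
	ss->dpl = save->cpl;
}
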
Jan