lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 16 Feb 2012 09:35:57 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Avi Kivity <avi@...hat.com>
Cc:	Amit Shah <amit.shah@...hat.com>, tglx@...utronix.de,
	mingo@...hat.com, hpa@...or.com, x86@...nel.org,
	linux-kernel@...r.kernel.org, kvm list <kvm@...r.kernel.org>
Subject: Re: [KVM paravirt issue?] Re: vsyscall=emulate regression

On Thu, Feb 16, 2012 at 9:14 AM, Avi Kivity <avi@...hat.com> wrote:
> On 02/16/2012 06:45 PM, Andy Lutomirski wrote:
>> >
>> >> So I could have messed up, or there could be a subtle
>> >> bug somewhere.  Any ideas?
>> >
>> > What's the code trying to do?  Execute an instruction from an
>> > non-executable page, trap the #PF, and emulate?  And what are the
>> > symptoms? wrong error code for the #PF?  That could easily be a kvm bug.
>> >
>>
>> The symptom is that some kind of access to a page that's supposed to
>> be readable, NX is reporting error 5.  I'm not quite sure what kind of
>> access is causing that.
>
> Might it be a fetch access, with kvm forgetting to set bit 4 correctly?
>
>> >
>> > Can you point me at the code in question?
>>
>> The setup code is in arch/x86/kernel/vsyscall_64.c in map_vsyscall.
>> The bad access is to the vsyscall page.
>
> The bad access is on purpose, yes?
>
> From fault.c:
>
> #ifdef CONFIG_X86_64
>                /*
>                 * Instruction fetch faults in the vsyscall page might need
>                 * emulation.
>                 */
>                if (unlikely((error_code & PF_INSTR) &&
>                             ((address & ~0xfff) == VSYSCALL_START))) {
>                        if (emulate_vsyscall(regs, address))
>                                return;
>                }
> #endif
>
> so it seems like kvm doesn't set PF_INSTR?

Yes, this is on purpose, and you're almost certainly right (and I feel
dumb for not figuring this out immediately).  The error message is:

segfault at ffffffffff600400 ip ffffffffff600400 sp 00007fff103d72f8 error 5

which is garbage.  The instruction at 0xffffffffff600400 can't fetch
itself as data and fault on the data access (at least not in 64-bit
mode, as far as I can think of, without evil messing with the TLBs).

So... what do we do about this?  This (whitespace-damaged, untested)
patch will probably work around it well enough to boot the system:

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 9d74824..52b9522 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -741,8 +741,11 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long
                 * Instruction fetch faults in the vsyscall page might need
                 * emulation.
                 */
-               if (unlikely((error_code & PF_INSTR) &&
+               if (unlikely(address == regs->ip && !(error_code & PF_WRITE) &&
                             ((address & ~0xfff) == VSYSCALL_START))) {
+                       WARN_ONCE(!(error_code & PF_INSTR),
+                                 "Fixing up bogus vsyscall read fault -- "
+                                 "your hypervisor is buggy.");
                        if (emulate_vsyscall(regs, address))
                                return;
                }

Before we patch the guest like this, though, it would be nice to know
what hosts are affected.  If it's just one version of RHEL6, maybe it
makes sense to fix the hypervisor and either leave the guest alone or
just add a warning saying to fix your hypervisor, like:

WARN_ONCE(address == regs->ip && !(error_code & (PF_INSTR | PF_WRITE))
&& user_64bit_mode(regs), "Fishy page fault -- you might need to fix
your hypervisor");

near some exit path in the page fault handler.  The 64-bit check is
because (I think) 32-bit code can mess with regs->ip using a cs offset
in the LDT and trigger the warning at will.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ