Message-ID: <566ABE48.8020408@redhat.com>
Date:	Fri, 11 Dec 2015 13:15:04 +0100
From:	Paolo Bonzini <pbonzini@...hat.com>
To:	Borislav Petkov <bp@...en8.de>
Cc:	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	Jörg Rödel <joro@...tes.org>
Subject: Re: [PATCH] kvm: x86: move tracepoints outside extended quiescent state



On 11/12/2015 12:41, Borislav Petkov wrote:
> On Fri, Dec 11, 2015 at 11:41:30AM +0100, Paolo Bonzini wrote:
>> It would be a kvm hypervisor page, not a kvm guest page, hence unrelated
>> to the zapping thing.
> 
> Ah right, guest pages should be userspace addresses, come to think of
> it.
> 
>> Can you grab the kallsyms before making it crash?
> 
> Attached. It was a different corruption this time, see below. This time
> we don't even have a page table, PGD is 0, rIP is 1. (Fun :-))

Hmm, you had:

- RIP=0 in the original report (start_this_handle)
- RIP=0 in the second (mutex_lock_nested in ext4)
- RIP=1 now

The more interesting report is the other one, where RIP is not a small
value.  There RIP is slightly larger than the stack pointer, so it is likely
a frame pointer.  That in turn means the call trace is probably correct, and
the crash might have happened closer to the actual corruption.

[  959.548625] RIP: 0010:[<ffff8800b9f9bdf0>]  [<ffff8800b9f9bdf0>] 0xffff8800b9f9bdf0
[  959.556338] RSP: 0018:ffff8800b9f9bde0  EFLAGS: 00010206
[  959.618579] Stack:
[  959.620607]  ffffffffa02d5e17 ffff8800b7d48000 ffff8800b9f9be08 ffffffffa02bdb1f
[  959.628104]  0000000000000000 ffff8800b9f9be98 ffffffffa02bdc7b ffff8804242a4400
[  959.635601]  0000000000000070 0000000000004000 ffffffff81a3c1e0 ffff8800b7ca5e00
[  959.643114] Call Trace:
[  959.645599]  [<ffffffffa02d5e17>] ? kvm_arch_vcpu_put+0x17/0x40 [kvm]
[  959.652081]  [<ffffffffa02bdb1f>] ? vcpu_put+0x1f/0x60 [kvm]
[  959.657782]  [<ffffffffa02bdc7b>] ? kvm_vcpu_ioctl+0x11b/0x6f0 [kvm]
[  959.664169]  [<ffffffff811a0930>] ? do_vfs_ioctl+0x2e0/0x540
[  959.669855]  [<ffffffff811ac8e9>] ? __fget_light+0x29/0x90
[  959.675364]  [<ffffffff811a0bdc>] ? SyS_ioctl+0x4c/0x90
[  959.680618]  [<ffffffff816e2d5b>] ? entry_SYSCALL_64_fastpath+0x16/0x6f
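
As a rough cross-check of the frame pointer reading above, the following
sketch treats RIP as if it were a frame pointer and follows the chain of
saved frame pointers through the dumped stack words.  It is only an
illustration, not how the trace was actually analyzed.  It assumes that the
"Stack:" dump starts at RSP with one 8-byte word per slot, and that each
frame stores the saved frame pointer followed by the return address; the
helper names read_word and walk_frames are made up for this example.

```python
# Values copied from the oops above.
RIP = 0xffff8800b9f9bdf0
RSP = 0xffff8800b9f9bde0
STACK_WORDS = [
    0xffffffffa02d5e17, 0xffff8800b7d48000, 0xffff8800b9f9be08, 0xffffffffa02bdb1f,
    0x0000000000000000, 0xffff8800b9f9be98, 0xffffffffa02bdc7b, 0xffff8804242a4400,
    0x0000000000000070, 0x0000000000004000, 0xffffffff81a3c1e0, 0xffff8800b7ca5e00,
]


def read_word(addr):
    """Return the dumped 8-byte word at addr, or None if it was not dumped.

    Assumption: STACK_WORDS[0] sits at RSP, STACK_WORDS[1] at RSP + 8, etc.
    """
    idx, rem = divmod(addr - RSP, 8)
    if rem != 0 or not 0 <= idx < len(STACK_WORDS):
        return None
    return STACK_WORDS[idx]


def walk_frames(fp):
    """Follow saved frame pointers starting at fp.

    Assumed frame layout: [fp] = caller's frame pointer,
    [fp + 8] = return address.  Stops when the chain leaves the dump
    or stops moving towards higher addresses.
    """
    frames = []
    while True:
        saved_fp = read_word(fp)
        ret = read_word(fp + 8)
        if saved_fp is None or ret is None:
            break
        frames.append((fp, saved_fp, ret))
        if saved_fp <= fp:
            break
        fp = saved_fp
    return frames


if __name__ == "__main__":
    print("RIP - RSP = %d bytes" % (RIP - RSP))
    for fp, saved_fp, ret in walk_frames(RIP):
        print("frame at %#x: saved fp %#x, return address %#x"
              % (fp, saved_fp, ret))
```

Under those assumptions the two return addresses found by the walk are
ffffffffa02bdb1f and ffffffffa02bdc7b, which are the vcpu_put+0x1f and
kvm_vcpu_ioctl+0x11b entries in the call trace.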

My wild guess is that RSP is getting corrupted, but I'll have to try to
reproduce it to figure out what happens.

The last thing I need from you (hopefully) is a Kconfig.  If you have some
time, it would also be great to check whether you can reproduce it with an
older kernel version; 4.4-rc1 and 4.3 are the ones to try.

Paolo