Message-ID: <566ABE48.8020408@redhat.com>
Date: Fri, 11 Dec 2015 13:15:04 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Borislav Petkov <bp@...en8.de>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
Jörg Rödel <joro@...tes.org>
Subject: Re: [PATCH] kvm: x86: move tracepoints outside extended quiescent state
On 11/12/2015 12:41, Borislav Petkov wrote:
> On Fri, Dec 11, 2015 at 11:41:30AM +0100, Paolo Bonzini wrote:
>> It would be a kvm hypervisor page, not a kvm guest page, hence unrelated
>> to the zapping thing.
>
> Ah right, guest pages should be userspace addresses, come to think of
> it.
>
>> Can you grab the kallsyms before making it crash?
>
> Attached. It was a different corruption this time, see below. This time
> we don't even have a page table, PGD is 0, rIP is 1. (Fun :-))
Hmm, you had:
- RIP=0 in the original report (start_this_handle)
- RIP=0 in the second (mutex_lock_nested in ext4)
- RIP=1 now
The more interesting report is the other one, where RIP is not a small value.
There, RIP is slightly larger than the stack pointer, so the value is likely a
frame pointer. This in turn means that the call trace is correct, and that the
crash might have happened closer to the actual corruption.
[ 959.548625] RIP: 0010:[<ffff8800b9f9bdf0>] [<ffff8800b9f9bdf0>] 0xffff8800b9f9bdf0
[ 959.556338] RSP: 0018:ffff8800b9f9bde0 EFLAGS: 00010206
[ 959.618579] Stack:
[ 959.620607] ffffffffa02d5e17 ffff8800b7d48000 ffff8800b9f9be08 ffffffffa02bdb1f
[ 959.628104] 0000000000000000 ffff8800b9f9be98 ffffffffa02bdc7b ffff8804242a4400
[ 959.635601] 0000000000000070 0000000000004000 ffffffff81a3c1e0 ffff8800b7ca5e00
[ 959.643114] Call Trace:
[ 959.645599] [<ffffffffa02d5e17>] ? kvm_arch_vcpu_put+0x17/0x40 [kvm]
[ 959.652081] [<ffffffffa02bdb1f>] ? vcpu_put+0x1f/0x60 [kvm]
[ 959.657782] [<ffffffffa02bdc7b>] ? kvm_vcpu_ioctl+0x11b/0x6f0 [kvm]
[ 959.664169] [<ffffffff811a0930>] ? do_vfs_ioctl+0x2e0/0x540
[ 959.669855] [<ffffffff811ac8e9>] ? __fget_light+0x29/0x90
[ 959.675364] [<ffffffff811a0bdc>] ? SyS_ioctl+0x4c/0x90
[ 959.680618] [<ffffffff816e2d5b>] ? entry_SYSCALL_64_fastpath+0x16/0x6f
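A rough sketch of the arithmetic behind the frame-pointer guess, using the RIP, RSP and stack values from the oops above. The 16 KiB stack size and the kernel text boundary are assumptions about a typical x86-64 build of this era, not something taken from the report, and the helper names are made up for illustration.

```python
# Values copied from the oops above.
RIP = 0xffff8800b9f9bdf0
RSP = 0xffff8800b9f9bde0

# Assumption: 16 KiB kernel stacks, aligned to their own size.
THREAD_SIZE = 16 * 1024

# Assumption: kernel text and modules are mapped at or above this address.
KERNEL_TEXT_START = 0xffffffff80000000


def stack_base(sp, size=THREAD_SIZE):
    """Round a stack pointer down to the start of its (assumed) stack."""
    return sp & ~(size - 1)


def on_same_stack(addr, sp, size=THREAD_SIZE):
    """True if addr falls inside the stack that sp points into."""
    base = stack_base(sp, size)
    return base <= addr < base + size


def looks_like_kernel_text(addr):
    """True if addr is in the range where code would normally live."""
    return addr >= KERNEL_TEXT_START


if __name__ == "__main__":
    print("RIP - RSP      =", hex(RIP - RSP))
    print("stack base     =", hex(stack_base(RSP)))
    print("RIP on stack   =", on_same_stack(RIP, RSP))
    print("RIP is text    =", looks_like_kernel_text(RIP))
```

Under those assumptions RIP sits 0x10 bytes above RSP, inside the same stack and outside the range where code lives, which is what you would expect if a stack address such as a saved frame pointer ended up being used as a return address.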
My wild guess is that RSP is getting corrupted, but I'll have to try to
reproduce this to figure out what happens.
The last thing I need from you (hopefully) is a Kconfig. If you have some
time, it would be great to check whether you can reproduce it with an older
kernel version, for example 4.4-rc1 and 4.3.
Paolo