linux-kernel - Re: KVM induced panic on 2.6.38[2367] & 2.6.39

Message-ID: <4DE59DE9.2050809@fnarfbargle.com>
Date:	Wed, 01 Jun 2011 10:03:21 +0800
From:	Brad Campbell <lists2009@...rfbargle.com>
To:	Andrea Arcangeli <aarcange@...hat.com>
CC:	Hugh Dickins <hughd@...gle.com>, Borislav Petkov <bp@...en8.de>,
	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	linux-mm <linux-mm@...ck.org>, Izik Eidus <ieidus@...hat.com>
Subject: Re: KVM induced panic on 2.6.38[2367] & 2.6.39

On 01/06/11 09:15, Andrea Arcangeli wrote:
> Hello,
>
> On Wed, Jun 01, 2011 at 08:37:25AM +0800, Brad Campbell wrote:
>> On 01/06/11 06:31, Hugh Dickins wrote:
>>> Brad, my suspicion is that in each case the top 16 bits of RDX have been
>>> mysteriously corrupted from ffff to 0000, causing the general protection
>>> faults.  I don't understand what that has to do with KSM.
>>>
>>> But it's only a suspicion, because I can't make sense of the "Code:"
>>> lines in your traces, they have more than the expected 64 bytes, and
>>> only one of them has a ">" (with no "<") to mark the faulting instruction.
>>>
>>> I did try compiling the 2.6.39 kernel from your config, but of course
>>> we have different compilers, so although I got close, it wasn't exact.
>>>
>>> Would you mind mailing me privately (it's about 73MB) the "objdump -trd"
>>> output for your original vmlinux (with KSM on)?  (Those -trd options are
>>> the ones I'm used to typing, I bet they're not all relevant.)
>>>
>>> Of course, it's only a tiny fraction of that output that I need,
>>> might be better to cut it down to remove_rmap_item_from_tree and
>>> dup_fd and ksm_scan_thread, if you have the time to do so.
>>
>> Would you believe about 20 seconds after I pressed send the kernel oopsed.
>>
>> http://www.fnarfbargle.com/private/003_kernel_oops/
>>
>> oops reproduced here, but an un-munged version is in that directory
>> alongside the kernel.
>>
>> [36542.880228] general protection fault: 0000 [#1] SMP
>
> Reminds me of another oops that was reported on the kvm list for
> 2.6.38.1 with message id 4D8C6110.6090204. There the top 16 bits of
> rsi were flipped and it was a general protection fault too, because
> the address landed in the non-mappable (non-canonical) virtual range.
>
> http://www.virtall.com/files/temp/kvm.txt
> http://www.virtall.com/files/temp/config-2.6.38.1
> http://virtall.com/files/temp/mmu-objdump.txt
>
> That oops happened in kvm_unmap_rmapp though, and it looked like memory
> corruption (Avi suggested use after free), but it was a production
> system so we couldn't debug it further.
>
> I recommend that the next thing be to reproduce again with 2.6.39 or
> 3.0.0-rc1. Let's fix your scsi trouble if needed, but it's better if
> you test with 2.6.39.
>
> We'd need chmod +r vmlinux on private/003_kernel_oops/

Ok, here we go then.

http://www.fnarfbargle.com/private/004_kernel_oops/

The permissions are right this time.
2.6.39 + KSM

[  694.227866] general protection fault: 0000 [#1] SMP
[  694.228001] last sysfs file: /sys/devices/platform/w83627ehf.656/cpu0_vid
[  694.228050] CPU 3
[  694.228091] Modules linked in: xt_iprange xt_DSCP xt_length xt_CLASSIFY sch_sfq xt_CHECKSUM ipt_REJECT ipt_MASQUERADE ipt_REDIRECT xt_recent xt_state iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 xt_TCPMSS xt_tcpmss xt_tcpudp iptable_mangle ip_tables x_tables pppoe pppox ppp_generic slhc cls_u32 sch_htb deflate zlib_deflate des_generic cbc ecb crypto_blkcipher sha1_generic md5 hmac crypto_hash cryptomgr aead crypto_algapi af_key fuse w83627ehf hwmon_vid netconsole configfs vhost_net powernow_k8 mperf kvm_amd kvm pl2303 usbserial i2c_piix4 k10temp xhci_hcd usb_storage usb_libusual ohci_hcd r8169 ehci_hcd ahci usbcore sata_mv mii libahci megaraid_sas [last unloaded: scsi_wait_scan]
[  694.230897]
[  694.230944] Pid: 11841, comm: keepalive Not tainted 2.6.39 #3 To Be Filled By O.E.M. To Be Filled By O.E.M./880G Extreme3
[  694.231111] RIP: 0010:[<ffffffff810db878>]  [<ffffffff810db878>] dup_fd+0x168/0x300
[  694.231210] RSP: 0018:ffff8802f524fdd0  EFLAGS: 00010206
[  694.231258] RAX: 00000000000007f8 RBX: ffff8802f5721b80 RCX: bfffffffffffffff
[  694.231308] RDX: 00008802f51cacc0 RSI: 00000000000000ff RDI: 0000000000000800
[  694.231358] RBP: ffff8803bf419800 R08: ffff88030167f6c0 R09: 0000000000000003
[  694.231407] R10: 0000000000000001 R11: 4000000000000000 R12: 0000000000000100
[  694.231457] R13: ffff880417aa9800 R14: ffff88030167f440 R15: ffff8803bd8c1600
[  694.231507] FS:  00007f02cfc32700(0000) GS:ffff88041fcc0000(0000) knlGS:0000000000000000
[  694.231560] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  694.231609] CR2: 00007f02cf5d4810 CR3: 00000002f52c3000 CR4: 00000000000006e0
[  694.231657] DR0: 0000000000000045 DR1: 0000000000000000 DR2: 0000000000000000
[  694.231707] DR3: 0000000000000005 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  694.231757] Process keepalive (pid: 11841, threadinfo ffff8802f524e000, task ffff8802f5143690)
[  694.231809] Stack:
[  694.231852]  ffff8802f5143690 0000000000000020 ffff8802f56badc0 ffff8802f5721b90
[  694.232050]  ffff880417aa54e0 0000000001200011 ffff880417aa54e0 0000000000000000
[  694.232248]  00007f02cfc329d0 ffff8802f5143690 0000000000000000 ffffffff81037645
[  694.232448] Call Trace:
[  694.232499]  [<ffffffff81037645>] ? copy_process+0xa75/0xfd0
[  694.232549]  [<ffffffff81037c0d>] ? do_fork+0x6d/0x2b0
[  694.232599]  [<ffffffff810457a9>] ? sigprocmask+0x69/0x100
[  694.232651]  [<ffffffff813d0ca3>] ? stub_clone+0x13/0x20
[  694.232699]  [<ffffffff813d0a3b>] ? system_call_fastpath+0x16/0x1b
[  694.232745] Code: 4c 89 c2 e8 6b e5 0f 00 45 85 e4 74 78 41 8d 44 24 ff 31 f6 41 ba 01 00 00 00 48 8d 3c c5 08 00 00 00 31 c0 eb 1a 0f 1f 44 00 00 <f0> 48 ff 42 30 48 89 54 05 00 48 83 c0 08 ff c6 48 39 f8 74 3b
[  694.235190] RIP  [<ffffffff810db878>] dup_fd+0x168/0x300
[  694.235282]  RSP <ffff8802f524fdd0>
[  694.235379] ---[ end trace 949fad05591fcdb3 ]---
[  694.235428] Kernel panic - not syncing: Fatal exception
[  694.235478] Pid: 11841, comm: keepalive Tainted: G      D     2.6.39 #3
[  694.235525] Call Trace:
[  694.235573]  [<ffffffff813cd6f5>] ? panic+0x92/0x18a
[  694.235624]  [<ffffffff81038b61>] ? kmsg_dump+0x41/0xf0
[  694.235676]  [<ffffffff810050ad>] ? oops_end+0x8d/0xa0
[  694.235726]  [<ffffffff813d05ef>] ? general_protection+0x1f/0x30
[  694.235778]  [<ffffffff810db878>] ? dup_fd+0x168/0x300
[  694.235827]  [<ffffffff81037645>] ? copy_process+0xa75/0xfd0
[  694.235877]  [<ffffffff81037c0d>] ? do_fork+0x6d/0x2b0
[  694.235926]  [<ffffffff810457a9>] ? sigprocmask+0x69/0x100
[  694.235978]  [<ffffffff813d0ca3>] ? stub_clone+0x13/0x20
[  694.236028]  [<ffffffff813d0a3b>] ? system_call_fastpath+0x16/0x1b
[  694.236083] Rebooting in 60 seconds..
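
As a rough cross-check of Hugh's suspicion about RDX, the register value and
Code: bytes from the trace above can be examined with a short script. This is
only a sketch, not something posted in the thread: it assumes 48-bit virtual
addresses (4-level paging), and the helper names is_canonical and
split_code_line are made up for illustration. It tests whether RDX is a
canonical address, whether setting the top 16 bits back to ffff would make it
one, and where the marked byte sits in the Code: line.

```python
# Values copied from the 2.6.39 oops above.
RDX = 0x00008802F51CACC0
CODE_LINE = (
    "4c 89 c2 e8 6b e5 0f 00 45 85 e4 74 78 41 8d 44 24 "
    "ff 31 f6 41 ba 01 00 00 00 48 8d 3c c5 08 00 00 00 31 c0 eb 1a 0f 1f 44 "
    "00 00 <f0> 48 ff 42 30 48 89 54 05 00 48 83 c0 08 ff c6 48 39 f8 74 3b"
)

VA_BITS = 48  # assumption: 48-bit virtual addresses (4-level paging)


def is_canonical(addr, va_bits=VA_BITS):
    """True if bits 63..(va_bits - 1) are all zero or all one."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def split_code_line(line):
    """Split a Code: line into (bytes before the marker, bytes from it on)."""
    tokens = line.split()
    marked = [i for i, tok in enumerate(tokens) if tok.startswith("<")]
    if len(marked) != 1:
        raise ValueError("expected exactly one <..> marked byte")
    values = [int(tok.strip("<>"), 16) for tok in tokens]
    return bytes(values[:marked[0]]), bytes(values[marked[0]:])


# Hypothetical repair: force the top 16 bits back to ffff.
repaired = RDX | (0xFFFF << 48)
print(f"RDX      {RDX:016x} canonical: {is_canonical(RDX)}")
print(f"repaired {repaired:016x} canonical: {is_canonical(repaired)}")

before, after = split_code_line(CODE_LINE)
print(f"{len(before)} bytes before the marker, {len(before) + len(after)} in total")

# Hand-decode only the ModRM byte of the marked instruction (not a disassembler).
modrm = after[3]
mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
print(f"faulting bytes: {after[:5].hex(' ')}  mod={mod} reg={reg} rm={rm}")
```

The five bytes starting at the marker, f0 48 ff 42 30, hand-decode to a locked
64-bit increment at offset 0x30 from RDX (lock incq 0x30(%rdx)), so a
non-canonical RDX would be enough on its own to raise the general protection
fault. Which source line in dup_fd that instruction corresponds to is not
established here; the objdump output would be needed to confirm it.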