linux-kernel - Re: Kernel Panic on KVM Guests: "Scheduling while atomic: swapper''

Message-ID: <20110823063953.GA15288@redhat.com>
Date:	Tue, 23 Aug 2011 09:39:53 +0300
From:	Gleb Natapov <gleb@...hat.com>
To:	Iggy Iggy <ignatious1234@...il.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: Kernel Panic on KVM Guests: "Scheduling while atomic: swapper''

On Wed, Aug 17, 2011 at 10:40:15PM -0500, Iggy Iggy wrote:
> I've started seeing kernel panics on a few of our virtual machines
> after moving them (qemu-kvm, libvirt) off of a box with two Intel Xeon
> X5650 processors (12 cores total) onto one with four AMD Opteron 6174
> processors (48 cores total).
> 
> What is odd is that I feel like the panic is moving around on these
> virtual machines. It was only happening on one for a bit and then it
> stopped but started happening on another virtual machine. It also
> doesn't happen all the time but it can also happen frequently. Two
> days of not happening vs every four to six hours. The machine still
> functions to an extent but over time it crawls and needs to be
> destroyed and started back up.
> 
> This is the panic:
> Jul 20 06:35:47 test-db kernel: [10881.413875] BUG: scheduling while atomic: swapper/0/0x00010000
> Jul 20 06:35:47 test-db kernel: [10881.414184] Modules linked in: nf_conntrack_ftp i2c_piix4 i2c_core joydev virtio_net virtio_balloon virtio_blk virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
> Jul 20 06:35:47 test-db kernel: [10881.414196] Pid: 0, comm: swapper Not tainted 2.6.35.11-83.fc14.x86_64 #1
> Jul 20 06:35:47 test-db kernel: [10881.414198] Call Trace:
> Jul 20 06:35:47 test-db kernel: [10881.414205] [<ffffffff8103ffbe>] __schedule_bug+0x5f/0x64
> Jul 20 06:35:47 test-db kernel: [10881.414208] [<ffffffff8146845e>] schedule+0xd9/0x5cb
> Jul 20 06:35:47 test-db kernel: [10881.414214] [<ffffffff81072e20>] ? hrtimer_start_expires.clone.5+0x1e/0x20
> Jul 20 06:35:47 test-db kernel: [10881.414219] [<ffffffff81008345>] cpu_idle+0xca/0xcc
> Jul 20 06:35:47 test-db kernel: [10881.414223] [<ffffffff81451c66>] rest_init+0x8a/0x8c
> Jul 20 06:35:47 test-db kernel: [10881.414227] [<ffffffff81ba1c49>] start_kernel+0x40b/0x416
> Jul 20 06:35:47 test-db kernel: [10881.414231] [<ffffffff81ba12c6>] x86_64_start_reservations+0xb1/0xb5
> Jul 20 06:35:47 test-db kernel: [10881.414234] [<ffffffff81ba13c2>] x86_64_start_kernel+0xf8/0x107
> 
> The new server is running Scientific Linux 6.0 with kernel
> 2.6.32-131.6.1.el6.x86_64. One of the guests I see this on is running
> Fedora Core 14, kernel 2.6.35.13-92.fc14.x86_64 and the other is
> running Fedora Core 12, kernel 2.6.32.26-175.fc12.x86_64.
> 
This is a RHEL bug [1], not an upstream one, and should be reported elsewhere.
Just for the record, the bug is fixed in the latest RHEL kernel.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=683658

--
			Gleb.
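
Note on reading the BUG line quoted above: the last field (0x00010000) is
the preempt_count value printed by __schedule_bug(). The following is only
a minimal sketch for decoding it, assuming the bit layout described in
include/linux/hardirq.h for 2.6.35-era kernels (8 preempt bits, 8 softirq
bits, 10 hardirq bits, 1 NMI bit). The helper name decode_preempt_count is
hypothetical, and other kernel versions use a different layout.

```python
# Field widths, lowest bits first. This layout is an assumption taken from
# include/linux/hardirq.h in 2.6.35-era kernels; it is not universal.
FIELDS = (("preempt", 8), ("softirq", 8), ("hardirq", 10), ("nmi", 1))


def decode_preempt_count(value):
    """Split a preempt_count value into its per-context nesting counters."""
    counters = {}
    shift = 0
    for name, bits in FIELDS:
        mask = (1 << bits) - 1
        counters[name] = (value >> shift) & mask
        shift += bits
    return counters


if __name__ == "__main__":
    # Value taken from the "scheduling while atomic" line in the report.
    print(decode_preempt_count(0x00010000))
```

Under that assumed layout the value decodes to a hardirq count of 1 with
the other counters at zero, which would mean the idle task reached
schedule() while the kernel still considered itself inside a hard
interrupt handler.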