linux-kernel - Re: [PANIC, hyperv] BUG: unable to handle kernel paging request at ffff880077800004 (hv_ringbuffer


Message-ID: <20140825174132.GA17681@sucs.org>
Date:	Mon, 25 Aug 2014 18:41:32 +0100
From:	Sitsofe Wheeler <sitsofe@...il.com>
To:	Dexuan Cui <decui@...rosoft.com>
Cc:	KY Srinivasan <kys@...rosoft.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Haiyang Zhang <haiyangz@...rosoft.com>,
	"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Jean-Christophe Plagniol-Villard <plagnioj@...osoft.com>,
	"linux-fbdev@...r.kernel.org" <linux-fbdev@...r.kernel.org>
Subject: Re: [PANIC, hyperv] BUG: unable to handle kernel paging request at
 ffff880077800004 (hv_ringbuffer_write)

Hi Dexuan,

On Mon, Aug 25, 2014 at 02:02:21PM +0000, Dexuan Cui wrote:
> > -----Original Message-----
> > From: Sitsofe Wheeler
> > Sent: Wednesday, August 20, 2014 17:27 PM
> > 
> > While booting a Hyper-V 3.17.0-rc1 guest on a 2012 R2 host a BUG was
> > triggered while registering hyperv_fb which in turn caused a panic.
> > Various kernel debugging options (CONFIG_DEBUG_PAGEALLOC,
> > CONFIG_SLUB_DEBUG=y...) were on at the time. This only seems to happen
> > if the guest is being booted with only one CPU allocated to it.
>  
> I can reproduce the exact issue with the same commit + your kconfig + UP
> guest (SMP guest seems ok.)

Thanks for getting back - I was wondering if my mails had dropped into a
black hole as I haven't heard anything on any of them for a few days
(and no one had mentioned they had been able to reproduce the issues
reported).

> > [    7.645526] hv_vmbus: registering driver hyperv_fb
> > [    7.657553] BUG: unable to handle kernel paging request at
> > ffff880077800004
> > [    7.658224] IP: [<ffffffff8159a7ac>] hv_ringbuffer_write+0x7c/0x150
> > [    7.658224] PGD 2da9067 PUD 2dac067 PMD 7fa27067 PTE
> > 8000000077800060
> > [    7.658224] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
> It seems 
> hv_ringbuffer_write() -> 
>     hv_get_ringbuffer_availbytes():
>         reading rbi->ring_buffer->read_index causes a page fault.
> 
> It looks rbi->ring_buffer was unmapped somehow according to the
> semantics of CONFIG_DEBUG_PAGEALLOC??? Or, was there a memory
> corruption somewhere?
> 
> It looks the panic will disappear if the guest isn't configured with a 
> "Network Adapter ".

This sounds very fishy as if network setup has left things in a bad
state. What baffles me is the whole UP vs SMP thing - why would UP
make this show up consistently? Perhaps some assertions could be added
to check that rbi->ring_buffer still has sane values in it after
operations on it are finished?
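
As a rough illustration of the sort of assertion meant here, the sketch
below checks that the ring buffer pointer is set and that both indices
still fall inside the data area. It is a userspace sketch, not driver
code: struct ring_hdr, struct ring_info and ring_is_sane() are
hypothetical stand-ins and do not mirror the real vmbus layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared ring header (not the real layout). */
struct ring_hdr {
    uint32_t write_index;
    uint32_t read_index;
};

/* Hypothetical stand-in for the per-ring bookkeeping ("rbi" in the thread). */
struct ring_info {
    struct ring_hdr *ring_buffer;
    uint32_t ring_datasize; /* size of the data area in bytes */
};

/*
 * Returns true when the header pointer is set and both indices lie
 * inside the data area; false means something has gone wrong.
 */
static bool ring_is_sane(const struct ring_info *rbi)
{
    if (rbi == NULL || rbi->ring_buffer == NULL || rbi->ring_datasize == 0)
        return false;

    return rbi->ring_buffer->read_index < rbi->ring_datasize &&
           rbi->ring_buffer->write_index < rbi->ring_datasize;
}
```

In the kernel, a check like this would presumably be wrapped in WARN_ON()
and called after each operation that touches the ring. It cannot detect
an unmapped page by itself (the read would simply fault), but a fault
inside the check would at least land closer to whatever left the ring in
a bad state.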

I guess you could try switching things around and using
kmemcheck (https://www.kernel.org/doc/Documentation/kmemcheck.txt ). If
the whole area close to rbi->ring_buffer->read_index is being stomped on
it should show up. If it's just being set to a duff value or freed, that's
going to be harder to track down although poisoning before freeing
should allow us to distinguish that case...
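
To make the poisoning idea concrete, here is a small userspace sketch.
It assumes freed memory has been filled with 0x6b (the byte SLUB uses
for POISON_FREE), so an index that reads back as 0x6b6b6b6b points at
use-after-free, while any other out-of-range value points at corruption.
poison_region(), classify_index() and the verdict names are hypothetical
helpers for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_POISON_FREE 0x6b /* same byte value as the kernel's POISON_FREE */

enum verdict { LOOKS_OK, LOOKS_FREED, LOOKS_CORRUPT };

/* Fill a region with the free-poison byte, as would happen just before freeing. */
static void poison_region(void *p, size_t len)
{
    memset(p, DEMO_POISON_FREE, len);
}

/* Classify a 32-bit index read back from a ring whose data area is datasize bytes. */
static enum verdict classify_index(uint32_t idx, uint32_t datasize)
{
    if (idx == 0x6b6b6b6bu)
        return LOOKS_FREED;   /* matches the poison pattern */
    if (idx >= datasize)
        return LOOKS_CORRUPT; /* out of range, but not poison */
    return LOOKS_OK;
}
```

Note that with CONFIG_DEBUG_PAGEALLOC freed pages are unmapped rather
than merely poisoned, so a poison check of this kind would only be
useful with that option turned off.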

From your analysis this doesn't sound framebuffer related - perhaps we
could drop the linux-fbdev CCs on these mails going forward?

-- 
Sitsofe | http://sucs.org/~sits/