linux-kernel - Re: [bug, 2.6.26-rc4/rc5] sporadic bootup crashes in blk_lookup_devt()/prepare


Message-ID: <19f34abd0806090335p36e3e8a1kf180dda236356754@mail.gmail.com>
Date:	Mon, 9 Jun 2008 12:35:05 +0200
From:	"Vegard Nossum" <vegard.nossum@...il.com>
To:	"Andrew Morton" <akpm@...ux-foundation.org>
Cc:	"Ingo Molnar" <mingo@...e.hu>, linux-kernel@...r.kernel.org,
	"Jens Axboe" <jens.axboe@...cle.com>,
	"Greg Kroah-Hartman" <gregkh@...e.de>,
	"Linus Torvalds" <torvalds@...ux-foundation.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: [bug, 2.6.26-rc4/rc5] sporadic bootup crashes in blk_lookup_devt()/prepare_namespace()

On Mon, Jun 9, 2008 at 11:09 AM, Vegard Nossum <vegard.nossum@...il.com> wrote:
> On Mon, Jun 9, 2008 at 11:06 AM, Andrew Morton
> <akpm@...ux-foundation.org> wrote:
>> On Mon, 9 Jun 2008 10:03:12 +0200 Ingo Molnar <mingo@...e.hu> wrote:
>>
>>> -tip testing has started triggering a new type of sporadic bootup crash
>>> a few days ago. Find below a collection of 14 crashes i've managed to
>>> capture so far, which are all similar to this crash pattern:
>>>
>>>  BUG: unable to handle kernel paging request at ffff81003b984fb8
>>>  IP: [<ffffffff803fafd4>] blk_lookup_devt+0x42/0xa0
>>>  PGD 8063 PUD 9063 PMD 3be2d163 PTE 800000003b984160
>>>  Oops: 0000 [1] SMP DEBUG_PAGEALLOC
>>>
>>>  Call Trace:
>>>   [<ffffffff80bac17b>] ? ip_auto_config+0x0/0xd94
>>>   [<ffffffff80209259>] name_to_dev_t+0x145/0xeec
>>>   [<ffffffff803ff2be>] ? __next_cpu_nr+0x22/0x2b
>>>   [<ffffffff80b7f372>] prepare_namespace+0x91/0x14c
>>>   [<ffffffff80b7eb70>] kernel_init+0x2fe/0x314
>>>   [<ffffffff80251f3d>] ? trace_hardirqs_on_caller+0xca/0xee
>>>   [<ffffffff80741bbb>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>>>   [<ffffffff80251f3d>] ? trace_hardirqs_on_caller+0xca/0xee
>>>   [<ffffffff8020d3f8>] child_rip+0xa/0x12
>>>   [<ffffffff8020c90c>] ? restore_args+0x0/0x30
>>>   [<ffffffff8025068d>] ? trace_hardirqs_off+0xd/0xf
>>>   [<ffffffff80b7e872>] ? kernel_init+0x0/0x314
>>>   [<ffffffff8020d3ee>] ? child_rip+0x0/0x12
>>
>> Did you work out where it's dying?  Deref of `dev' I assume?
>
>                        struct gendisk *disk = dev_to_disk(dev);
>

I'm sorry, this is slightly misleading. dev_to_disk() doesn't contain any dereferences, so it obviously cannot be the source of the page fault. It is just simple pointer arithmetic.

The actual dereference happens on the next line, but it appears that gcc has collapsed this dereference and the pointer arithmetic above into a single instruction, cmp -0x44(%ebx), %esi. I assume the -0x44 is offsetof(minors in gendisk) - offsetof(device in gendisk).

So the error seems to be in dereferencing disk->minors, not dev.
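
To illustrate the point, here is a minimal userspace sketch of how a container_of()-style conversion followed by a member access reduces to a single load at a constant (here negative) offset from the original pointer. The struct layouts below are hypothetical stand-ins, not the real kernel definitions, so the offset will not be -0x44; the helper names minors_offset_from_dev() and read_minors_via_dev() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: the real kernel structs have many more fields. */
struct device {
	int placeholder[4];
};

struct gendisk {
	int major;
	int first_minor;
	int minors;
	struct device dev;	/* embedded, as assumed in the mail above */
};

/* Simplified container_of(): subtract the member offset, no memory access. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define dev_to_disk(device) container_of(device, struct gendisk, dev)

/* Constant a compiler can fold into the addressing mode of the load. */
static long minors_offset_from_dev(void)
{
	return (long)offsetof(struct gendisk, minors) -
	       (long)offsetof(struct gendisk, dev);
}

static int read_minors_via_dev(struct device *dev)
{
	struct gendisk *disk;

	assert(dev != NULL);
	disk = dev_to_disk(dev);	/* pointer arithmetic only */
	return disk->minors;		/* the only actual dereference */
}
```

With this layout, minors sits before dev, so the load address is dev plus a negative constant, which is the same shape as the -0x44(%ebx) operand.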

And the fact that this causes a page fault seems to be pure luck; if the struct device object is placed at an offset of 0x44 or higher within a page, the access won't fault (it will simply read some valid, unrelated memory). There is a pretty good chance of an address being offset by at least 0x44 bytes within a page, given that a whole page is 0x1000 bytes :-)

The other condition that must hold for this fault to trigger is that the previous page must not be mapped. Ouch. That sounds like two rare conditions!
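
As a rough sketch of the first condition, the following checks whether subtracting 0x44 from an address lands in the preceding page, and counts how many of the 0x1000 byte offsets within a page do so. It assumes 4 KiB pages and treats every byte offset as equally likely, which real allocators do not guarantee; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE	0x1000u
#define FOLDED_OFFSET	0x44u	/* magnitude of the -0x44 displacement */

/* True if (dev_addr - 0x44) falls in the page before dev_addr's page. */
static bool load_crosses_into_previous_page(uintptr_t dev_addr)
{
	return (dev_addr & (PAGE_SIZE - 1)) < FOLDED_OFFSET;
}

/* Number of in-page byte offsets for which the load crosses backwards. */
static unsigned int crossing_offsets(void)
{
	unsigned int count = 0;
	uintptr_t off;

	for (off = 0; off < PAGE_SIZE; off++)
		if (load_crosses_into_previous_page(off))
			count++;
	return count;
}
```

Under these assumptions only 0x44 of the 0x1000 possible offsets cross into the previous page, and even then the fault only occurs if that page happens to be unmapped.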


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036