linux-kernel - Re: [GIT PULL] Core block IO bits for 2.6.39

Message-ID: <4D8C43A0.3070908@fusionio.com>
Date:	Fri, 25 Mar 2011 08:26:24 +0100
From:	Jens Axboe <jaxboe@...ionio.com>
To:	Dave Chinner <david@...morbit.com>
CC:	Markus Trippelsdorf <markus@...ppelsdorf.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Chris Mason <chris.mason@...cle.com>
Subject: Re: [GIT PULL] Core block IO bits for 2.6.39 - early Oops

On 2011-03-25 05:41, Dave Chinner wrote:
> On Thu, Mar 24, 2011 at 08:34:41PM +0100, Markus Trippelsdorf wrote:
>> On 2011.03.24 at 19:58 +0100, Jens Axboe wrote:
>>> On 2011-03-24 19:54, Markus Trippelsdorf wrote:
>>>> On 2011.03.24 at 19:51 +0100, Jens Axboe wrote:
>>>>> On 2011-03-24 19:36, Jens Axboe wrote:
>>>>>> On 2011-03-24 19:30, Markus Trippelsdorf wrote:
>>>>>>> On 2011.03.24 at 14:43 +0100, Jens Axboe wrote:
>>>>>>>>
>>>>>>>> This is the main pull request for the block IO layer and friends for
>>>>>>>> 2.6.39.
>>>>>>>
>>>>>>> This merge results in an early oops on my system (amd64, xfs).
>>>>>>> See the attached photo.
>>>>>>>
>>>>>>
>>>>>> Ouch. Can you ensure that you have CONFIG_DEBUG_INFO=y in your .config
>>>>>> and then do:
>>>>>>
>>>>>> $ gdb vmlinux
>>>>>> ...
>>>>>> l *cfq_insert_request+0x32
>>>>>>
>>>>>> and send that output?
>>>>>
>>>>> I took a closer look at the oops, and it most likely means q ==
>>>>> NULL (offset 0x18 == q->elevator). You left out the Code part, so I
>>>>> can't verify that for certain, and a NULL q makes very little sense.
>>>>> I take it this is 100% reproducible? When you send the gdb output,
>>>>> please also attach your .config.
>>>>
>>>> Yes, it's 100% reproducible here. My .config follows:
>>>
>>> Can you try this patch and see if it makes a difference?
>>
>> There's no patch ;-)
>>
>>> If you boot without the patch and add elevator=noop, does it then work?
>>
>> It works insofar as the Oops is gone. But my xfs partitions apparently
>> still get corrupted (I had to run xfs_repair on several of them, because
>> they would not mount otherwise).
> 
> So the patchset is causing repeatable filesystem corruption? Sounds
> to me like this series is not yet ready for mainline merging. The last
> thing I want is to spend the .39 cycle helping people recover busted
> filesystems as a result of undercooked block layer changes...

Well, the last thing I want to do is be responsible for screwing people's
file systems. I have been running these changes on my laptop, desktop,
and test machines for the last month at least. It's been in linux-next
for about that long, too. I'm extremely puzzled at this issue that
Markus reports.

So believe me, if we can't resolve this very quickly then we'll pull it
back out.

-- 
Jens Axboe
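
The reasoning quoted earlier in the thread (a fault at offset 0x18 suggests
q == NULL, because that offset is where q->elevator lives) can be
illustrated with a small sketch. It is not taken from the kernel source:
DemoQueue, its pad field, and faulting_address() are hypothetical names, and
the layout is simply assumed so that the elevator member lands at 0x18. The
real struct request_queue layout depends on the kernel version and .config.

```python
import ctypes


class DemoQueue(ctypes.Structure):
    """Hypothetical stand-in for a queue structure (NOT the real layout).

    Three 8-byte words of padding are assumed so that the 'elevator'
    member sits at byte offset 0x18, matching the offset in the thread.
    """
    _fields_ = [
        ("pad", ctypes.c_uint64 * 3),      # assumed members before 'elevator'
        ("elevator", ctypes.c_void_p),     # pointer-sized member at 0x18
    ]


def faulting_address(base, field):
    """Address a CPU would touch when reading `field` through pointer `base`."""
    return base + getattr(DemoQueue, field).offset


if __name__ == "__main__":
    # With a NULL base pointer the access lands at the member offset itself,
    # which is why a very small faulting address hints at a NULL struct pointer.
    print(hex(faulting_address(0, "elevator")))
```

Under these assumptions, reading the elevator member through a NULL pointer
touches address 0x18, which is the kind of small faulting address that
points at a NULL struct pointer rather than a corrupted one.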
