Message-ID: <5016D2C0.6090708@vetienne.net>
Date: Mon, 30 Jul 2012 20:30:24 +0200
From: Vincent ETIENNE <ve@...ienne.net>
To: Vincent ETIENNE <vetienne@...ogsys.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Alexander Viro <viro@...iv.linux.org.uk>,
ocfs2-devel@....oracle.com
Subject: Re: kernel BUG at fs/buffer.c:2886! Linux 3.5.0
On 30/07/2012 09:53, Joel Becker wrote:
> On Mon, Jul 30, 2012 at 09:45:14AM +0200, Vincent ETIENNE wrote:
>> On 30/07/2012 08:30, Joel Becker wrote:
>>> On Sat, Jul 28, 2012 at 12:18:30AM +0200, Vincent ETIENNE wrote:
>>>> Hello
>>>>
>>>> I get this on the first write (made by deliver, sending mail to report
>>>> the restart of services).
>>>> The home partition (the one receiving the mail) is an ocfs2 filesystem
>>>> created on a drbd block device in primary/primary mode.
>>>> These drbd devices are backed by lvm.
>>>>
>>>> The system is running linux-3.5.0. The symptom is identical with linux
>>>> 3.3 and 3.2, but a linux 3.0 kernel works.
>>>>
>>>> Reproduced on two machines with different hardware (software md raid
>>>> on SATA on this one, an areca hardware raid card on the second one),
>>>> but these are the two machines sharing this partition (so they share
>>>> the same data).
>>> Hmm. Any chance you can bisect this further?
>> Will try to. Will take a few days as the server is in production ( but
>> used as backup so...)
>>
>>>> Jul 27 23:41:41 jupiter2 kernel: [ 351.169213] ------------[ cut here ]------------
>>>> Jul 27 23:41:41 jupiter2 kernel: [ 351.169261] kernel BUG at fs/buffer.c:2886!
>>> This is:
>>>
>>> BUG_ON(!buffer_mapped(bh));
>>>
>>> in submit_bh().
>>>
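To make the failing check concrete, here is a minimal user-space sketch of what BUG_ON(!buffer_mapped(bh)) guards against. It is not the kernel code: struct model_bh, the bit name BH_MAPPED_BIT, and the model_* functions are hypothetical stand-ins invented for illustration. The only assumption taken from the thread is that submit_bh() refuses a buffer whose "mapped" flag is not set, i.e. a buffer that has not been assigned an on-disk block.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's buffer_head; illustration only. */
enum { BH_MAPPED_BIT = 0 };           /* assumed bit position, not the real one */

struct model_bh {
    unsigned long state;              /* flag bits, loosely modelled on b_state */
    unsigned long long blocknr;       /* on-disk block, meaningful only if mapped */
};

/* True when the buffer has been given a disk mapping. */
static bool model_buffer_mapped(const struct model_bh *bh)
{
    return ((bh->state >> BH_MAPPED_BIT) & 1UL) != 0;
}

/* Record a disk block for the buffer and mark it mapped. */
static void model_map_bh(struct model_bh *bh, unsigned long long blocknr)
{
    bh->blocknr = blocknr;
    bh->state |= 1UL << BH_MAPPED_BIT;
}

/*
 * Model of the submit path: returns 0 when I/O could be issued and -1 at
 * the point where the real kernel would trip BUG_ON(!buffer_mapped(bh)).
 */
static int model_submit_bh(const struct model_bh *bh)
{
    if (!model_buffer_mapped(bh))
        return -1;
    return 0;
}
```

In this model, submitting a freshly zeroed struct model_bh returns -1, while calling model_map_bh() first makes the submit succeed. The BUG in the report would then mean some caller reached the submit path with a buffer that had never been mapped to a block.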
>>> system_call_fastpath+0x16/0x1b
>>> This stack trace is from 3.5, because of the location of the
>>> BUG. The call path in the trace suggests the code added by Al's ea022d,
>>> but you say it breaks in 3.2 and 3.3 as well. Can you give me a trace
>>> from 3.2?
>> For a 3.2 kernel I get this stack trace. It is a different trace from 3.5,
>> but it occurs at exactly the same moment and for the same reasons.
>> It seems to be less immediate than with 3.5, but that is more a subjective
>> impression than something based on fact. (It takes a few seconds after
>> deliver is started to hit the bug.)
> Totally different stack trace. Not in symlink code, but instead in
> fallocate. Weird. I wonder if you are hitting two things. Bisection
> will definitely help.
Yes, that could be; it would explain the two different stack traces (and the
different timing observed).
Bisection is in progress. The fallocate bug is most likely already
fixed (information sent by
sunil.mushran@...il.com, but apparently not yet available on the list):
------
The fallocate() oops is probably the same one that is fixed by this patch.
https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=commit;h=a2118b301104a24381b414bc93371d666fe8d43a
It is in the list of patches that are ready to be pushed.
https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=shortlog;h=mw-3.4-mar15
----
But I am not sure it will fix everything I observed, so I will continue to
bisect to confirm or rule that out.
(However, I seem to have lost network access to my server after a reboot,
so I cannot reach it before tomorrow. I probably forgot to run make
modules_install before installing the new kernel... Being stupid is not
very helpful...) I hope to finish the bisection tomorrow or Wednesday.
Thanks a lot for the support.
> Joel
>
>