Date:	Sat, 18 Aug 2007 16:51:15 +0100
From:	Måns Rullgård <mans@...sr.com>
To:	linux-kernel@...r.kernel.org
Subject:  Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

Chris Boot <bootc@...tc.net> writes:

> Måns Rullgård wrote:
>> Chris Boot <chris.boot@...custree.com> writes:
>>
>>
>>> All,
>>>
>>> I've got a box running RHEL5 and haven't been impressed by ext3
>>> performance on it (running off a 1.5TB HP MSA20 using the cciss
>>> driver). I compiled XFS as a module and tried it out since I'm used to
>>> using it on Debian, which runs much more efficiently. However, every
>>> so often the kernel panics as below. Apologies for the tainted kernel,
>>> but we run VMware Server on the box as well.
>>>
>>> Does anyone have any hints/tips for using XFS on Red Hat? What's
>>> causing the panic below, and is there a way around this?
>>>
>>> BUG: unable to handle kernel paging request at virtual address b8af9d60
>>> printing eip:
>>> c0415974
>>> *pde = 00000000
>>> Oops: 0000 [#1]
>>> SMP
>>> last sysfs file: /block/loop7/dev
[...]
>>> [<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
>>> [<c04572f9>] shrink_slab+0x56/0x13c
>>> [<c0457c0c>] try_to_free_pages+0x162/0x23e
>>> [<c0454064>] __alloc_pages+0x18d/0x27e
>>> [<c045214e>] find_or_create_page+0x53/0x8c
>>> [<c046c7b1>] __getblk+0x162/0x270
>>> [<c0475be0>] do_lookup+0x53/0x157
>>> [<f889138f>] ext3_getblk+0x7c/0x233 [ext3]
>>> [<f88913fe>] ext3_getblk+0xeb/0x233 [ext3]
>>> [<c048215c>] mntput_no_expire+0x11/0x6a
>>> [<f889226e>] ext3_bread+0x13/0x69 [ext3]
>>> [<f8895606>] htree_dirblock_to_tree+0x22/0x113 [ext3]
>>> [<f889574f>] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
>>> [<c047828b>] do_path_lookup+0x20e/0x25f
>>> [<c046b987>] get_empty_filp+0x99/0x15e
>>> [<f889d611>] ext3_permission+0x0/0xa [ext3]
>>> [<f888eaa3>] ext3_readdir+0x1ce/0x59b [ext3]
>>> [<c047a0dd>] filldir+0x0/0xb9
>>> [<c0472973>] sys_fstat64+0x1e/0x23
>>> [<c047a1f9>] vfs_readdir+0x63/0x8d
>>> [<c047a0dd>] filldir+0x0/0xb9
>>> [<c047a447>] sys_getdents+0x5f/0x9c
>>> [<c0403eff>] syscall_call+0x7/0xb
>>> =======================
>>>
>>
>> Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
>> seems to be enough to overflow it.
>>
> Thanks, that explains a lot. However, I don't have any XFS filesystems
> mounted over loop devices on ext3. Earlier in the day I had iso9660 on
> loop on XFS; could that have caused the issue? It was unmounted and
> deleted when this panic occurred.

The mention of /block/loop7/dev and the presence of both XFS and ext3
functions in the call stack suggested to me that you might have an ext3
filesystem on a loop device on XFS.  Whatever the exact combination, I
see no explanation for that call stack other than a stack overflow, so
either way we're back at the same root cause.

Are you using device-mapper and/or md?  They too are known to blow 4k
stacks when used with XFS.
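
If you want to see which functions are the worst offenders before
deciding, scripts/checkstack.pl from the kernel source will list static
stack frame sizes; roughly (the paths below are only examples, adjust
them to your build tree and objects):

    objdump -d vmlinux | perl scripts/checkstack.pl i386 | head -20
    objdump -d fs/xfs/xfs.ko | perl scripts/checkstack.pl i386 | head -20

That only shows per-function frame sizes, not the depth of a whole call
chain like the one in your oops, but it gives an idea of how quickly 4k
disappears.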

> I'll probably just try and recompile the kernel with 8k stacks and see
> how it goes. Screw the support; we're unlikely to get it anyway. :-P

Please report how this works out.
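
For reference, on an upstream i386 2.6.18 tree the switch is
CONFIG_4KSTACKS under "Kernel hacking"; something like this should get
you 8k stacks (treat it as a sketch -- the option name is the upstream
one, and I haven't checked the exact menu wording in the RHEL5 source):

    grep 4KSTACKS .config    # CONFIG_4KSTACKS=y means 4k stacks
    make menuconfig          # Kernel hacking -> uncheck "Use 4Kb for kernel stacks instead of 8Kb"
    make && make modules_install install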

-- 
Måns Rullgård
mans@...sr.com

