Message-ID: <CA502B3E9EE27B4490C87C12E3C7C85111D033@pdsmsx412.ccr.corp.intel.com>
Date: Tue, 25 Jul 2006 17:59:44 +0800
From: "Mao, Bibo" <bibo.mao@...el.com>
To: "Andrew Morton" <akpm@...l.org>
Cc: <linux-kernel@...r.kernel.org>, <ext2-devel@...ts.sourceforge.net>
Subject: RE: Question about ext3 jbd module
Yes, the kernel version is 2.6.9; it is the RHEL4 distribution kernel. I run the LTP stress test, which includes memory, file system, and other stress tests. My machine has 4 physical IA64 CPUs, each with dual cores and hyperthreading enabled. I added a printk at the head of journal_dirty_metadata(); the jh value there is NULL.
The LTP tool version is ltp-full-20060412, the test case is testscripts/ltpstress.sh, and the stress test crashes both with and without the patch in https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=158363.
Thanks
Bibo,mao
>-----Original Message-----
>From: Andrew Morton [mailto:akpm@...l.org]
>Sent: 25 July 2006 17:05
>To: Mao, Bibo
>Cc: linux-kernel@...r.kernel.org; ext2-devel@...ts.sourceforge.net
>Subject: Re: Question about ext3 jbd module
>
>On Tue, 25 Jul 2006 08:06:02 +0000
>"bibo, mao" <bibo.mao@...el.com> wrote:
>
>> Hi,
>> When I run the LTP stress test on my IA64 box with the distribution
>> kernel, the kernel crashes within three days. I think this
>> problem should also exist in recent kernels, but I have not
>> triggered it there. The problem is in journal_dirty_metadata(),
>> which contains this line:
>> struct journal_head *jh = bh2jh(bh);
>> From my debug results, jh is NULL at this point; I do not know whether
>> it is due to contention or because the kernel fails to handle the NULL
>> pointer case.
>>
>> I wrote a patch; with it, the LTP stress test no longer crashes. But
>> I am not familiar with the filesystem code, so I do not know whether this
>> is the root cause or what negative side effects the patch may have.
>>
>> Thanks
>> bibo,mao
>>
>> --- linux-2.6.9/fs/jbd/transaction.c.orig	2006-06-30 14:05:58.000000000 +0800
>> +++ linux-2.6.9/fs/jbd/transaction.c	2006-07-07 02:56:32.000000000 +0800
>> @@ -1104,13 +1104,15 @@ int journal_dirty_metadata(handle_t *han
>> {
>> transaction_t *transaction = handle->h_transaction;
>> journal_t *journal = transaction->t_journal;
>> - struct journal_head *jh = bh2jh(bh);
>> + struct journal_head *jh;
>>
>> - jbd_debug(5, "journal_head %p\n", jh);
>> - JBUFFER_TRACE(jh, "entry");
>> if (is_handle_aborted(handle))
>> goto out;
>>
>> + jh = journal_add_journal_head(bh);
>> + jbd_debug(5, "journal_head %p\n", jh);
>> + JBUFFER_TRACE(jh, "entry");
>> +
>> jbd_lock_bh_state(bh);
>>
>> /*
>> @@ -1154,6 +1156,7 @@ int journal_dirty_metadata(handle_t *han
>> spin_unlock(&journal->j_list_lock);
>> out_unlock_bh:
>> jbd_unlock_bh_state(bh);
>> + journal_put_journal_head(jh);
>> out:
>> JBUFFER_TRACE(jh, "exit");
>> return 0;
>>
>
>That's a worry. We've attached a journal_head and we've done
>do_get_write_access() and we're now proceeding to journal the buffer as
>metadata but someone has presumably gone and run
>__journal_try_to_free_buffer() against the thing and has stolen our
>journal_head.
>
>Simply reattaching a new journal_head is most likely wrong - we'll lose
>whatever state was supposed to be in the old one (like, which journal list
>this bh+jh is on).
>
>Somewhere, somehow, that journal_head has passed through a state which
>permitted __journal_try_to_free_buffer() to free it while appropriate locks
>were not held. I wonder where.
>
>Your diff headers claim to be against 2.6.9. Is that so?
>
>Would it be correct to assume that there was some page replacement pressure
>happening at the time?
>
>It looks like a big box - can you describe it a bit please?
>
>> Pid: 28417, CPU 13, comm: inode02
>> psr : 0000121008126010 ifs : 800000000000040d ip : [<a000000200134721>] Not tainted
>> ip is at journal_dirty_metadata+0x2c1/0x5e0 [jbd]
>> unat: 0000000000000000 pfs : 0000000000000917 rsc : 0000000000000003
>> rnat: 0000000000000000 bsps: 0000000000000000 pr : 005965a026595569
>> ldrs: 0000000000000000 ccv : 0000000000060011 fpsr: 0009804c8a70033f
>> csd : 0000000000000000 ssd : 0000000000000000
>> b0 : a0000002001d3350 b6 : a000000100589f20 b7 : a0000001001fee60
>> f6 : 1003e0000000000000000 f7 : 1003e0000000000000080
>> f8 : 1003e00000000000008c1 f9 : 1003effffffffffffc0a0
>> f10 : 100049c8d719c0533ddf0 f11 : 1003e00000000000008c1
>> r1 : a000000200330000 r2 : a000000200158630 r3 : a000000200158630
>> r8 : e0000001d885631c r9 : 0000000000000000 r10 : e0000001bede66e0
>> r11 : 0000000000000010 r12 : e0000001a8a67d40 r13 : e0000001a8a60000
>> r14 : 0000000000060011 r15 : 0000000000000000 r16 : e0000001fef76e80
>> r17 : e0000001e3e34b38 r18 : 0000000000000020 r19 : 0000000000060011
>> r20 : 00000000000e0011 r21 : 0000000000060011 r22 : 0000000000000000
>> r23 : 0000000000000000 r24 : 0000000000000000 r25 : e0000001e6cc8090
>> r26 : e0000001d88561e0 r27 : 0000000044bc2661 r28 : e0000001d8856078
>> r29 : 000000007c6fe61a r30 : e0000001e6cc80e8 r31 : e0000001d8856080
>>
>> Call Trace:
>> [<a000000100016b20>] show_stack+0x80/0xa0
>> sp=e0000001a8a67750 bsp=e0000001a8a61360
>> [<a000000100017430>] show_regs+0x890/0x8c0
>> sp=e0000001a8a67920 bsp=e0000001a8a61318
>> [<a00000010003dbf0>] die+0x150/0x240
>> sp=e0000001a8a67940 bsp=e0000001a8a612d8
>> [<a00000010003dd20>] die_if_kernel+0x40/0x60
>> sp=e0000001a8a67940 bsp=e0000001a8a612a8
>> [<a00000010003f930>] ia64_fault+0x1450/0x15a0
>> sp=e0000001a8a67940 bsp=e0000001a8a61250
>> [<a00000010000f540>] ia64_leave_kernel+0x0/0x260
>> sp=e0000001a8a67b70 bsp=e0000001a8a61250
>> [<a000000200134720>] journal_dirty_metadata+0x2c0/0x5e0 [jbd]
>> sp=e0000001a8a67d40 bsp=e0000001a8a611e0
>> [<a0000002001d3350>] ext3_mark_iloc_dirty+0x750/0xc00 [ext3]
>> sp=e0000001a8a67d40 bsp=e0000001a8a61150
>> [<a0000002001d3a90>] ext3_mark_inode_dirty+0xb0/0xe0 [ext3]
>> sp=e0000001a8a67d40 bsp=e0000001a8a61128
>> [<a0000002001ce3d0>] ext3_new_inode+0x1670/0x1a80 [ext3]
>> sp=e0000001a8a67d60 bsp=e0000001a8a61068
>> [<a0000002001df5e0>] ext3_mkdir+0x120/0x940 [ext3]
>> sp=e0000001a8a67da0 bsp=e0000001a8a61008
>> [<a000000100148250>] vfs_mkdir+0x250/0x380
>> sp=e0000001a8a67db0 bsp=e0000001a8a60fb0
>> [<a0000001001484f0>] sys_mkdir+0x170/0x280
>> sp=e0000001a8a67db0 bsp=e0000001a8a60f30
>> [<a00000010000f3e0>] ia64_ret_from_syscall+0x0/0x20
>>