Date:	Fri, 4 Apr 2008 11:27:35 +1000
From:	David Chinner <dgc@....com>
To:	Jan Kara <jack@...e.cz>
Cc:	David Chinner <dgc@....com>, lkml <linux-kernel@...r.kernel.org>
Subject: Re: BUG: ext3 hang in transaction commit

On Thu, Apr 03, 2008 at 12:07:42PM +0200, Jan Kara wrote:
>   Hi,
> 
> > ia32 XFS QA machine, ext3 root on a raw partition. 2.6.25-rc3.
> > 
> > kjournald hung in journal_commit_transaction():
> > 
> > Stack traceback for pid 2046
> > 0xf6e0b350     2046        2  0    0   D  0xf6e0b550  kjournald
> > esp        eip        Function (args)
> > 0xf68d5e70 0xc04c20b2 schedule+0x51e
> > 0xf68d5ec0 0xc04c2394 io_schedule+0x1d
> > 0xf68d5ecc 0xc0179b64 sync_buffer+0x33 (invalid)
> > 0xf68d5ed4 0xc04c2570 __wait_on_bit+0x36 (0xc2000b78, 0xf68d5f00, 0xc0179b31, 0x2)
> > 0xf68d5ef0 0xc04c25ef out_of_line_wait_on_bit+0x58 (0xde89bc18, 0x2, 0xc0179b31, 0x2)
> > 0xf68d5f2c 0xc0179adf __wait_on_buffer+0x19
> > 0xf68d5f38 0xc01cecba journal_commit_transaction+0x40b (0xf6d94a00)
> > 0xf68d5fa0 0xc01d180a kjournald+0xa4 (0xf6d94a00)
> > 0xf68d5fd4 0xc01301f1 kthread+0x3b (invalid)
> 
>   I suppose this is wait_on_buffer() in line 444 in fs/jbd/commit.c, isn't it?

No idea. I haven't looked at the code....

> > We're waiting on the last page/buffer in the file, and it doesn't appear
> > to be under writeback....
>   We wait for write of ordered-data to finish. Which seems to never
> happen. Page isn't under writeback, but that just means we submitted the
> buffer from the commit code (that doesn't change the page state).
>   Anyway, the cause is that either due to some bug IO never finished and

Yes, I certainly believe that is possible. We see it often enough with
XFS....

> so buffer never got unlocked, or we somewhere locked the buffer and
> forgot to unlock it (but I've checked all the relevant places and think
> they are correct). The traces of all the processes seem harmless - I see
> no trace where we are holding a buffer lock.
>   If you happen to hit this again, please let me know and I'll look into
> it further...

Will do.
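
For anyone following along, the two possibilities quoted above (I/O that
never completes, or a lock that is taken and never released) both look the
same to the waiter. Here is a rough user-space sketch of that, in Python
rather than kernel C. It is not the jbd code: Buffer, submit_write and
commit are made-up names, and the timeout exists only so the demo returns
instead of hanging the way kjournald did.

```python
import threading


class Buffer:
    """Toy stand-in for a buffer head with a single 'locked' state."""

    def __init__(self):
        # Event set == buffer unlocked; cleared == buffer locked.
        self._unlocked = threading.Event()
        self._unlocked.set()

    def lock(self):
        self._unlocked.clear()

    def unlock(self):
        self._unlocked.set()

    def wait_on_buffer(self, timeout):
        # True if the buffer was unlocked within `timeout` seconds,
        # False if we gave up waiting (the "hang" case).
        return self._unlocked.wait(timeout)


def submit_write(buf, io_completes):
    """Lock the buffer and 'submit' it; completion is what unlocks it."""
    buf.lock()
    if io_completes:
        # Hypothetical completion handler firing a little later.
        threading.Timer(0.05, buf.unlock).start()
    # Otherwise the completion is lost and nobody ever unlocks the buffer.


def commit(buf, io_completes, timeout=0.5):
    """Submit the ordered-data write, then wait for it, as the commit does."""
    submit_write(buf, io_completes)
    return buf.wait_on_buffer(timeout)


if __name__ == "__main__":
    print("I/O completes:", commit(Buffer(), io_completes=True))
    print("I/O lost:     ", commit(Buffer(), io_completes=False))
```

In the second case the waiter has no way to tell a lost completion from a
forgotten unlock, which is why the traces of the other processes matter.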

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
