linux-ext4 - RE: [PATCH 2/2] jbd: fix fsync() tid wraparound bug

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BCB84D936723884B91E4CC5CA0A7C54AA4F95A9F32@EMEADALEXMB1.corp.nai.org>
Date:	Wed, 4 May 2011 09:21:04 -0500
From:	<Martin_Zielinski@...fee.com>
To:	<tytso@....edu>, <jack@...e.cz>
CC:	<linux-ext4@...r.kernel.org>
Subject: RE: [PATCH 2/2] jbd: fix fsync() tid wraparound bug

Here's an update.
In my first post I was not aware of the implementation of tid_gt.
I agree that 2 and a half billion commits on an SD card are - hmph - unlikely

The kernels with the patch is not yet installed because we recently provided a kernel with a panic() instead of J_ASSERT and a crashkernel. 
We now have 5 cores all showing the same behavior in j_commit_sequence vs. j_commit_request.

What the cores show (snipped):
(gdb) bt
#0  crash_setup_regs (regs=0x0) at /usr/src/debug/linux-2.6.32/arch/x86/include/asm/kexec.h:127
#1  crash_kexec (regs=0x0) at kernel/kexec.c:1076
#2  0xffffffff81048872 in panic (fmt=0xffffffff818bdb95 "running transaction is NULL")
    at kernel/panic.c:82
#3  0xffffffff81159b79 in journal_commit_transaction (journal=0xffff88031e6c4600)
    at fs/jbd/commit.c:347
#4  0xffffffff8115de49 in kjournald (arg=<value optimized out>) at fs/jbd/journal.c:150
#5  0xffffffff8105fccf in kthread (_create=<value optimized out>) at kernel/kthread.c:78
#6  0xffffffff8100c9da in ?? ()
#7  0x0000000000000000 in ?? ()

gdb) p *journal
$4 = {j_flags = 16, j_errno = 0, j_sb_buffer = 0xffff88031f156dc8, 
  j_superblock = 0xffff88031f876000, j_format_version = 2, j_state_lock = {raw_lock = {
      slock = 2874125135}}, j_barrier_count = 0, j_barrier = {count = {counter = 1}, wait_lock = {
      raw_lock = {slock = 0}}, wait_list = {next = 0xffff88031e6c4638, 
      prev = 0xffff88031e6c4638}, owner = 0x0}, j_running_transaction = 0x0, 
  j_committing_transaction = 0x0, j_checkpoint_transactions = 0xffff88031bd16b40,
...
j_tail_sequence = 2288011385, j_transaction_sequence = 2288014068, 
  j_commit_sequence = 2288014067, j_commit_request = 140530417,
...
  j_wbuf = 0xffff88031de98000, j_wbufsize = 512, j_last_sync_writer = 4568, 
  j_average_commit_time = 69247, j_private = 0xffff88031fd49400}

Means that j_commit_request would be considered to be greater than j_commit_sequence and trigger a commit.

The j_last_sync_writer has a promising backtrace:

crash> bt 4568
PID: 4568   TASK: ffff8802c38ce300  CPU: 8   COMMAND: "core"
 #0 [ffff8802c38d1da8] schedule at ffffffff8159f5e8
 #1 [ffff8802c38d1e50] log_wait_commit at ffffffff8115d4a1
 #2 [ffff8802c38d1ec0] ext3_sync_file at ffffffff8113c589
 #3 [ffff8802c38d1ef0] vfs_fsync_range at ffffffff81105588
 #4 [ffff8802c38d1f30] vfs_fsync at ffffffff81105612
 #5 [ffff8802c38d1f40] do_fsync at ffffffff81105646
 #6 [ffff8802c38d1f70] sys_fdatasync at ffffffff8110566a
 #7 [ffff8802c38d1f80] system_call_fastpath at ffffffff8100ba2b
    RIP: 00007f5be9bd6587  RSP: 0000000047386f38  RFLAGS: 00010202
    RAX: 000000000000004b  RBX: ffffffff8100ba2b  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: 00000000000000f7
    RBP: 00007f5aecf23cc8   R8: 00007f5bc54e56a0   R9: 00000000000011d8
    R10: 0000000001c01800  R11: 0000000000000202  R12: ffffffff8110566a
    R13: ffff8802c38d1f78  R14: 00007f5aecf23b98  R15: 0000000000000000
    ORIG_RAX: 000000000000004b  CS: 0033  SS: 002b

To make a long story short: The inode did not carry the j_commit_request tid.

But:
crash> foreach files
...
247 ffff88031e65bb00 ffff88026c3ea540 ffff88031f0c0758 REG  /opt/data/users/user.db
...
(this is an sqlite database)

And:
(gdb) p ((struct ext3_inode_info*)(0xffff88031f0c0758-0xd0))->i_sync_tid
$5 = {counter = -2006954411}
(gdb) p ((struct ext3_inode_info*)(0xffff88031f0c0758-0xd0))->i_datasync_tid
$3 = {counter = 140530417}

> j_commit_request = 140530417

So it *is* a datasync from sqlite. And your fix will catch it. 
I still don't understand, where this number comes from. 

Unfortunately I cannot provide the core files as they are coming from customers. 
But if there is anything I can do to provide further information, please let me know.

Cheers,
Martin

-----Original Message-----
From: Ted Ts'o [mailto:tytso@....edu] 
Sent: Montag, 2. Mai 2011 23:31
To: Jan Kara
Cc: Ext4 Developers List; Zielinski, Martin
Subject: Re: [PATCH 2/2] jbd: fix fsync() tid wraparound bug

On Mon, May 02, 2011 at 09:04:41PM +0200, Jan Kara wrote:
>   Ah, OK. I didn't know it's so many reports. Indeed it looks like there's
> another bug. But then the WARN you have added should tell us more about
> who's causing the problem, right?

That's the hope.  I'm not sure WARNs get logged in Android's reporting
system.  I need to work the Android folks to figure this out.  Worst
case I may have to turn the WARN into a BUG for them (although it may
be tough to convince them to make such a change).  I'm hoping though
that the debugging information from Martin's machines will also come
through with some useful information that will allow us to root cause
whatever is causing this to happen.

							- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html