linux-kernel - [PATCH 0/5] jbd: possible filesystem corruption fixes (take 2)

Message-ID: <4843CE15.6080506@hitachi.com>
Date:	Mon, 02 Jun 2008 19:40:21 +0900
From:	Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>
To:	akpm@...ux-foundation.org, sct@...hat.com, adilger@...sterfs.com
Cc:	linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org,
	jack@...e.cz, jbacik@...hat.com, cmm@...ibm.com, tytso@....edu,
	sugita <yumiko.sugita.yf@...achi.com>,
	Satoshi OSHIMA <satoshi.oshima.fk@...achi.com>
Subject: [PATCH 0/5] jbd: possible filesystem corruption fixes (take 2)

This patch set is take 2 of the fixes for error handling problems in
ext3/JBD.  The previous discussion can be found here:
http://lkml.org/lkml/2008/5/14/10

The same problems should also exist in ext4/JBD, but I haven't
prepared patches for it yet.

Problem
=======
Currently some error checks are missing, so the journal cannot abort
correctly.  This breaks the ordered mode rule and can lead to
filesystem corruption.  The missing error checks are:

(1) error check for dirty buffers flushed before the commit
    (addressed by PATCH 1/5 and 2/5)
(2) error check for the metadata writes to the journal before the
    commit (addressed by PATCH 3/5)
(3) error check for checkpointing and replay (addressed by PATCH 4/5
    and 5/5)
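
To illustrate items (1) and (2), here is a minimal userspace sketch of
the intended ordering: check the data buffers and the journal metadata
writes for I/O errors, and abort instead of writing the commit record
if anything failed.  This is not code from the patches.  All names
(model_journal, model_buffer, model_commit and so on) are hypothetical
stand-ins, and the "aborted" field only models the role of the abort
flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the real JBD types. */
struct model_buffer {
	bool write_failed;	/* simulated write-out hit an I/O error */
};

struct model_journal {
	bool aborted;		/* models the journal abort flag */
	int err;		/* error recorded when the journal aborted */
	bool commit_written;	/* commit record reached the "disk" */
};

/* Mark the journal aborted; the first recorded error wins. */
static void model_journal_abort(struct model_journal *j, int err)
{
	if (!j->aborted) {
		j->aborted = true;
		j->err = err;
	}
}

/* Return -EIO if any buffer in the array failed to reach the disk. */
static int model_check_buffers(const struct model_buffer *bufs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (bufs[i].write_failed)
			return -EIO;
	return 0;
}

/*
 * Commit one transaction.  The commit record is written only if both
 * the ordered-mode data buffers (check 1) and the metadata blocks
 * written to the journal (check 2) reached the disk without error.
 */
static int model_commit(struct model_journal *j,
			const struct model_buffer *data, size_t ndata,
			const struct model_buffer *meta, size_t nmeta)
{
	int err;

	assert(j != NULL);
	if (j->aborted)
		return j->err;	/* no further commits after an abort */

	err = model_check_buffers(data, ndata);
	if (!err)
		err = model_check_buffers(meta, nmeta);
	if (err) {
		/* Abort just before the commit record would be written. */
		model_journal_abort(j, err);
		return err;
	}

	j->commit_written = true;
	return 0;
}
```

In this model a failed data write leaves commit_written false, so the
transaction never becomes valid on disk, which is what the ordered
mode rule requires.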

Changes from take 1
===================
[PATCH 1/5]
o not changed

[PATCH 2/5]
o rewrite my comment in journal_dirty_data() to make it easier to
  understand

[PATCH 3/5]
o check for errors and abort the journal just before
  journal_write_commit_record() instead of after writing metadata
  buffers

[PATCH 4/5 and 5/5]
o split the ext3 part and the jbd part into separate patches
o use JFS_ABORT for checkpointing failures instead of introducing
  JFS_CP_ABORT flag
o when the journal has aborted, don't update j_tail and
  j_tail_sequence in addition to the journal super block (strictly
  speaking, we only have to avoid updating the super block, but
  preserving the j_tail* values is a good thing because it may
  protect someone from adding bugs in the future)
o journal_destroy() returns -EIO when the journal has aborted so that
  ext3_put_super() can detect the abort
o journal_flush() uses j_checkpoint_mutex to avoid a race with
  __log_wait_for_space()

The last item targets a newly found problem.  journal_flush() can be
called while __log_wait_for_space() is running.  In that case,
cleanup_journal_tail() can be called between
__journal_drop_transaction() and journal_abort(), and then the
transaction whose checkpointing failed is lost from the journal.
Taking j_checkpoint_mutex, which __log_wait_for_space() also uses,
should avoid this race condition.  However, my testing is not
sufficient because the race is very difficult to reproduce, so I hope
this locking will be reviewed carefully (including the possibility of
deadlock).
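
The locking idea can be sketched in userspace with a pthread mutex.
This is only a model under stated assumptions, not the patch itself.
Every sketch_* name is hypothetical, the -EAGAIN return for a still
pending transaction is a choice made for this model only, and the
fields merely stand in for j_checkpoint_mutex, the abort flag and
j_tail.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are not the real JBD types. */
struct sketch_journal {
	pthread_mutex_t checkpoint_mutex; /* models j_checkpoint_mutex */
	bool aborted;			  /* models the abort flag */
	bool failed_txn_pending;	  /* txn with a failed checkpoint
					     write is still in the journal */
	unsigned int tail;		  /* models j_tail */
};

static void sketch_init(struct sketch_journal *j)
{
	pthread_mutex_init(&j->checkpoint_mutex, NULL);
	j->aborted = false;
	j->failed_txn_pending = true;
	j->tail = 0;
}

/*
 * Checkpoint failure path: dropping the transaction and aborting the
 * journal happen under the mutex, so no other mutex holder can run in
 * the window between the two steps.
 */
static void sketch_checkpoint_failed(struct sketch_journal *j)
{
	pthread_mutex_lock(&j->checkpoint_mutex);
	j->failed_txn_pending = false;	/* like __journal_drop_transaction() */
	j->aborted = true;		/* like journal_abort() */
	pthread_mutex_unlock(&j->checkpoint_mutex);
}

/* Advance the tail only if nothing still depends on the old tail. */
static int sketch_cleanup_tail(struct sketch_journal *j, unsigned int new_tail)
{
	if (j->aborted)
		return -EIO;	/* keep the tail as it was at abort time */
	if (j->failed_txn_pending)
		return -EAGAIN;	/* model-only: txn still needed for replay */
	j->tail = new_tail;
	return 0;
}

/*
 * Flush path: takes the same mutex, so it observes either "failed
 * transaction still pending" or "journal aborted", never the
 * intermediate state where the tail could wrongly advance.
 */
static int sketch_flush(struct sketch_journal *j, unsigned int new_tail)
{
	int err;

	pthread_mutex_lock(&j->checkpoint_mutex);
	err = sketch_cleanup_tail(j, new_tail);
	pthread_mutex_unlock(&j->checkpoint_mutex);
	return err;
}

/* Teardown reports the abort so that the caller can detect it. */
static int sketch_destroy(struct sketch_journal *j)
{
	int err = j->aborted ? -EIO : 0;

	pthread_mutex_destroy(&j->checkpoint_mutex);
	return err;
}
```

If sketch_checkpoint_failed() did not hold the mutex across both
steps, sketch_flush() could run after the drop but before the abort,
see neither condition, and move the tail past the failed transaction.
That is the race described above.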

Regards,

-- 
Hidehiro Kawai
Hitachi, Systems Development Laboratory
Linux Technology Center
