linux-ext4 - Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20070504001802.0e86e9dd.akpm@linux-foundation.org>
Date:	Fri, 4 May 2007 00:18:02 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Alex Tomas <alex@...sterfs.com>
Cc:	Andreas Dilger <adilger@...sterfs.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Marat Buharov <marat.buharov@...il.com>,
	Mike Galbraith <efault@....de>,
	LKML <linux-kernel@...r.kernel.org>,
	Jens Axboe <jens.axboe@...cle.com>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS
 is under heavy write load (massive starvation)

On Fri, 04 May 2007 10:57:12 +0400 Alex Tomas <alex@...sterfs.com> wrote:

> Andrew Morton wrote:
> > On Fri, 04 May 2007 10:18:12 +0400 Alex Tomas <alex@...sterfs.com> wrote:
> > 
> >> Andrew Morton wrote:
> >>> Yes, there can be issues with needing to allocate journal space within the
> >>> context of a commit.  But
> >> no-no, this isn't required. we only need to mark pages/blocks within
> >> transaction, otherwise race is possible when we allocate blocks in transaction,
> >> then transacton starts to commit, then we mark pages/blocks to be flushed
> >> before commit.
> > 
> > I don't understand.  Can you please describe the race in more detail?
> 
> if I understood your idea right, then in data=ordered mode, commit thread writes
> all dirty mapped blocks before real commit.
> 
> say, we have two threads: t1 is a thread doing flushing and t2 is a commit thread
> 
> t1					t2
> find dirty inode I
> find some dirty unallocated blocks
> journal_start()
> allocate blocks
> attach them to I
> journal_stop()

I'm still not understanding.  The terms you're using are a bit ambiguous.

What does "find some dirty unallocated blocks" mean?  Find a page which is
dirty and which does not have a disk mapping?

Normally the above operation would be implemented via
ext4_writeback_writepage(), and it runs under lock_page().

> 					going to commit
> 					find inode I dirty
> 					do NOT find these blocks because they're
> 					  allocated only, but pages/bhs aren't mapped
> 					  to them
> 					start commit

I think you're assuming here that commit would be using ->t_sync_datalist
to locate dirty buffer_heads.

But under this proposal, t_sync_datalist just gets removed: the new
ordered-data mode _only_ need to do the sb->inode->page walk.  So if I'm
understanding you, the way in which we'd handle any such race is to make
kjournald's writeback of the dirty pages block in lock_page().  Once it
gets the page lock it can look to see if some other thread has mapped the
page to disk.

It may turn out that kjournald needs a private way of getting at the
I_DIRTY_PAGES inodes to do this properly, but I don't _think_ so.  If we
had the radix-tree-of-dirty-inodes thing then that's easy enough to do
anyway, with a tagged search.  But I expect that a single pass through the
superblock's dirty inodes would suffice for ordered-data.  Files which
have chattr +j would screw things up, as usual.

I assume (hope) that your delayed allocation code implements
->writepages()?  Doing the allocation one-page-at-a-time sounds painful...

> 
> map pages/bhs to just allocate blocks
> 
> 
> so, either we mark pages/bhs someway within journal_start()--journal_stop() or
> commit thread should do lookup for all dirty pages. the latter doesn't sound nice, IMHO.
> 

I don't think I'm understanding you fully yet.
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html