Message-ID: <463AE32A.5000902@clusterfs.com>
Date:	Fri, 04 May 2007 11:39:22 +0400
From:	Alex Tomas <alex@...sterfs.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	Andreas Dilger <adilger@...sterfs.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Marat Buharov <marat.buharov@...il.com>,
	Mike Galbraith <efault@....de>,
	LKML <linux-kernel@...r.kernel.org>,
	Jens Axboe <jens.axboe@...cle.com>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when
 FS is under heavy write load (massive starvation)

Andrew Morton wrote:
> I'm still not understanding.  The terms you're using are a bit ambiguous.
> 
> What does "find some dirty unallocated blocks" mean?  Find a page which is
> dirty and which does not have a disk mapping?
> 
> Normally the above operation would be implemented via
> ext4_writeback_writepage(), and it runs under lock_page().

I'm mostly worried about the delayed allocation case. My impression was that
holding a number of pages locked isn't a good idea, even if they're locked
in index order. So I was going to mark a number of pages as writeback, then
allocate blocks for all of them at once, then put the proper blocknr's into
the bh's (or set PG_mappedtodisk?).
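
To make that sequence concrete, here is a rough userspace model of it. This is
only a sketch, not kernel code: struct model_page, struct model_allocator and
both functions are hypothetical names made up for illustration, and the flags
merely stand in for PG_writeback and PG_mappedtodisk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a page; NOT the kernel's struct page. */
struct model_page {
    unsigned long index;    /* page index within the file */
    bool dirty;
    bool writeback;         /* stands in for PG_writeback */
    bool mapped;            /* stands in for PG_mappedtodisk / a mapped bh */
    unsigned long blocknr;  /* meaningful only when mapped is true */
};

/* Hypothetical allocator that hands out one contiguous run of blocks. */
struct model_allocator {
    unsigned long next_free;
};

static unsigned long model_alloc_extent(struct model_allocator *a, size_t count)
{
    unsigned long start = a->next_free;

    a->next_free += count;
    return start;
}

/*
 * Model of the proposed flow. No page lock is held across the allocation;
 * the pages are only flagged as under writeback.
 * Returns the number of pages that received a block.
 */
static size_t model_delalloc_writepages(struct model_page *pages, size_t n,
                                        struct model_allocator *a)
{
    size_t need = 0;
    size_t i;
    unsigned long blk;

    /* Step 1: flag every dirty, not-yet-mapped page as writeback. */
    for (i = 0; i < n; i++) {
        if (pages[i].dirty && !pages[i].mapped) {
            pages[i].writeback = true;
            need++;
        }
    }
    if (need == 0)
        return 0;

    /* Step 2: a single allocation covering all of those pages. */
    blk = model_alloc_extent(a, need);

    /* Step 3: record the block numbers and mark the pages mapped. */
    for (i = 0; i < n; i++) {
        if (pages[i].dirty && !pages[i].mapped) {
            pages[i].blocknr = blk++;
            pages[i].mapped = true;
            pages[i].dirty = false;
        }
    }

    /* Every allocated block must have been handed to exactly one page. */
    assert(blk == a->next_free);
    return need;
}
```

Note that between step 2 and step 3 the blocks are already allocated while the
pages are not yet mapped to them; the model does nothing to close that window.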

> 
> 
>> 					going to commit
>> 					find inode I dirty
>> 					do NOT find these blocks because they're
>> 					  allocated only, but pages/bhs aren't mapped
>> 					  to them
>> 					start commit
> 
> I think you're assuming here that commit would be using ->t_sync_datalist
> to locate dirty buffer_heads.

nope, I mean sb->inode->page walk.

> But under this proposal, t_sync_datalist just gets removed: the new
> ordered-data mode _only_ need to do the sb->inode->page walk.  So if I'm
> understanding you, the way in which we'd handle any such race is to make
> kjournald's writeback of the dirty pages block in lock_page().  Once it
> gets the page lock it can look to see if some other thread has mapped the
> page to disk.

if I'm right that holding a number of pages locked is a bad idea, then they
won't be locked, but under writeback. Of course kjournald can block on
writeback as well, but how does it find only the pages with *newly allocated*
blocks?

> It may turn out that kjournald needs a private way of getting at the
> I_DIRTY_PAGES inodes to do this properly, but I don't _think_ so.  If we
> had the radix-tree-of-dirty-inodes thing then that's easy enough to do
> anyway, with a tagged search.  But I expect that a single pass through the
> superblock's dirty inodes would suffice for ordered-data.  Files which
> have chattr +j would screw things up, as usual.

not only dirty inodes, but rather some fast way to find pages with newly
allocated blocks.

> I assume (hope) that your delayed allocation code implements
> ->writepages()?  Doing the allocation one-page-at-a-time sounds painful...

indeed. this is the root cause of all this complexity.

thanks, Alex

