Date:	Tue, 26 Oct 2010 09:57:14 -0500
From:	Eric Sandeen <sandeen@...hat.com>
To:	"Ted Ts'o" <tytso@....edu>
CC:	ext4 development <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH 3/3] ext4: update writeback_index based on last page scanned

Ted Ts'o wrote:
> On Mon, Oct 25, 2010 at 04:39:10PM -0500, Eric Sandeen wrote:
>> Not compilebench specifically, but I did do some benchmarking
>> with out of cache buffered IO; to be honest I didn't see
>> striking performance differences, but I did see the writeback
>> behave better in terms of not wandering all over, even if it
>> might recover well.
>>
>> I can try compilebench; do you have specific concerns?
> 
> My specific concern is what happens if __mpage_da_writepage()
> accumulates 200 pages, but then we were only able to accumulate 50
> pages, and we only write 50 pages.

Be patient with me, but how do we accumulate 200 pages but then only
accumulate 50 pages?

> In the long run what I really want to do is to not call
> clear_page_dirty_for_io() in the renamed write_cache_pages_da(), but
> rather be as greedy as possible about finding dirty/delayed allocate
> pages, and then try to allocate pages for all of them.
> 
> We would then scan the pages for PAGECACHE_TAG_TOWRITE in
> mpage_submit_data_io(), and then write out whatever number of pages we
> need.  At that point we will be a good citizen and writing back what
> the writeback system asks of us --- but we'll be allocating as many
> pages as possible so that the block allocations are sane.  (At that
> point we may find out that the core writeback is screwing us because
> it's not asking us to write back enough; note that XFS decides on its
> own how many pages to writeback in the first call to xfs_writepage(),
> and even if writeback is silly enough to think that XFS should write
> 4MB, then switch to another inode, write 4MB, then write to another
> inode, etc., XFS ignores what writeback asks of it.  But we'll cross
> that road when we get to it....)

Since it works for XFS, we should probably try that direction, but my
early feeble attempts got bogged down in a lot of tangled code.

> So the bottom line is that I believe that what we were doing before is
> wrong; and what we're doing is still wrong, even after your patches.
> I just want to make sure that our performance doesn't go crashing
> through the floor in order to avoid the livelock problem.  (Which I
> agree is a real problem, but we've lived it for quite a while, and I
> haven't seen any evidence of it showing up in production.)

Well, I have a partner-filed bug for that one so I'm motivated ;)

Why would fsync writing only TOWRITE tagged pages cause performance
to crash through the floor?

Note patch 3 doesn't really require patch 2, or vice versa; they
address two pretty orthogonal things.

-Eric

> 						- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
