linux-ext4 - Re: [PATCHSET v3.1 0/7] data integrity: Stabilize pages during writeback for various fses

Message-ID: <4EA4431A.3010104@amacapital.net>
Date:	Sun, 23 Oct 2011 09:38:50 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Jan Kara <jack@...e.cz>
CC:	OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>,
	"Darrick J. Wong" <djwong@...ibm.com>,
	Theodore Tso <tytso@....edu>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Jens Axboe <axboe@...nel.dk>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	Jeff Layton <jlayton@...hat.com>,
	Dave Chinner <david@...morbit.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	Christoph Hellwig <hch@...radead.org>, linux-mm@...ck.org,
	Chris Mason <chris.mason@...cle.com>,
	Joel Becker <jlbec@...lplan.org>,
	linux-scsi <linux-scsi@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-ext4@...r.kernel.org, Mingming Cao <mcao@...ibm.com>
Subject: Re: [PATCHSET v3.1 0/7] data integrity: Stabilize pages during writeback
 for various fses

On 05/10/2011 09:22 AM, Jan Kara wrote:
> On Wed 11-05-11 01:12:13, OGAWA Hirofumi wrote:
>> Jan Kara <jack@...e.cz> writes:
>>
>>>> Did you already consider, to copy only if page was writeback (like
>>>> copy-on-write)? I.e. if page is on I/O, copy, then switch the page for
>>>> writing new data.
>>>    Yes, that was considered as well. We'd have to essentially migrate the
>>> page that is under writeback and should be written to. You are going to pay
>>> the cost of page allocation, copy, increased memory & cache pressure.
>>> Depending on your backing storage and workload this may or may not be better
>>> than waiting for IO...
>>
>> Maybe possible, but you really think on usual case just blocking is
>> better?
>    Define usual case... As Christoph noted, we don't currently have a real
> practical case where blocking would matter (since frequent rewrites are
> rather rare). So defining what is usual when we don't have a single real
> case is kind of tough ;)
>

I'm a bit late to the party, but I have such a use case.  I have a
real-time program that generates logs.  A dedicated thread makes sure
that there are always mlocked, MAP_SHARED, writable pages ready for the
logs, and under normal (or even very heavy) load, the mlocked pages
always stay far ahead of the logs.  On 2.6.39, it works great [1].  On
3.0, it's unusable -- latencies of 30-100 ms are very common.
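
For readers who have not seen this pattern, here is a minimal sketch of
pre-faulting a locked, shared, writable window of a log file.  It is an
illustration under assumptions, not the program described above: the
names log_file_open, log_window_prepare, log_window_write and
log_window_release are hypothetical, the window is assumed to be
page-aligned, and mlock() failure (for example a low RLIMIT_MEMLOCK) is
reported rather than treated as fatal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: one pre-faulted window of a log file. */
struct log_window {
    char  *base;   /* start of the mapping, NULL if setup failed */
    size_t len;    /* window length in bytes */
    int    locked; /* 1 if mlock() succeeded, 0 otherwise */
};

/* Open (or create) the log file for reading and writing. */
int log_file_open(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Map [offset, offset + len) MAP_SHARED, try to mlock it, and
 * write-touch every page so it is already writable when used. */
struct log_window log_window_prepare(int fd, off_t offset, size_t len)
{
    struct log_window w = { NULL, 0, 0 };
    long page = sysconf(_SC_PAGESIZE);
    struct stat st;

    assert(page > 0);
    assert(len > 0 && offset % page == 0 && len % (size_t)page == 0);

    /* Grow the file if needed; never shrink it. */
    if (fstat(fd, &st) != 0)
        return w;
    if (st.st_size < offset + (off_t)len &&
        ftruncate(fd, offset + (off_t)len) != 0)
        return w;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
    if (p == MAP_FAILED)
        return w;

    w.base = p;
    w.len = len;
    w.locked = (mlock(p, len) == 0);

    /* Read and write back one byte per page to take the write fault now. */
    for (size_t off = 0; off < len; off += (size_t)page) {
        volatile char *c = w.base + off;
        *c = *c;
    }
    return w;
}

/* Copy n bytes into the window at pos; returns -1 if out of bounds. */
int log_window_write(struct log_window *w, size_t pos, const void *data, size_t n)
{
    if (w->base == NULL || pos > w->len || n > w->len - pos)
        return -1;
    memcpy(w->base + pos, data, n);
    return 0;
}

/* Unmap the window (this also drops the lock on those pages). */
void log_window_release(struct log_window *w)
{
    if (w->base != NULL)
        munmap(w->base, w->len);
    w->base = NULL;
    w->len = 0;
    w->locked = 0;
}
```

A helper thread would call log_window_prepare() for the next window
while the logging thread is still filling the current one with
log_window_write().  Note that pre-faulting only helps as long as the
kernel leaves the pages writable; it does nothing once a page has been
write-protected again for writeback.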

In this case, neither throughput nor available memory matters at all --
I'm not stressing either.  So copying the pages (especially if they're
mlocked) would be more than a small percentage win -- it would be the
difference between great performance and unusability.

I wonder if we want a stronger version of mlock that says "this page 
must not be swapped out and, in addition, ptes must always be mapped 
with all appropriate permission bits set".  (This is only possible with 
hardware dirty and accessed bits, but we could come close even without 
them.)

[1] file_update_time is a problem.  Patches coming.

--Andy