linux-kernel - Re: [PATCHSET v3.1 0/7] data integrity: Stabilize pages during writeback for various fses

Message-ID: <20110517012307.GQ20579@tux1.beaverton.ibm.com>
Date:	Mon, 16 May 2011 18:23:07 -0700
From:	"Darrick J. Wong" <djwong@...ibm.com>
To:	OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
Cc:	Jan Kara <jack@...e.cz>, Theodore Tso <tytso@....edu>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Jens Axboe <axboe@...nel.dk>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	Jeff Layton <jlayton@...hat.com>,
	Dave Chinner <david@...morbit.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	Christoph Hellwig <hch@...radead.org>, linux-mm@...ck.org,
	Chris Mason <chris.mason@...cle.com>,
	Joel Becker <jlbec@...lplan.org>,
	linux-scsi <linux-scsi@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-ext4@...r.kernel.org, Mingming Cao <mcao@...ibm.com>
Subject: Re: [PATCHSET v3.1 0/7] data integrity: Stabilize pages during
	writeback for various fses

On Tue, May 17, 2011 at 04:31:37AM +0900, OGAWA Hirofumi wrote:
> "Darrick J. Wong" <djwong@...ibm.com> writes:
> 
> >> OK. E.g. usual workload on desktop, but FS like ext2/fat.
> >
> > In the frequent rewrite case, here's what you get:
> >
> > Regular disk: (possibly garbage) write, followed by a second write to make the
> > disk reflect memory contents.
> >
> > RAID w/ shadow pages: two writes, both consistent.  Higher memory consumption.
> >
> > T10 DIF disk: disk error any time the CPU modifies a page that the disk
> > controller is DMA'ing out of memory.  I suppose one could simply retry the
> > operation if the page is dirty, but supposing memory writes are happening fast
> > enough that the retries also produce disk errors, _nothing_ ever gets written.
> >
> > With the new stable-page-writes patchset, the garbage write/disk error symptoms
> > go away since the processes block instead of creating this window where it's
> > not clear whether the disk's copy of the data is consistent.  I could turn the
> > wait_on_page_writeback calls into some sort of page migration if the
> > performance turns out to be terrible, though I'm still working on quantifying
> > the impact.  Some people pointed out that sqlite tends to write the same blocks
> > frequently, though I wonder if sqlite actually tries to write memory pages
> > while syncing them?
> >
> > One use case where I could see a serious performance hit happening is the case
> > where some app writes a bunch of memory pages, calls sync to force the dirty
> > pages to disk, and /must/ resume writing those memory pages before the sync
> > completes.  The page migration would of course help there, provided a memory
> > page can be found in less time than an I/O operation.
> >
> > Someone commented on the LWN article about this topic, claiming that he had a
> > program that couldn't afford to block on writes to mlock()'d memory.  I'm not
> > sure how to fix that program, because if memory writes never coordinate with
> > disk writes and the other threads are always writing memory, I wonder how the
> > copy on disk isn't always indeterminate.
> 
> I don't think data pages are a special case for this (at least
> logically). In other words, if you are talking only about data pages, you
> shouldn't send patches for metadata along with them.

Patch 7, which is the only patch that touches code under fs/fat/, only fixes
the case where the filesystem tries to modify its own metadata while writing
out the same metadata.

Patches 2 and 3, which touch only files under mm/ and fs/, prevent data pages
from being modified while the same data pages are being written out.
Fortunately, no fs/fat/ code needs to change to fix the data page case.

That said, the intent of the patch set is to prevent writes to any memory page,
regardless of type (data or metadata), while the same page is being written out.
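
To make that intent concrete, here is a minimal userspace sketch in C.  It is
not kernel code and is not taken from the patch set: struct model_page and the
model_* helpers are hypothetical names, the checksum is only a stand-in for a
DIF-style guard tag, and returning false stands in for the sleep that a
wait_on_page_writeback call would perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096

/* Hypothetical userspace model of a page; NOT the kernel's struct page. */
struct model_page {
    unsigned char data[PAGE_SZ];
    bool writeback;   /* true while the "device" is reading the page */
    uint32_t guard;   /* checksum taken when writeback was submitted */
};

/* Toy checksum standing in for an integrity guard tag (assumption). */
static uint32_t model_checksum(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = sum * 31 + buf[i];
    return sum;
}

/* Submit the page: record the guard and mark the page as under writeback. */
static void model_start_writeback(struct model_page *p)
{
    p->guard = model_checksum(p->data, PAGE_SZ);
    p->writeback = true;
}

/*
 * Complete the write.  Returns true if the data the device saw still
 * matches the guard, false if the page changed underneath it.
 */
static bool model_end_writeback(struct model_page *p)
{
    bool ok = model_checksum(p->data, PAGE_SZ) == p->guard;
    p->writeback = false;
    return ok;
}

/*
 * Try to modify the page.  With stable == true the write is refused while
 * the page is under writeback; a real kernel would put the writer to sleep
 * instead of failing.  Returns true if the byte was stored.
 */
static bool model_write(struct model_page *p, size_t off, unsigned char val,
                        bool stable)
{
    assert(off < PAGE_SZ);
    if (stable && p->writeback)
        return false;
    p->data[off] = val;
    return true;
}
```

Without the stable flag, a model_write() between model_start_writeback() and
model_end_writeback() makes the guard check fail, which models the garbage
write or disk error case.  With the flag set, the write is held off until
writeback ends, so the check passes and the writer can proceed afterwards.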

--D