linux-kernel - Re: [PATCH] [13/16] HWPOISON: The high level memory error handler in the VM v3

Message-ID: <20090528145021.GA5503@localhost>
Date:	Thu, 28 May 2009 22:50:21 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Nick Piggin <npiggin@...e.de>,
	"hugh@...itas.com" <hugh@...itas.com>,
	"riel@...hat.com" <riel@...hat.com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"chris.mason@...cle.com" <chris.mason@...cle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH] [13/16] HWPOISON: The high level memory error handler
	in the VM v3

On Thu, May 28, 2009 at 09:45:20PM +0800, Andi Kleen wrote:
> On Thu, May 28, 2009 at 02:08:54PM +0200, Nick Piggin wrote:

[snip]

> > 
> > BTW. I don't know if you are checking for PG_writeback often enough?
> > You can't remove a PG_writeback page from pagecache. The normal
> > pattern is lock_page(page); wait_on_page_writeback(page); which I
> 
> So pages can be in writeback without being locked? I still
> wasn't able to find such a case (in fact unless I'm misreading
> the code badly the writeback bit is only used by NFS and a few  
> obscure cases)

Yes, a writeback page is typically not locked. Only read IO requires
exclusive access: read IO is in fact the page *writer*, while writeback
IO is a page *reader* :-)
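
To illustrate the point above, here is a minimal user-space sketch, not
kernel code. The toy_* names and the two flag bits are hypothetical
stand-ins for PG_locked and PG_writeback. It assumes the usual pattern in
which the page is locked only while writeback is being set up, then
unlocked while the IO is in flight:

```c
/* Hypothetical stand-ins for the real page flags; illustration only. */
enum { TOY_PG_LOCKED = 1, TOY_PG_WRITEBACK = 2 };

struct toy_page { unsigned int flags; };

static int toy_locked(const struct toy_page *p)
{
	return (p->flags & TOY_PG_LOCKED) != 0;
}

static int toy_writeback(const struct toy_page *p)
{
	return (p->flags & TOY_PG_WRITEBACK) != 0;
}

/* Read IO fills the page, so it holds the lock for the whole IO. */
static void toy_start_read(struct toy_page *p)
{
	p->flags |= TOY_PG_LOCKED;
}

static void toy_end_read(struct toy_page *p)
{
	p->flags &= ~(unsigned int)TOY_PG_LOCKED;
}

/* Writeback only reads the page: lock briefly, mark, then unlock. */
static void toy_start_writeback(struct toy_page *p)
{
	p->flags |= TOY_PG_LOCKED;
	p->flags |= TOY_PG_WRITEBACK;
	p->flags &= ~(unsigned int)TOY_PG_LOCKED;
}

static void toy_end_writeback(struct toy_page *p)
{
	p->flags &= ~(unsigned int)TOY_PG_WRITEBACK;
}
```

In this model a page that has just gone through toy_start_writeback()
reports writeback but not locked, which is why taking the page lock alone
does not tell you whether writeback is still in progress.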

The writeback bit is _widely_ used.  test_set_page_writeback() is
directly used by NFS/AFS etc. But its main user is in fact
set_page_writeback(), which is called in 26 places.

> > think would be safest 
> 
> Okay. I'll just add it after the page lock.
> 
> > (then you never have to bother with the writeback bit again)
> 
> Until Fengguang does something fancy with it.

Yes, I'm going to do it without wait_on_page_writeback().

The reason truncate_inode_pages_range() has to wait on writeback pages
is to ensure data integrity. Otherwise, if these two events occur in order:
        truncate page A at offset X
        populate page B at offset X
and both A and B are under writeback, then B can hit disk first and then
be overwritten by A, which corrupts the data at offset X from the user's POV.
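
The ordering problem above can be sketched with a toy user-space model.
This is not kernel code: the struct, the single disk_block variable and
all function names are hypothetical, and the "worst case" completion
order (B before A) is simply hard-coded:

```c
/* One in-flight writeback request carrying one byte of "data". */
struct toy_io {
	char data;
	int in_flight;
};

/* The on-disk contents at offset X. */
static char disk_block = '-';

static void submit_writeback(struct toy_io *io, char data)
{
	io->data = data;
	io->in_flight = 1;
}

/* IO completion: the request's data lands on disk. */
static void complete_writeback(struct toy_io *io)
{
	if (io->in_flight) {
		disk_block = io->data;
		io->in_flight = 0;
	}
}

/*
 * Truncate page A, then populate page B at the same offset.
 * If wait_for_writeback is set, A's IO is finished before the
 * truncate proceeds (the role wait_on_page_writeback() plays).
 * Returns what ends up on disk at offset X.
 */
static char truncate_then_repopulate(int wait_for_writeback)
{
	struct toy_io a, b;

	disk_block = '-';
	submit_writeback(&a, 'A');	/* A is under writeback */

	if (wait_for_writeback)
		complete_writeback(&a);

	/* A is dropped from the pagecache; B is populated and written. */
	submit_writeback(&b, 'B');

	/* Worst-case completion order: B first, then any pending A. */
	complete_writeback(&b);
	complete_writeback(&a);

	return disk_block;
}
```

With waiting, the disk ends up holding B's data. Without it, the stale
A overwrites B, which is the corruption described above.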

But for hwpoison there are no such worries. If A is poisoned, we do
our best to isolate it and to intercept its IO. If the interception
fails, it will trigger another machine check before hitting the disk.

After all, a poisoned A means the data at offset X is already corrupted.
It doesn't matter if another page B comes along.

Thanks,
Fengguang