Date:	Wed, 17 Jun 2009 15:23:19 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Minchan Kim <minchan.kim@...il.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Andi Kleen <ak@...ux.intel.com>, Ingo Molnar <mingo@...e.hu>,
	Mel Gorman <mel@....ul.ie>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...e.de>,
	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	Andi Kleen <andi@...stfloor.org>,
	"riel@...hat.com" <riel@...hat.com>,
	"chris.mason@...cle.com" <chris.mason@...cle.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH 09/22] HWPOISON: Handle hardware poisoned pages in
	try_to_unmap

On Wed, Jun 17, 2009 at 08:28:26AM +0800, Minchan Kim wrote:
> On Tue, 16 Jun 2009 21:49:44 +0800
> Wu Fengguang <fengguang.wu@...el.com> wrote:
> 
> > On Tue, Jun 16, 2009 at 08:03:08AM +0800, Minchan Kim wrote:
> > > On Mon, 15 Jun 2009 23:26:12 +0800
> > > Wu Fengguang <fengguang.wu@...el.com> wrote:
> > > 
> > > > On Mon, Jun 15, 2009 at 09:09:03PM +0800, Minchan Kim wrote:
> > > > > On Mon, Jun 15, 2009 at 11:45 AM, Wu Fengguang<fengguang.wu@...el.com> wrote:
> > > > > > From: Andi Kleen <ak@...ux.intel.com>
> > > > > >
> > > > > > When a page has the poison bit set replace the PTE with a poison entry.
> > > > > > This causes the right error handling to be done later when a process runs
> > > > > > into it.
> > > > > >
> > > > > > Also add a new flag to not do that (needed for the memory-failure handler
> > > > > > later)
> > > > > >
> > > > > > Reviewed-by: Wu Fengguang <fengguang.wu@...el.com>
> > > > > > Signed-off-by: Andi Kleen <ak@...ux.intel.com>
> > > > > >
> > > > > > ---
> > > > > >  include/linux/rmap.h |    1 +
> > > > > >  mm/rmap.c            |    9 ++++++++-
> > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > --- sound-2.6.orig/mm/rmap.c
> > > > > > +++ sound-2.6/mm/rmap.c
> > > > > > @@ -958,7 +958,14 @@ static int try_to_unmap_one(struct page
> > > > > >        /* Update high watermark before we lower rss */
> > > > > >        update_hiwater_rss(mm);
> > > > > >
> > > > > > -       if (PageAnon(page)) {
> > > > > > +       if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
> > > > > > +               if (PageAnon(page))
> > > > > > +                       dec_mm_counter(mm, anon_rss);
> > > > > > +               else if (!is_migration_entry(pte_to_swp_entry(*pte)))
> > > > > 
> > > > > Isn't it straightforward to use !is_hwpoison_entry ?
> > > > 
> > > > Good catch!  It looks like a redundant check: the
> > > > page_check_address() at the beginning of the function guarantees that 
> > > > > !is_migration_entry() or !is_hwpoison_entry() tests will all be TRUE.
> > > > So let's do this?
> > > It seems you expand my sight :)
> > > 
> > > I don't know migration well.
> > > > How does page_check_address() guarantee it's not a migration entry?
> > 
> > page_check_address() calls pte_present() which returns the
> > (_PAGE_PRESENT | _PAGE_PROTNONE) bits. While x86-64 defines
> > 
> > #define __swp_entry(type, offset)       ((swp_entry_t) { \
> >                                          ((type) << (_PAGE_BIT_PRESENT + 1)) \
> >                                          | ((offset) << SWP_OFFSET_SHIFT) })
> > 
> > where SWP_OFFSET_SHIFT is defined as
> > max(_PAGE_BIT_PROTNONE + 1, _PAGE_BIT_FILE + 1) = max(8+1, 6+1) = 9.
> > 
> > So __swp_entry(type, offset) := (type << 1) | (offset << 9)
> > 
> > We know that the swap type is 5 bits, so it occupies bits 1-5, and the
> > offset starts at bit 9. So bit 0 (_PAGE_PRESENT) and bit 8
> > (_PAGE_PROTNONE) will both be zero for swap entries.
> >  
> 
> Thanks for kind explanation :)

You are welcome~
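
The bit arithmetic quoted above can be checked in user space. What follows is only a rough sketch, not kernel code: the model_* names are hypothetical stand-ins, and the bit positions are simply the ones quoted in this thread for x86-64 (present = bit 0, file = bit 6, protnone = bit 8, 5-bit swap type). It builds swap entries the way the quoted __swp_entry() does and confirms that none of them looks "present":

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as quoted in this thread for x86-64; a model, not kernel headers. */
#define PAGE_BIT_PRESENT   0
#define PAGE_BIT_FILE      6
#define PAGE_BIT_PROTNONE  8
#define SWP_TYPE_BITS      5
#define SWP_OFFSET_SHIFT   9   /* max(8 + 1, 6 + 1) */

#define PAGE_PRESENT   (UINT64_C(1) << PAGE_BIT_PRESENT)
#define PAGE_PROTNONE  (UINT64_C(1) << PAGE_BIT_PROTNONE)

/* Hypothetical stand-in for __swp_entry(): (type << 1) | (offset << 9). */
static uint64_t model_swp_entry(uint64_t type, uint64_t offset)
{
    assert(type < (UINT64_C(1) << SWP_TYPE_BITS));  /* type must fit in 5 bits */
    return (type << (PAGE_BIT_PRESENT + 1)) | (offset << SWP_OFFSET_SHIFT);
}

/* Hypothetical stand-in for pte_present(): tests the PRESENT and PROTNONE bits. */
static int model_pte_present(uint64_t pte)
{
    return (pte & (PAGE_PRESENT | PAGE_PROTNONE)) != 0;
}

/* Returns 1 if no 5-bit type combined with any tested offset looks present. */
static int model_swp_never_present(void)
{
    for (uint64_t type = 0; type < (UINT64_C(1) << SWP_TYPE_BITS); type++) {
        for (uint64_t offset = 0; offset < 1024; offset++) {
            if (model_pte_present(model_swp_entry(type, offset)))
                return 0;
        }
    }
    return 1;
}
```

Since page_check_address() only returns a pte that passes the present test, any pte that reaches the rss accounting cannot be a swap-style entry, which is why the migration/hwpoison entry checks are redundant there.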

> > 
> > > In addition, if the page is poisoned while we are migrating it
> > > ((PAGE_MIGRATION && migration) == TRUE), should we decrease
> > > file_rss?
> > 
> > It will die on trying to migrate the poisoned page so we don't care
> > about the accounting. But normally the poisoned page shall already be
> 
> 
> Okay. Then how about this?
> We should not increase file_rss on trying to migrate the poisoned page.
> 
> -               else if (!is_migration_entry(pte_to_swp_entry(*pte)))
> +               else if (!(PAGE_MIGRATION && migration))

This is good if we are going to stop the hwpoison page from being
consumed by move_to_new_page(), but I highly doubt we'll ever add
PageHWPoison() checks into the migration code.

That's because the race window is small enough:

        TestSetPageHWPoison(p);
                                   lock_page(page);
                                   try_to_unmap(page, TTU_MIGRATION|...);
        lock_page_nosync(p);

Such small race windows can be found all over the kernel; it would be
insane to try to fix them all.

For example, if the newly allocated page gets corrupted, this kind of code,
which assumes it is the only user of the page (while memory_failure() comes in
between like a ghost), will go BUG():

        /*
         * Block others from accessing the page when we get around to
         * establishing additional references. We are the only one
         * holding a reference to the new page at this point.
         */
        if (!trylock_page(newpage))
                BUG();

Thanks,
Fengguang