Message-ID: <20090408133115.GB11041@sgi.com>
Date:	Wed, 8 Apr 2009 08:31:15 -0500
From:	Russ Anderson <rja@....com>
To:	Ingo Oeser <ioe-lkml@...eria.de>
Cc:	linux-kernel@...r.kernel.org, linux-mm@...ck.org, x86@...nel.org,
	Andi Kleen <andi@...stfloor.org>, rja@....com
Subject: Re: [PATCH 1/2] Avoid putting a bad page back on the LRU

On Wed, Apr 08, 2009 at 05:43:15AM +0200, Ingo Oeser wrote:
> Hi Russ,
> 
> On Wednesday 08 April 2009, Russ Anderson wrote:
> > --- linux-next.orig/mm/migrate.c	2009-04-07 18:32:12.781949840 -0500
> > +++ linux-next/mm/migrate.c	2009-04-07 18:34:19.169736260 -0500
> > @@ -693,6 +696,26 @@ unlock:
> >   		 * restored.
> >   		 */
> >   		list_del(&page->lru);
> > +#ifdef CONFIG_MEMORY_FAILURE
> > +		if (PagePoison(page)) {
> > +			if (rc == 0)
> > +				/*
> > +				 * A page with a memory error that has
> > +				 * been migrated will not be moved to
> > +				 * the LRU.
> > +				 */
> > +				goto move_newpage;
> > +			else
> > +				/*
> > +				 * The page failed to migrate and will not
> > +				 * be added to the bad page list.  Clearing
> > +				 * the error bit will allow another attempt
> > +				 * to migrate if it gets another correctable
> > +				 * error.
> > +				 */
> > +				ClearPagePoison(page);
> 
> Clearing the flag doesn't change the fact, that this page is representing 
> permanently bad RAM.

Yes, but this is intended for corrected memory errors (meaning there is
an underlying RAM error, but it has not reached the point of losing data).

After talking with Andi, it is clear the intent of the Poison flag
(uncorrectable memory error) is different from my intent (corrected
memory error).  I'll go back to using a different page flag to avoid
confusing the two issues.
 
> What about removing it from the LRU and adding it to a bad RAM list in every case?

That is what happens when the page migrates (the normal case).  The else case
is when the page could not be migrated.  My intent was to wait for the next
corrected error on that page and try migrating again.
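
To make the two branches concrete, here is a minimal userspace sketch of
the decision described above.  It is not the kernel code: struct
sketch_page, the sketch_* names and the on_lru field are hypothetical
stand-ins for struct page, the page flag and the LRU list, and it assumes
that a page which falls through the else branch goes back on the LRU as
before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct sketch_page {
	bool flagged;	/* stands in for the error page flag used in the patch */
	bool on_lru;	/* true while the page sits on the LRU */
};

enum sketch_outcome {
	SKETCH_BACK_ON_LRU,	/* no error flag: page handled as before */
	SKETCH_KEPT_OFF_LRU,	/* flagged and migrated: page stays off the LRU */
	SKETCH_RETRY_LATER	/* flagged, migration failed: flag cleared */
};

/*
 * Models the step after a migration attempt.  rc == 0 means the page
 * contents were migrated; any other value means migration failed.
 */
static enum sketch_outcome sketch_after_migrate(struct sketch_page *page, int rc)
{
	assert(page != NULL);

	page->on_lru = false;		/* mirrors list_del(&page->lru) */

	if (page->flagged) {
		if (rc == 0)
			/* Migrated: do not move the bad page to the LRU. */
			return SKETCH_KEPT_OFF_LRU;

		/*
		 * Could not migrate: clear the flag so the next corrected
		 * error on this page triggers another migration attempt.
		 * Assumption: the page then returns to the LRU.
		 */
		page->flagged = false;
		page->on_lru = true;
		return SKETCH_RETRY_LATER;
	}

	page->on_lru = true;
	return SKETCH_BACK_ON_LRU;
}
```

Calling sketch_after_migrate() with rc == 0 on a flagged page leaves it
off the LRU with the flag still set; calling it with a nonzero rc clears
the flag and puts the page back, so a later corrected error can flag it
and retry.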

> After hot swapping the physical RAM banks it could be moved back, not before.

As soon as the code is written.  :-)

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc          rja@....com