lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1225128359.12673.101.camel@nimitz>
Date:	Mon, 27 Oct 2008 10:25:59 -0700
From:	Dave Hansen <dave@...ux.vnet.ibm.com>
To:	gerald.schaefer@...ibm.com
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, schwidefsky@...ibm.com,
	heiko.carstens@...ibm.com, kamezawa.hiroyu@...fujitsu.com,
	y-goto@...fujitsu.com
Subject: Re: [PATCH] memory hotplug: fix page_zone() calculation in
	test_pages_isolated()

On Mon, 2008-10-27 at 17:49 +0100, Gerald Schaefer wrote:
> My last bugfix here (adding zone->lock) introduced a new problem: Using
> pfn_to_page(pfn) to get the zone after the for() loop is wrong. pfn then
> points to the first pfn after end_pfn, which may be in a different zone
> or not present at all. This may lead to an addressing exception in
> page_zone() or spin_lock_irqsave().

I'm not sure I follow.  Let's look at the code, pre-patch:

> 	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
>                 page = __first_valid_page(pfn, pageblock_nr_pages);
>                 if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
>                         break;
>         }
>         if (pfn < end_pfn)
>                 return -EBUSY;

We have two ways out of the loop:
1. 'page' is valid, and not isolated, so we did a 'break'
2. No page hit (1) in the range and we broke out of the loop because
   of the for() condition: (pfn < end_pfn).  

So, when the condition happens that you mentioned in your changelog
above: "pfn then points to the first pfn after end_pfn", we jump out at
the 'return -EBUSY;'.  We don't ever do pfn_to_page() in that case since
we've returned befoer.

Either 'page' is valid *OR* you return -EBUSY.  I don't think you need
to check both.

> Using the last valid page that was found inside the for() loop, instead
> of pfn_to_page(), should fix this.
> @@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
> 		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> 			break;
> 	}
> -	if (pfn < end_pfn)
> +	if ((pfn < end_pfn) || !page)
> 		return -EBUSY;
> 	/* Check all pages are free or Marked as ISOLATED */
> -	zone = page_zone(pfn_to_page(pfn));
> +	zone = page_zone(page);

I think this patch fixes the bug, but for reasons other than what you
said. :)

The trouble here is that the 'pfn' could have been in the middle of a
hole somewhere, which __first_valid_page() worked around.  Since you
saved off the result of __first_valid_page(), it ends up being OK with
your patch.

Instead of using pfn_to_page() you could also have just called
__first_valid_page() again.  But, that would have duplicated a bit of
work, even though not much in practice because the caches are still hot.

Technically, you wouldn't even need to check the return from
__first_valid_page() since you know it has a valid result because you
made the exact same call a moment before.

Anyway, can you remove the !page check, fix up the changelog and resend?

-- Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ