lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080813164121.GA5985@alberich.amd.com>
Date:	Wed, 13 Aug 2008 18:41:21 +0200
From:	Andreas Herrmann <andreas.herrmann3@....com>
To:	Johannes Weiner <hannes@...urebad.de>
CC:	Ingo Molnar <mingo@...e.hu>, Nick Piggin <npiggin@...e.de>,
	linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] alloc_bootmem_core: fix misaligned allocation of 1G
	page

On Tue, Aug 12, 2008 at 06:58:55PM +0200, Johannes Weiner wrote:
> Andreas Herrmann <andreas.herrmann3@....com> writes:
> > The current code in alloc_bootmem_core is based on changes introduced
> > with commit 5f2809e69c7128f86316048221cf45146f69a4a0 (bootmem: clean
> > up alloc_bootmem_core). But I didn't check whether this commit
> > introduced the problem.
> 
> It did, there were workarounds for the same problem in the earlier code,
> I missed it.
> 
> The misalignment stems from the fact that the alignment requirement is
> wider than the offset-pfn and the starting pfn of the node is not
> aligned itself, correct?

Yes.

> I think, the cleaner fix would be to work with an aligned base pfn to
> begin with, like the following, untested.  What do you think?

This won't (completely) work.

Every time you compute the new alignment for sidx the starting point
(node_min_pfn) must be factored in.

Otherwise the function can't allocate the first possible page. For
example, assuming that

  node_min_pfn = 130000
  align = 0x40000000  (1GByte)

allocating a 1G page on this node will result in

  sidx=0x40000
  min_pfn=0x140000

Both are properly aligned. But the resulting super-page will be at
address 0x180000000 whereas the first possible 1G page would be at
address 0x140000000.

> diff --git a/mm/bootmem.c b/mm/bootmem.c
> index 4af15d0..bee4dfe 100644
> --- a/mm/bootmem.c
> +++ b/mm/bootmem.c

  ...

> @@ -492,8 +493,7 @@ find_block:
>  				PFN_UP(end_off), BOOTMEM_EXCLUSIVE))
>  			BUG();
>  
> -		region = phys_to_virt(PFN_PHYS(bdata->node_min_pfn) +
> -				start_off);
> +		region = phys_to_virt(PFN_PHYS(min_pfn) + start_off);
>  		memset(region, 0, size);
>  		return region;

Oops ...
the returned region doesn't match the reserved one as it still gets
reserved with

          if (__reserve(bdata, PFN_DOWN(start_off) + merge,
                        PFN_UP(end_off), BOOTMEM_EXCLUSIVE))
   
where __reserve() will use bdata->node_min_pfn and not the properly
aligned min_pfn value. Either you have to pass the new min_pfn
value to __reserve() or you have to adapt start_off with another
offset = min_pfn - bdata->node_min_pfn ...



I thought about other solutions like introducing a "base_offset" --
the value needed to align node_min_pfn. But this value must be used
in many places to correctly compute/align sidx etc. and it doesn't
make the code better readable.

Hence I still prefer the patch posted yesterday. I just want to clean
it up somewhat. See attached patch.


Regards,

Andreas
--

alloc_bootmem_core: minor cleanup, use min instead of bdata->node_min_pfn

Signed-off-by: Andreas Herrmann <andreas.herrmann3@....com>
---
 mm/bootmem.c |   20 ++++++++------------
 1 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 9d54244..11ece4b 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -459,9 +459,8 @@ static void * __init alloc_bootmem_core(struct bootmem_data *bdata,
 		unsigned long eidx, i, start_off, end_off;
 find_block:
 		sidx = find_next_zero_bit(bdata->node_bootmem_map,
-					  midx - bdata->node_min_pfn,
-					  sidx - bdata->node_min_pfn);
-		sidx += bdata->node_min_pfn;
+					  midx - min, sidx - min);
+		sidx += min;
 		sidx = ALIGN(sidx, step);
 		eidx = sidx + PFN_UP(size);
 
@@ -469,8 +468,7 @@ find_block:
 			break;
 
 		for (i = sidx; i < eidx; i++)
-			if (test_bit(i - bdata->node_min_pfn,
-				     bdata->node_bootmem_map)) {
+			if (test_bit(i - min, bdata->node_bootmem_map)) {
 				sidx = ALIGN(i, step);
 				if (sidx == i)
 					sidx += step;
@@ -478,17 +476,16 @@ find_block:
 			}
 
 		if (bdata->last_end_off &&
-		    (PFN_DOWN(bdata->last_end_off) + 1) ==
-		    (sidx - bdata->node_min_pfn))
+		    (PFN_DOWN(bdata->last_end_off) + 1) == (sidx - min))
 			start_off = ALIGN(bdata->last_end_off, align);
 		else
-			start_off = PFN_PHYS(sidx - bdata->node_min_pfn);
+			start_off = PFN_PHYS(sidx - min);
 
-		merge = PFN_DOWN(start_off) < (sidx - bdata->node_min_pfn);
+		merge = PFN_DOWN(start_off) < (sidx - min);
 		end_off = start_off + size;
 
 		bdata->last_end_off = end_off;
-		bdata->hint_idx = PFN_UP(end_off + bdata->node_min_pfn);
+		bdata->hint_idx = PFN_UP(end_off + min);
 
 		/*
 		 * Reserve the area now:
@@ -497,8 +494,7 @@ find_block:
 				PFN_UP(end_off), BOOTMEM_EXCLUSIVE))
 			BUG();
 
-		region = phys_to_virt(PFN_PHYS(bdata->node_min_pfn) +
-				start_off);
+		region = phys_to_virt(PFN_PHYS(min) + start_off);
 		memset(region, 0, size);
 		return region;
 	}
-- 
1.5.6.4




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ