Date:	Thu, 31 Aug 2006 20:08:58 -0700
From:	"Keith Mannthey" <kmannth@...il.com>
To:	"Mel Gorman" <mel@....ul.ie>
Cc:	akpm@...l.org, tony.luck@...el.com,
	"Linux Memory Management List" <linux-mm@...ck.org>, ak@...e.de,
	bob.picco@...com,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
	linuxppc-dev@...abs.org
Subject: Re: [PATCH 4/6] Have x86_64 use add_active_range() and free_area_init_nodes

On 8/31/06, Mel Gorman <mel@....ul.ie> wrote:
> On Thu, 31 Aug 2006, Keith Mannthey wrote:
> > On 8/31/06, Mel Gorman <mel@...net.ie> wrote:
> >> On (30/08/06 13:57), Keith Mannthey didst pronounce:
> >> > On 8/21/06, Mel Gorman <mel@....ul.ie> wrote:
> >> > >

> Can you confirm that happens by applying the patch I sent to you and
> checking the output? When the reserve fails, it should print out what
> range it actually checked. I want to be sure it's not checking the
> addresses 0->0x1070000000

See below

> >> > >@@ -329,6 +330,8 @@ acpi_numa_memory_affinity_init(struct ac
> >> > >
> >> > >        printk(KERN_INFO "SRAT: Node %u PXM %u %Lx-%Lx\n", node, pxm,
> >> > >               nd->start, nd->end);
> >> > >+       e820_register_active_regions(node, nd->start >> PAGE_SHIFT,
> >> > >+                                               nd->end >> PAGE_SHIFT);
> >> >
> >> > A node chunk in this section of code may be a hot-pluggable zone. With
> >> > MEMORY_HOTPLUG_SPARSE we don't want to register these regions.
> >> >
> >>
> >> The ranges should not get registered as active memory by
> >> e820_register_active_regions() unless they are marked E820_RAM. My
> >> understanding is that the regions for hotadd would be marked "reserved"
> >> in the e820 map. Is that wrong?
> >
> > This is wrong.  In a multi-node system the last node's add area will not
> > be marked reserved by the e820.  The e820 only defines memory <
> > end_pfn. The last node's add area is > end_pfn.
> >
>
> ok, that should still be fine. As long as the ranges are not marked
> "usable", add_active_range() will not be called and the holes should be
> counted correctly with the patch I sent you.
>
> > With RESERVE-based add-memory you want the add areas reported by the
> > SRAT to be set up during boot like all the other pages.
> >
>
> So, do you actually expect a lot of unused mem_map to be allocated with
> struct pages that are inactive until memory is hot-added in an
> x86_64-specific manner? The arch-independent stuff currently will not do
> that. It sets up memmap for where memory really exists. If that is not
> what you expect, it will hit issues at hotadd time which is not the
> current issue but one that can be fixed.

Yes. RESERVE-based hot-add is a big waste of mem_map space.  The add areas
are marked as RESERVED during boot and then onlined later during the add.
It might be ok.  I will play with it tomorrow.  I might just need to
call add_active_range() in the right spot :)
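
To put a rough number on the mem_map cost mentioned above, here is a
back-of-the-envelope sketch in C. It assumes 4 KiB pages and a 56-byte
struct page (an assumption; the real size depends on kernel version and
config), and the helper name memmap_bytes_for_range() is made up for
illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                 /* 4 KiB pages, as on x86_64 */
#define ASSUMED_STRUCT_PAGE_SIZE 56   /* assumption: varies with kernel config */

/* Bytes of mem_map needed to describe every page in [start, end). */
static uint64_t memmap_bytes_for_range(uint64_t start, uint64_t end)
{
    uint64_t pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT);

    return pages * ASSUMED_STRUCT_PAGE_SIZE;
}
```

Under those assumptions, the node 0 hot-add area in the output below
(0x470000000-0x1070000000, 48 GiB) would need 672 MiB of struct pages
before any of that memory is actually added.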

> >> > >        if (ma->flags.hot_pluggable && !reserve_hotadd(node, start, end) < 0) {
> >> > >                /* Ignore hotadd region. Undo damage */
> >> >
> >> >  I have put the e820_register_active_regions() call as an else to this
> >> > statement and the absent pages check fails.
> >> >
> >>
> >> The patch below omits this change because I think
> >> e820_register_active_regions() will still have got called by the time
> >> you encounter a hotplug area.
> >
> > called but then removed in setup arch.
>
> By "removed", I assume you mean the active regions removed by the call
> to remove_all_active_regions() in setup_arch(). Before reserve_hotadd() is
> called, e820_register_active_regions() will have reregistered the active
> regions with the NUMA node id.

I see, e820_register_active_regions() is acting as a filter against the e820 map.
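
A minimal userspace sketch of that filtering idea, not the kernel's actual
implementation: each usable e820 entry is clipped to the PFN range passed
in, and only the surviving pieces are reported. The struct layouts and the
name filter_active_regions() are hypothetical; E820_RAM is the "usable"
type.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define E820_RAM   1   /* "usable" in the BIOS-e820 dump */

struct e820_entry { uint64_t addr, size; int type; };
struct pfn_range  { uint64_t start_pfn, end_pfn; };   /* half-open */

/*
 * Clip every usable e820 entry to [start_pfn, end_pfn) and store the
 * non-empty pieces in out[]. Returns the number of ranges written.
 */
static size_t filter_active_regions(const struct e820_entry *map, size_t n,
                                    uint64_t start_pfn, uint64_t end_pfn,
                                    struct pfn_range *out, size_t max_out)
{
    size_t count = 0;

    for (size_t i = 0; i < n && count < max_out; i++) {
        if (map[i].type != E820_RAM)
            continue;   /* reserved, ACPI data, ... are never active */

        /* Round the start up and the end down to whole pages. */
        uint64_t s = (map[i].addr + (1ULL << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
        uint64_t e = (map[i].addr + map[i].size) >> PAGE_SHIFT;

        if (s < start_pfn)
            s = start_pfn;
        if (e > end_pfn)
            e = end_pfn;
        if (s >= e)
            continue;   /* nothing of this entry lies inside the range */

        out[count].start_pfn = s;
        out[count].end_pfn = e;
        count++;
    }
    return count;
}
```

Fed the e820 map from the output below and the SRAT range 0-80000000, this
yields the two ranges 0 -> 152 and 256 -> 524165, the same pair the log
shows being passed to add_active_range() for that SRAT entry.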

> >> > Also nodes_cover_memory() and a lot of these checks were based on
> >> > comparing the SRAT data against the e820.  Now all this code is
> >> > comparing SRAT against SRAT....
> >> >
> >>
> >> I don't see why. The SRAT table passes a range to
> >> e820_register_active_regions() so should be comparing SRAT to e820
> >
> > let me go off and look at e820_register_active_regions() some more.
Things are getting clearer :)

Should be ok.

> > Sure thing.  It is just the hot-add area; I am guessing it is an
> > off-by-one error of some sort.
> >
See below. I do my e820_register_active_regions() call as an else to the if
(hotplug.....!reserve) and the printk is easy to sort out.

I see your pfns are in base 10.  Looks like it considers the last
address to be a present page (an off-by-one thing).
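
To illustrate the kind of off-by-one suspected here (this is only a sketch
of the idea, not what the patched kernel is confirmed to do), compare a
half-open overlap test with one that treats the end PFN as a present page.
Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Half-open test: do [a_start, a_end) and [b_start, b_end) share a page? */
static bool ranges_overlap(uint64_t a_start, uint64_t a_end,
                           uint64_t b_start, uint64_t b_end)
{
    return a_start < b_end && b_start < a_end;
}

/* Off-by-one variant: treats each end PFN as if it were itself present. */
static bool ranges_overlap_inclusive(uint64_t a_start, uint64_t a_end,
                                     uint64_t b_start, uint64_t b_end)
{
    return a_start <= b_end && b_start <= a_end;
}
```

The node 0 hot-add area 470000000-1070000000 is PFNs 4653056 -> 17235968.
Against the active range 1048576 -> 4653056 the half-open test reports no
overlap, while the inclusive variant reports one, which would be enough to
produce a spurious "Hotplug area has existing memory".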

Thanks,
  Keith

Output below
disabling early console
Linux version 2.6.18-rc4-mm3-smp (root@...3a153) (gcc version 4.1.0
(SUSE Linux)) #6 SMP Thu Aug 31 22:06:00 EDT 2006
Command line: root=/dev/sda3
ip=9.47.66.153:9.47.66.169:9.47.66.1:255.255.255.0 resume=/dev/sda2
showopts earlyprintk=ttyS0,115200 console=ttyS0,115200 console=tty0
debug numa=hotadd=100
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 0000000000098400 (usable)
 BIOS-e820: 0000000000098400 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000007ff85e00 (usable)
 BIOS-e820: 000000007ff85e00 - 000000007ff98880 (ACPI data)
 BIOS-e820: 000000007ff98880 - 0000000080000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000470000000 (usable)
 BIOS-e820: 0000001070000000 - 0000001160000000 (usable)
Entering add_active_range(0, 0, 152) 0 entries of 3200 used
Entering add_active_range(0, 256, 524165) 1 entries of 3200 used
Entering add_active_range(0, 1048576, 4653056) 2 entries of 3200 used
Entering add_active_range(0, 17235968, 18219008) 3 entries of 3200 used
end_pfn_map = 18219008
DMI 2.3 present.
ACPI: RSDP (v000 IBM                                   ) @ 0x00000000000fdcf0
ACPI: RSDT (v001 IBM    EXA01ZEU 0x00001000 IBM  0x45444f43) @
0x000000007ff98800
ACPI: FADT (v001 IBM    EXA01ZEU 0x00001000 IBM  0x45444f43) @
0x000000007ff98780
ACPI: MADT (v001 IBM    EXA01ZEU 0x00001000 IBM  0x45444f43) @
0x000000007ff98600
ACPI: SRAT (v001 IBM    EXA01ZEU 0x00001000 IBM  0x45444f43) @
0x000000007ff983c0
ACPI: HPET (v001 IBM    EXA01ZEU 0x00001000 IBM  0x45444f43) @
0x000000007ff98380
ACPI: SSDT (v001 IBM    VIGSSDT0 0x00001000 INTL 0x20030122) @
0x000000007ff90780
ACPI: SSDT (v001 IBM    VIGSSDT1 0x00001000 INTL 0x20030122) @
0x000000007ff88bc0
ACPI: DSDT (v001 IBM    EXA01ZEU 0x00001000 INTL 0x20030122) @
0x0000000000000000
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 0 -> APIC 2 -> Node 0
SRAT: PXM 0 -> APIC 3 -> Node 0
SRAT: PXM 0 -> APIC 38 -> Node 0
SRAT: PXM 0 -> APIC 39 -> Node 0
SRAT: PXM 0 -> APIC 36 -> Node 0
SRAT: PXM 0 -> APIC 37 -> Node 0
SRAT: PXM 1 -> APIC 64 -> Node 1
SRAT: PXM 1 -> APIC 65 -> Node 1
SRAT: PXM 1 -> APIC 66 -> Node 1
SRAT: PXM 1 -> APIC 67 -> Node 1
SRAT: PXM 1 -> APIC 102 -> Node 1
SRAT: PXM 1 -> APIC 103 -> Node 1
SRAT: PXM 1 -> APIC 100 -> Node 1
SRAT: PXM 1 -> APIC 101 -> Node 1
SRAT: Node 0 PXM 0 0-80000000
Entering add_active_range(0, 0, 152) 0 entries of 3200 used
Entering add_active_range(0, 256, 524165) 1 entries of 3200 used
SRAT: Node 0 PXM 0 0-470000000
Entering add_active_range(0, 0, 152) 2 entries of 3200 used
Entering add_active_range(0, 256, 524165) 2 entries of 3200 used
Entering add_active_range(0, 1048576, 4653056) 2 entries of 3200 used
SRAT: Node 0 PXM 0 0-1070000000
reserve_hotadd called with node 0 sart 470000000 end 1070000000
SRAT: Hotplug area has existing memory
Entering add_active_range(0, 0, 152) 3 entries of 3200 used
Entering add_active_range(0, 256, 524165) 3 entries of 3200 used
Entering add_active_range(0, 1048576, 4653056) 3 entries of 3200 used
SRAT: Node 1 PXM 1 1070000000-1160000000
Entering add_active_range(1, 17235968, 18219008) 3 entries of 3200 used
SRAT: Node 1 PXM 1 1070000000-3200000000
reserve_hotadd called with node 1 sart 1160000000 end 3200000000
SRAT: Hotplug area has existing memory
Entering add_active_range(1, 17235968, 18219008) 4 entries of 3200 used
NUMA: Using 28 for the hash shift.
Bootmem setup node 0 0000000000000000-0000001070000000
Bootmem setup node 1 0000001070000000-0000001160000000
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 -> 18219008
early_node_map[4] active PFN ranges
    0:        0 ->      152
    0:      256 ->   524165
    0:  1048576 ->  4653056
    1: 17235968 -> 18219008
On node 0 totalpages: 4128541
0 pages used for SPARSE memmap
1149 pages DMA reserved
  DMA zone: 2843 pages, LIFO batch:0
0 pages used for SPARSE memmap
  DMA32 zone: 520069 pages, LIFO batch:31
0 pages used for SPARSE memmap
  Normal zone: 3604480 pages, LIFO batch:31
On node 1 totalpages: 983040
0 pages used for SPARSE memmap
0 pages used for SPARSE memmap
0 pages used for SPARSE memmap
  Normal zone: 983040 pages, LIFO batch:31
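
The totalpages figures above can be cross-checked against the
early_node_map entries with a small sketch that sums half-open PFN ranges
per node. The struct and the name present_pages() are hypothetical and only
mirror the printed table.

```c
#include <stddef.h>
#include <stdint.h>

struct node_range { int nid; uint64_t start_pfn, end_pfn; };

/* Sum the pages of every [start_pfn, end_pfn) range that belongs to nid. */
static uint64_t present_pages(const struct node_range *map, size_t n, int nid)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        if (map[i].nid == nid)
            total += map[i].end_pfn - map[i].start_pfn;
    return total;
}
```

Fed the four early_node_map entries, this gives 4128541 for node 0 and
983040 for node 1, matching the totalpages lines, so the ranges are being
counted as half-open there.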