Date:   Wed, 28 Mar 2018 08:30:12 +0800
From:   Wei Yang <richard.weiyang@...il.com>
To:     Jia He <hejianet@...il.com>
Cc:     Wei Yang <richard.weiyang@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Mel Gorman <mgorman@...e.de>,
        Will Deacon <will.deacon@....com>,
        Mark Rutland <mark.rutland@....com>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Pavel Tatashin <pasha.tatashin@...cle.com>,
        Daniel Jordan <daniel.m.jordan@...cle.com>,
        AKASHI Takahiro <takahiro.akashi@...aro.org>,
        Gioh Kim <gi-oh.kim@...fitbricks.com>,
        Steven Sistare <steven.sistare@...cle.com>,
        Daniel Vacek <neelx@...hat.com>,
        Eugeniu Rosca <erosca@...adit-jv.com>,
        Vlastimil Babka <vbabka@...e.cz>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, James Morse <james.morse@....com>,
        Steve Capper <steve.capper@....com>, x86@...nel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Kate Stewart <kstewart@...uxfoundation.org>,
        Philippe Ombredanne <pombredanne@...b.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Kemi Wang <kemi.wang@...el.com>,
        Petr Tesarik <ptesarik@...e.com>,
        YASUAKI ISHIMATSU <yasu.isimatu@...il.com>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Nikolay Borisov <nborisov@...e.com>
Subject: Re: [PATCH v3 0/5] optimize memblock_next_valid_pfn and
 early_pfn_valid

On Tue, Mar 27, 2018 at 03:15:08PM +0800, Jia He wrote:
>
>
>On 3/27/2018 9:02 AM, Wei Yang Wrote:
>> On Sun, Mar 25, 2018 at 08:02:14PM -0700, Jia He wrote:
>> > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> > where possible") tried to optimize the loop in memmap_init_zone(). But
>> > there is still some room for improvement.
>> > 
>> > Patch 1 retains memblock_next_valid_pfn() when CONFIG_HAVE_ARCH_PFN_VALID
>> >         is enabled
>> > Patch 2 optimizes memblock_next_valid_pfn()
>> > Patches 3~5 optimize early_pfn_valid(); I had to split them into parts
>> >         because the changes span several subsystems.
>> > 
>> > I tested the pfn loop process in memmap_init(); it behaves the same as before.
>> > As for the performance improvement, after this set I can see the time
>> > overhead of memmap_init() reduced from 41313 us to 24345 us on my
>> > armv8a server (QDF2400 with 96G of memory).
>> > 
>> > Attached is the memblock region information from my server.
>> > [   86.956758] Zone ranges:
>> > [   86.959452]   DMA      [mem 0x0000000000200000-0x00000000ffffffff]
>> > [   86.966041]   Normal   [mem 0x0000000100000000-0x00000017ffffffff]
>> > [   86.972631] Movable zone start for each node
>> > [   86.977179] Early memory node ranges
>> > [   86.980985]   node   0: [mem 0x0000000000200000-0x000000000021ffff]
>> > [   86.987666]   node   0: [mem 0x0000000000820000-0x000000000307ffff]
>> > [   86.994348]   node   0: [mem 0x0000000003080000-0x000000000308ffff]
>> > [   87.001029]   node   0: [mem 0x0000000003090000-0x00000000031fffff]
>> > [   87.007710]   node   0: [mem 0x0000000003200000-0x00000000033fffff]
>> > [   87.014392]   node   0: [mem 0x0000000003410000-0x000000000563ffff]
>> > [   87.021073]   node   0: [mem 0x0000000005640000-0x000000000567ffff]
>> > [   87.027754]   node   0: [mem 0x0000000005680000-0x00000000056dffff]
>> > [   87.034435]   node   0: [mem 0x00000000056e0000-0x00000000086fffff]
>> > [   87.041117]   node   0: [mem 0x0000000008700000-0x000000000871ffff]
>> > [   87.047798]   node   0: [mem 0x0000000008720000-0x000000000894ffff]
>> > [   87.054479]   node   0: [mem 0x0000000008950000-0x0000000008baffff]
>> > [   87.061161]   node   0: [mem 0x0000000008bb0000-0x0000000008bcffff]
>> > [   87.067842]   node   0: [mem 0x0000000008bd0000-0x0000000008c4ffff]
>> > [   87.074524]   node   0: [mem 0x0000000008c50000-0x0000000008e2ffff]
>> > [   87.081205]   node   0: [mem 0x0000000008e30000-0x0000000008e4ffff]
>> > [   87.087886]   node   0: [mem 0x0000000008e50000-0x0000000008fcffff]
>> > [   87.094568]   node   0: [mem 0x0000000008fd0000-0x000000000910ffff]
>> > [   87.101249]   node   0: [mem 0x0000000009110000-0x00000000092effff]
>> > [   87.107930]   node   0: [mem 0x00000000092f0000-0x000000000930ffff]
>> > [   87.114612]   node   0: [mem 0x0000000009310000-0x000000000963ffff]
>> > [   87.121293]   node   0: [mem 0x0000000009640000-0x000000000e61ffff]
>> > [   87.127975]   node   0: [mem 0x000000000e620000-0x000000000e64ffff]
>> > [   87.134657]   node   0: [mem 0x000000000e650000-0x000000000fffffff]
>> > [   87.141338]   node   0: [mem 0x0000000010800000-0x0000000017feffff]
>> > [   87.148019]   node   0: [mem 0x000000001c000000-0x000000001c00ffff]
>> > [   87.154701]   node   0: [mem 0x000000001c010000-0x000000001c7fffff]
>> > [   87.161383]   node   0: [mem 0x000000001c810000-0x000000007efbffff]
>> > [   87.168064]   node   0: [mem 0x000000007efc0000-0x000000007efdffff]
>> > [   87.174746]   node   0: [mem 0x000000007efe0000-0x000000007efeffff]
>> > [   87.181427]   node   0: [mem 0x000000007eff0000-0x000000007effffff]
>> > [   87.188108]   node   0: [mem 0x000000007f000000-0x00000017ffffffff]
>> Hi, Jia
>> 
>> I haven't taken a deep look into your code yet, just one curious question
>> about your memory layout.
>> 
>> The log above is printed from free_area_init_nodes(), which iterates over
>> memblock.memory and prints each region. If I am not wrong, memory regions
>> added to memblock.memory are kept ordered and merged where possible.
>> 
>> Yet from your log I see many regions that look like they could be merged but
>> are kept separate. For example, the last two regions:
>> 
>>    node   0: [mem 0x000000007eff0000-0x000000007effffff]
>>    node   0: [mem 0x000000007f000000-0x00000017ffffffff]
>> 
>> So I am curious why they are kept separate instead of being combined into one.
>> 
>> From the code, the possible reason is that the regions' flags differ from
>> each other. If you have time, would you mind taking a look into this?
>> 
>Hi Wei
>I think these two have different flags:
>[    0.000000] idx=30,region [7eff0000:10000]flag=4     <--- aka MEMBLOCK_NOMAP
>[    0.000000]   node   0: [mem 0x000000007eff0000-0x000000007effffff]
>[    0.000000] idx=31,region [7f000000:81000000]flag=0 <--- aka MEMBLOCK_NONE
>[    0.000000]   node   0: [mem 0x000000007f000000-0x00000017ffffffff]

Thanks.

Hmm, I am not that familiar with those flags, but they look like they indicate
a physical property of the range:

	MEMBLOCK_NONE		no special request
	MEMBLOCK_HOTPLUG	hotpluggable
	MEMBLOCK_MIRROR		highly reliable (mirrored)
	MEMBLOCK_NOMAP		excluded from the kernel direct mapping
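
For reference, this is roughly how the flags are defined in
include/linux/memblock.h (quoted from memory of the current sources, so the
comments may differ slightly); it also matches the flag=4 (NOMAP) vs. flag=0
values you printed above:

	/* Definition of memblock flags (include/linux/memblock.h) */
	enum {
		MEMBLOCK_NONE		= 0x0,	/* no special request */
		MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
		MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
		MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
	};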

However, these flags are not set when a region is first added to
memblock.memory. If you look at memblock_add_range(), the flags argument (the
last parameter) is always passed as 0. This means the currently separated
ranges reflect the layout of the physical memory itself.
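
Just to illustrate (a sketch of the add path as I read mm/memblock.c, with the
debug print dropped):

	/* mm/memblock.c: new memory is always added with flags == 0 */
	int __init_memblock memblock_add(phys_addr_t base, phys_addr_t size)
	{
		return memblock_add_range(&memblock.memory, base, size,
					  MAX_NUMNODES, 0);
	}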

Then why is this layout so scattered? As you can see, several ranges are
smaller than 1MB.

If, and this is just my assumption, we could merge some of them, we might get
better performance: fewer ranges means less time spent searching them.
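
For completeness, here is the merge rule as I understand it, condensed from
memblock_merge_regions() in mm/memblock.c into a hypothetical helper
(can_merge() is only my shorthand, not a real kernel function):

	/* Two neighbouring regions are merged only if all three checks hold. */
	static bool can_merge(const struct memblock_region *this,
			      const struct memblock_region *next)
	{
		return this->base + this->size == next->base &&   /* contiguous */
		       memblock_get_region_node(this) ==
		       memblock_get_region_node(next) &&          /* same node  */
		       this->flags == next->flags;                /* same flags */
	}

So two adjacent ranges stay separate whenever their flags differ (like the
NOMAP range above) or there is a hole between them, no matter how small they
are.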

>
>-- 
>Cheers,
>Jia

-- 
Wei Yang
Help you, Help me
