Date:   Fri, 9 Oct 2020 13:15:42 +0000
From:   Rahul Gopakumar <gopakumarr@...are.com>
To:     "bhe@...hat.com" <bhe@...hat.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC:     "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "natechancellor@...il.com" <natechancellor@...il.com>,
        "ndesaulniers@...gle.com" <ndesaulniers@...gle.com>,
        "clang-built-linux@...glegroups.com" 
        <clang-built-linux@...glegroups.com>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        Rajender M <manir@...are.com>,
        Yiu Cho Lau <lauyiuch@...are.com>,
        Peter Jonasson <pjonasson@...are.com>,
        Venkatesh Rajaram <rajaramv@...are.com>
Subject: Performance regressions in "boot_time" tests in Linux 5.8 Kernel

As part of VMware's performance regression testing for Linux kernel
upstream releases, we identified a boot time increase when comparing
the Linux 5.8 kernel against the Linux 5.7 kernel. The increase in
boot time is noticeable on VMs with a **large amount of memory**.
 
In our test cases, the increase is noticeable with 1 TB of memory or
more, whereas there was no major difference in test cases with <1 TB.
 
On bisecting between 5.7 and 5.8, we found the following commit from
Baoquan He to be the cause of the boot time increase in the large-VM
test cases.
 
-------------------------------------
 
commit 73a6e474cb376921a311786652782155eac2fdf0
Author: Baoquan He <bhe@...hat.com>
Date: Wed Jun 3 15:57:55 2020 -0700
 
mm: memmap_init: iterate over memblock regions rather that check each PFN
 
When called during boot the memmap_init_zone() function checks if each PFN
is valid and actually belongs to the node being initialized using
early_pfn_valid() and early_pfn_in_nid().
 
Each such check may cost up to O(log(n)) where n is the number of memory
banks, so for large amount of memory overall time spent in early_pfn*()
becomes substantial.
 
-------------------------------------
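 
For context, here is a minimal standalone sketch (not the actual kernel
code) of the two iteration strategies the commit message contrasts.
memmap_init_zone(), early_pfn_valid() and early_pfn_in_nid() are the
real kernel symbols; the region table and helpers below are simplified
stand-ins for illustration only.
 
-------------------------------------
/*
 * Standalone sketch contrasting per-PFN validation with per-region
 * iteration.  "struct region" stands in for a memblock region.
 */
#include <stdbool.h>
#include <stdio.h>

struct region { unsigned long start_pfn, end_pfn; };

static struct region regions[] = {
	{ 0x0000, 0x00a0 },	/* low memory */
	{ 0x0100, 0x4000 },	/* main RAM */
	{ 0x8000, 0x9000 },	/* another bank */
};
#define NR_REGIONS (sizeof(regions) / sizeof(regions[0]))

/* O(log n) binary search per PFN -- the cost the commit message cites */
static bool pfn_in_some_region(unsigned long pfn)
{
	unsigned int lo = 0, hi = NR_REGIONS;

	while (lo < hi) {
		unsigned int mid = (lo + hi) / 2;

		if (pfn < regions[mid].start_pfn)
			hi = mid;
		else if (pfn >= regions[mid].end_pfn)
			lo = mid + 1;
		else
			return true;
	}
	return false;
}

/* Old style: walk every PFN in the zone and validate each one */
static unsigned long init_by_pfn(unsigned long zone_start, unsigned long zone_end)
{
	unsigned long pfn, initialised = 0;

	for (pfn = zone_start; pfn < zone_end; pfn++)
		if (pfn_in_some_region(pfn))	/* O(log n) every iteration */
			initialised++;		/* stand-in for struct page init */
	return initialised;
}

/* New style (73a6e474cb37): iterate the regions, skipping holes entirely */
static unsigned long init_by_region(unsigned long zone_start, unsigned long zone_end)
{
	unsigned long initialised = 0;
	unsigned int i;

	for (i = 0; i < NR_REGIONS; i++) {
		unsigned long start = regions[i].start_pfn > zone_start ?
				      regions[i].start_pfn : zone_start;
		unsigned long end = regions[i].end_pfn < zone_end ?
				    regions[i].end_pfn : zone_end;

		if (start < end)
			initialised += end - start;
	}
	return initialised;
}

int main(void)
{
	printf("per-PFN walk : %lu pages\n", init_by_pfn(0, 0x9000));
	printf("per-region   : %lu pages\n", init_by_region(0, 0x9000));
	return 0;
}
-------------------------------------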
 
For the boot time test, we used RHEL 8.1 as the guest OS.
The VM config is 84 vCPUs and 1 TB of vRAM.
 
Here are the actual performance numbers.
 
5.7 GA - 18.17 secs
Baoquan's commit - 21.6 secs (~16% increase in time)
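(For reference, the delta is 21.6 - 18.17 = 3.43 secs; that is
3.43 / 21.6 ~= 16% of the regressed run, or roughly 19% measured
against the 5.7 baseline.)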
 
From the dmesg logs, we can see a significant time delay around memmap
initialization.
 
Refer to the logs below.
 
Good commit
 
[0.033176] Normal zone: 1445888 pages used for memmap
[0.033176] Normal zone: 89391104 pages, LIFO batch:63
[0.035851] ACPI: PM-Timer IO Port: 0x448
 
Problem commit
 
[0.026874] Normal zone: 1445888 pages used for memmap
[0.026875] Normal zone: 89391104 pages, LIFO batch:63
[2.028450] ACPI: PM-Timer IO Port: 0x448
 
We did some analysis, and it looks like with the problem commit the
memory initialization is not deferred to a later stage; the huge chunk
of memory is instead initialized serially during boot. With the good
commit, initialization of that memory was deferred to a point where it
could be done in parallel.
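 
For illustration only, here is a minimal userspace sketch of that
deferral idea. EARLY_PAGES, NR_NODES and the worker body are made-up
stand-ins; deferred_init_memmap() is the real kernel symbol, and the
actual mechanism (behind CONFIG_DEFERRED_STRUCT_PAGE_INIT) is far more
involved than this.
 
-------------------------------------
/*
 * Sketch of deferred struct page init: early boot initialises only a
 * bounded slice serially, and per-node workers initialise the rest in
 * parallel once SMP is up.  Build with: cc -pthread sketch.c
 */
#include <pthread.h>
#include <stdio.h>

#define TOTAL_PAGES	(1UL << 20)	/* stand-in for a huge zone */
#define EARLY_PAGES	(1UL << 14)	/* only this much touched at "boot" */
#define NR_NODES	4		/* one deferred worker per node */

static unsigned long initialised[NR_NODES];

/* Stand-in for deferred_init_memmap(): each "node" initialises its
 * slice of the remaining pages in parallel, after early boot returned. */
static void *deferred_init(void *arg)
{
	unsigned long node = (unsigned long)arg;
	unsigned long chunk = (TOTAL_PAGES - EARLY_PAGES) / NR_NODES;
	unsigned long pfn;

	for (pfn = 0; pfn < chunk; pfn++)
		initialised[node]++;		/* pretend struct page init */
	return NULL;
}

int main(void)
{
	pthread_t workers[NR_NODES];
	unsigned long node, total = EARLY_PAGES;

	/* Early boot: serial, but bounded -- this is what keeps boot fast. */
	printf("early init  : %lu pages (serial)\n", (unsigned long)EARLY_PAGES);

	/* Later: the bulk of the zone is initialised concurrently. */
	for (node = 0; node < NR_NODES; node++)
		pthread_create(&workers[node], NULL, deferred_init, (void *)node);
	for (node = 0; node < NR_NODES; node++) {
		pthread_join(workers[node], NULL);
		total += initialised[node];
	}
	printf("deferred    : %lu pages across %d workers\n",
	       total - EARLY_PAGES, NR_NODES);
	return 0;
}
-------------------------------------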


Rahul Gopakumar
Performance Engineering
VMware, Inc.
