Date:   Fri, 11 Dec 2020 16:16:35 +0000
From:   Rahul Gopakumar <gopakumarr@...are.com>
To:     "bhe@...hat.com" <bhe@...hat.com>
CC:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "natechancellor@...il.com" <natechancellor@...il.com>,
        "ndesaulniers@...gle.com" <ndesaulniers@...gle.com>,
        "clang-built-linux@...glegroups.com" 
        <clang-built-linux@...glegroups.com>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        Rajender M <manir@...are.com>,
        Yiu Cho Lau <lauyiuch@...are.com>,
        Peter Jonasson <pjonasson@...are.com>,
        Venkatesh Rajaram <rajaramv@...are.com>,
        Mike Rapoport <rppt@...nel.org>
Subject: Re: Performance regressions in "boot_time" tests in Linux 5.8 Kernel

Hi Baoquan,

We re-evaluated your last patch and it appears to fix the
performance bug initially reported. During our previous testing,
we did not apply the patch correctly, which is why we saw
some issues.

Here is the dmesg log confirming no delay in the draft patch.

Vanilla (5.10 rc3)
------------------

[    0.024011] On node 2 totalpages: 89391104
[    0.024012]   Normal zone: 1445888 pages used for memmap
[    0.024012]   Normal zone: 89391104 pages, LIFO batch:63
[    2.054646] ACPI: PM-Timer IO Port: 0x448 --------------> 2 secs delay

Patch
------

[    0.024166] On node 2 totalpages: 89391104
[    0.024167]   Normal zone: 1445888 pages used for memmap
[    0.024167]   Normal zone: 89391104 pages, LIFO batch:63
[    0.026694] ACPI: PM-Timer IO Port: 0x448 --------------> No delay
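
The delay annotated in the two excerpts above is the gap between
consecutive dmesg timestamps: 2.054646 - 0.024012 is about 2.03 s on
vanilla, versus 0.026694 - 0.024167, about 0.0025 s, with the patch.
A minimal C sketch of that arithmetic follows. The helper names
parse_dmesg_ts and largest_gap are hypothetical, and the sketch
assumes every line of interest starts with the usual
"[ seconds.microseconds]" prefix.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Parse the "[    0.024011]" style prefix of a dmesg line.
 * Returns 1 and stores the timestamp (in seconds) in *ts on success,
 * 0 if the line carries no such prefix.
 * (Hypothetical helper, not part of any kernel tool.)
 */
static int parse_dmesg_ts(const char *line, double *ts)
{
    return sscanf(line, " [ %lf", ts) == 1;
}

/*
 * Return the largest gap, in seconds, between consecutive timestamped
 * lines. If 'where' is non-NULL and a gap was found, it receives the
 * index of the line that follows the largest gap.
 */
static double largest_gap(const char *const *lines, size_t n, size_t *where)
{
    double prev = 0.0, cur = 0.0, best = 0.0;
    int have_prev = 0;

    for (size_t i = 0; i < n; i++) {
        if (!parse_dmesg_ts(lines[i], &cur))
            continue;               /* skip lines without a timestamp */
        if (have_prev && cur - prev > best) {
            best = cur - prev;
            if (where)
                *where = i;
        }
        prev = cur;
        have_prev = 1;
    }
    return best;
}
```

Fed the four vanilla lines, largest_gap points at the ACPI line with
a gap of roughly 2.03 s. Fed the four patched lines, the largest gap
is roughly 0.0025 s.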

Attached dmesg logs. Let me know if anything is needed from our end.



From: Rahul Gopakumar <gopakumarr@...are.com>
Sent: 24 November 2020 8:33 PM
To: bhe@...hat.com <bhe@...hat.com>
Cc: linux-mm@...ck.org <linux-mm@...ck.org>; linux-kernel@...r.kernel.org <linux-kernel@...r.kernel.org>; akpm@...ux-foundation.org <akpm@...ux-foundation.org>; natechancellor@...il.com <natechancellor@...il.com>; ndesaulniers@...gle.com <ndesaulniers@...gle.com>; clang-built-linux@...glegroups.com <clang-built-linux@...glegroups.com>; rostedt@...dmis.org <rostedt@...dmis.org>; Rajender M <manir@...are.com>; Yiu Cho Lau <lauyiuch@...are.com>; Peter Jonasson <pjonasson@...are.com>; Venkatesh Rajaram <rajaramv@...are.com>
Subject: Re: Performance regressions in "boot_time" tests in Linux 5.8 Kernel 
 
Hi Baoquan,

We applied the new patch to 5.10 rc3 and tested it. We are still
observing the same page corruption issue that we saw with the
old patch. This is causing a 3-second delay in boot time.

Attached dmesg log from the new patch and also from vanilla
5.10 rc3 kernel.

There are multiple lines like below in the dmesg log of the
new patch.

"BUG: Bad page state in process swapper  pfn:ab08001"

________________________________________
From: bhe@...hat.com <bhe@...hat.com>
Sent: 22 November 2020 6:38 AM
To: Rahul Gopakumar
Cc: linux-mm@...ck.org; linux-kernel@...r.kernel.org; akpm@...ux-foundation.org; natechancellor@...il.com; ndesaulniers@...gle.com; clang-built-linux@...glegroups.com; rostedt@...dmis.org; Rajender M; Yiu Cho Lau; Peter Jonasson; Venkatesh Rajaram
Subject: Re: Performance regressions in "boot_time" tests in Linux 5.8 Kernel

On 11/20/20 at 03:11am, Rahul Gopakumar wrote:
> Hi Baoquan,
>
> To which commit should we apply the draft patch? We tried applying
> the patch to the commit 3e4fb4346c781068610d03c12b16c0cfb0fd24a3
> (the one we used for applying the previous patch) but it fails.

I tested on 5.10-rc3+. You can append the change below to the old patch in
your testing kernel.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fa6076e1a840..5e5b74e88d69 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -448,6 +448,8 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
        if (end_pfn < pgdat_end_pfn(NODE_DATA(nid)))
                return false;

+       if (NODE_DATA(nid)->first_deferred_pfn != ULONG_MAX)
+               return true;
        /*
         * We start only with one section of pages, more pages are added as
         * needed until the rest of deferred pages are initialized.
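
To make the effect of the two added lines easier to follow, here is a
simplified userspace model of the check. Only the
first_deferred_pfn != ULONG_MAX test comes from the hunk above. The
struct, the function name defer_init_model, the per-node counter and
the tiny section size are assumptions made for illustration, and the
rest of the body is a guess at the surrounding logic based on the
context lines, not the kernel's actual code.

```c
#include <limits.h>
#include <stdbool.h>

/* Illustrative value only; the real PAGES_PER_SECTION is far larger. */
#define MODEL_PAGES_PER_SECTION 8UL

/* Hypothetical stand-in for the per-node state the hunk touches. */
struct node_model {
    unsigned long end_pfn;            /* end of the node's pfn range */
    unsigned long first_deferred_pfn; /* ULONG_MAX until deferral starts */
    unsigned long nr_initialised;     /* pages initialised eagerly so far */
};

/* Returns true if initialisation of 'pfn' should be deferred. */
static bool defer_init_model(struct node_model *node, unsigned long pfn,
                             unsigned long zone_end_pfn)
{
    /* Zones that end before the node does are never deferred. */
    if (zone_end_pfn < node->end_pfn)
        return false;

    /*
     * The check added by the hunk: once a deferral point has been
     * recorded for this node, every later call defers immediately.
     */
    if (node->first_deferred_pfn != ULONG_MAX)
        return true;

    /*
     * Assumed surrounding logic: initialise about one section eagerly,
     * then record the first section-aligned pfn as the deferral point.
     */
    node->nr_initialised++;
    if (node->nr_initialised > MODEL_PAGES_PER_SECTION &&
        (pfn & (MODEL_PAGES_PER_SECTION - 1)) == 0) {
        node->first_deferred_pfn = pfn;
        return true;
    }
    return false;
}
```

In this model, with first_deferred_pfn starting at ULONG_MAX, pfns 0
to 7 are initialised eagerly, pfn 8 records the deferral point, and
pfn 9 is then deferred by the added check. Without that check, the
model would fall through and initialise pfn 9 eagerly because it is
not section-aligned.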

Download attachment "patch-dmesg.log" of type "application/octet-stream" (140111 bytes)

Download attachment "vanilla-dmesg.log" of type "application/octet-stream" (140126 bytes)
