Date:   Mon, 15 Mar 2021 14:04:09 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Mike Rapoport <rppt@...ux.ibm.com>
Cc:     "Liang, Liang (Leo)" <Liang.Liang@....com>,
        "Deucher, Alexander" <Alexander.Deucher@....com>,
        linux-kernel@...r.kernel.org,
        amd-gfx list <amd-gfx@...ts.freedesktop.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Huang, Ray" <Ray.Huang@....com>,
        "Koenig, Christian" <Christian.Koenig@....com>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        George Kennedy <george.kennedy@...cle.com>
Subject: Re: slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()")

On 13.03.21 14:48, Mike Rapoport wrote:
> Hi,
> 
> On Sat, Mar 13, 2021 at 10:05:23AM +0100, David Hildenbrand wrote:
>>> On 13.03.2021 at 05:04, Liang, Liang (Leo) <Liang.Liang@....com> wrote:
>>>
>>> Hi David,
>>>
>>> Which benchmark tool do you prefer? Memtest86+ or something else?
>>
>> Hi Leo,
>>
>> I think you want something that runs under Linux natively.
>>
>> I'm planning on coding up a kernel module that walks all 4 MiB pages in
>> the freelists and runs a STREAM benchmark on each individually. Then we
>> might be able to identify the problematic range - if there is a
>> problematic range :)
> 
> My wild guess would be that the pages that are now at the head of the
> free lists have the wrong caching attributes set. Might be worth checking
> in your test module.
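
That's worth checking. One quick way to eyeball it on x86 (just an idea,
nothing I have verified on the affected machine) would be to compare the
suspect physical range against the MTRR layout, and against the PAT
tracking in case something explicitly remapped it:

$ cat /proc/mtrr
$ cat /sys/kernel/debug/x86/pat_memtype_list

Plain RAM usually won't show up in the PAT list unless it was remapped, so
/proc/mtrr is probably the more interesting one.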

I hacked something up real quick:

https://github.com/davidhildenbrand/kstream

Only briefly tested inside a VM. The output looks something like this:

[...]
[ 8396.432225] [0x0000000045800000 - 0x0000000045bfffff] 25322 MB/s / 38948 MB/s
[ 8396.448749] [0x0000000045c00000 - 0x0000000045ffffff] 24481 MB/s / 38946 MB/s
[ 8396.465197] [0x0000000046000000 - 0x00000000463fffff] 24892 MB/s / 39170 MB/s
[ 8396.481552] [0x0000000046400000 - 0x00000000467fffff] 25222 MB/s / 39156 MB/s
[ 8396.498012] [0x0000000046800000 - 0x0000000046bfffff] 24416 MB/s / 39159 MB/s
[ 8396.514397] [0x0000000046c00000 - 0x0000000046ffffff] 25469 MB/s / 38940 MB/s
[ 8396.530849] [0x0000000047000000 - 0x00000000473fffff] 24885 MB/s / 38734 MB/s
[ 8396.547195] [0x0000000047400000 - 0x00000000477fffff] 25458 MB/s / 38941 MB/s
[...]

The benchmark allocates one 4 MiB chunk at a time and runs a simplified
STREAM benchmark on it, a) without flushing caches and b) with the caches
flushed before every memory access.
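
For reference, the core idea is roughly the following - a from-scratch
sketch of a single iteration, not the actual code from the repo above (the
names are made up, and it is x86-only because of clflush_cache_range()):

#include <linux/module.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/io.h>
#include <linux/ktime.h>
#include <linux/string.h>
#include <asm/cacheflush.h>

#define CHUNK_ORDER	10			/* 2^10 pages * 4 KiB = 4 MiB */
#define CHUNK_SIZE	(PAGE_SIZE << CHUNK_ORDER)
#define HALF		(CHUNK_SIZE / 2)

/* Copy the first half of the chunk into the second half, report MB/s. */
static u64 copy_mbps(void *dst, void *src, bool flush)
{
	u64 start, delta;

	if (flush) {
		/* Push both halves out of the cache hierarchy first. */
		clflush_cache_range(src, HALF);
		clflush_cache_range(dst, HALF);
	}
	start = ktime_get_ns();
	memcpy(dst, src, HALF);
	delta = ktime_get_ns() - start;

	/* 1 byte/ns == 1000 MB/s */
	return delta ? (u64)HALF * 1000 / delta : 0;
}

static int __init kstream_sketch_init(void)
{
	struct page *page;
	phys_addr_t start, end;
	void *buf;

	page = alloc_pages(GFP_KERNEL, CHUNK_ORDER);
	if (!page)
		return -ENOMEM;

	buf = page_address(page);
	start = page_to_phys(page);
	end = start + CHUNK_SIZE - 1;

	/* Touch everything once so the unflushed run sees warm caches. */
	memset(buf, 0, CHUNK_SIZE);

	pr_info("[%pa - %pa] %llu MB/s / %llu MB/s\n", &start, &end,
		copy_mbps(buf + HALF, buf, false),
		copy_mbps(buf + HALF, buf, true));

	__free_pages(page, CHUNK_ORDER);
	return 0;
}

static void __exit kstream_sketch_exit(void)
{
}

module_init(kstream_sketch_init);
module_exit(kstream_sketch_exit);
MODULE_LICENSE("GPL");

The actual module just repeats that for one 4 MiB chunk after another, so
the printed physical ranges end up covering most of what currently sits in
the freelists.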

It would be great if you could run that on a kernel with the *old behavior*
(IOW, without 7fef431be9c9), so we might still be lucky and catch the
problematic area while it is in the freelist.

Let's see if that will indicate anything.

-- 
Thanks,

David / dhildenb
