linux-kernel - Re: [RFC v2 PATCH 0/4] speed up page allocation for __GFP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20201222122312.GH874@casper.infradead.org>
Date:   Tue, 22 Dec 2020 12:23:12 +0000
From:   Matthew Wilcox <willy@...radead.org>
To:     Alexander Duyck <alexander.h.duyck@...ux.intel.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Dan Williams <dan.j.williams@...el.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        David Hildenbrand <david@...hat.com>,
        Jason Wang <jasowang@...hat.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Michal Hocko <mhocko@...e.com>,
        Liang Li <liliangleo@...iglobal.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org
Subject: Re: [RFC v2 PATCH 0/4] speed up page allocation for __GFP_ZERO

On Mon, Dec 21, 2020 at 11:25:22AM -0500, Liang Li wrote:
> Creating a VM [64G RAM, 32 CPUs] with GPU passthrough
> =====================================================
> QEMU use 4K pages, THP is off
>                   round1      round2      round3
> w/o this patch:    23.5s       24.7s       24.6s
> w/ this patch:     10.2s       10.3s       11.2s
> 
> QEMU use 4K pages, THP is on
>                   round1      round2      round3
> w/o this patch:    17.9s       14.8s       14.9s
> w/ this patch:     1.9s        1.8s        1.9s
> =====================================================

The cost of zeroing pages has to be paid somewhere.  You've successfully
moved it out of this path that you can measure.  So now you've put it
somewhere that you're not measuring.  Why is this a win?

> Speed up kernel routine
> =======================
> This can’t be guaranteed because we don’t pre zero out all the free pages,
> but is true for most case. It can help to speed up some important system
> call just like fork, which will allocate zero pages for building page
> table. And speed up the process of page fault, especially for huge page
> fault. The POC of Hugetlb free page pre zero out has been done.

Try kernbench with and without your patch.