linux-kernel - Re: [PATCH] perf: optimize clear page in Intel specified model with movq instruction


Date:   Thu, 9 Sep 2021 18:34:40 +0800
From:   Luming Yu <luming.yu@...il.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     Jinhua Wu <wujinhua@...ux.alibaba.com>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        zelin.deng@...ux.alibaba.com, jiayu.ni@...ux.alibaba.com,
        Andi Kleen <ak@...ux.intel.com>,
        Luming Yu <luming.yu@...el.com>, fan.du@...el.com,
        artie.ding@...ux.alibaba.com, "Luck, Tony" <tony.luck@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        pawan.kumar.gupta@...ux.intel.com,
        "Yu, Fenghua" <fenghua.yu@...el.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        ricardo.neri-calderon@...ux.intel.com,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] perf: optimize clear page in Intel specified model with
 movq instruction

On Thu, Sep 9, 2021 at 5:41 PM Borislav Petkov <bp@...en8.de> wrote:
>
> On Thu, Sep 09, 2021 at 04:45:51PM +0800, Jinhua Wu wrote:
> > Clearing a page is the most time-consuming step in page fault handling.
> > The kernel uses fast-string instructions to clear a page. We found that on
> > specific Intel models such as CPX and ICX, the movq instruction performs much
> > better than the fast-string instructions when the corresponding page is not in
> > cache. But when the page is in cache, fast-string performs better. We show the
> > test results below:
>
> What you should do is show the extensive tests you've run with
> real-world benchmarks where you really can show 40% performance
> improvement.
>
> Also, the static branch "approach" you're using ain't gonna happen. If
> anything, another X86_FEATURE_* bit.

Do you mean that the jump label would not be patched to a nop when its key
is enabled, so we could not use it in certain functions?
I don't understand exactly what "ain't gonna happen" means.
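
For reference, the two clearing strategies compared in the quoted patch
description can be sketched in user space roughly as follows. This is only
an illustration, not the kernel's clear_page code and not the code from the
patch: DEMO_PAGE_SIZE, clear_page_string(), clear_page_movq_like(),
select_clear_page() and page_is_zero() are hypothetical names, memset()
merely stands in for the fast-string path, and the one-time selection by a
flag is an assumption meant to resemble choosing an implementation per CPU
model.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 4 KiB pages, as on x86-64 with the default page size. */
#define DEMO_PAGE_SIZE 4096u

typedef void (*clear_page_fn)(void *page);

/*
 * Stand-in for the fast-string path. In the kernel this would be a
 * rep stos sequence; here the C library's memset() is used instead.
 */
static void clear_page_string(void *page)
{
    memset(page, 0, DEMO_PAGE_SIZE);
}

/*
 * Stand-in for the movq path: eight 64-bit stores (one 64-byte cache
 * line) per loop iteration. The page must be 8-byte aligned.
 * A compiler may turn this loop back into a memset() call, so this
 * shows the structure only and says nothing about performance.
 */
static void clear_page_movq_like(void *page)
{
    uint64_t *p = page;
    size_t i;

    for (i = 0; i < DEMO_PAGE_SIZE / sizeof(uint64_t); i += 8) {
        p[i + 0] = 0;
        p[i + 1] = 0;
        p[i + 2] = 0;
        p[i + 3] = 0;
        p[i + 4] = 0;
        p[i + 5] = 0;
        p[i + 6] = 0;
        p[i + 7] = 0;
    }
}

/*
 * Hypothetical one-time selection, made once at startup rather than
 * on every call. The flag plays the role of "this CPU model prefers
 * plain stores".
 */
static clear_page_fn select_clear_page(int prefer_plain_stores)
{
    return prefer_plain_stores ? clear_page_movq_like : clear_page_string;
}

/* Returns 1 if every byte of the page is zero, 0 otherwise. */
static int page_is_zero(const void *page)
{
    const unsigned char *b = page;
    size_t i;

    for (i = 0; i < DEMO_PAGE_SIZE; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

A caller would pick the function once, e.g. fn = select_clear_page(1), and
then call fn(page) for each page. Both variants leave the page zeroed; the
difference under discussion is only how fast they do so for cached versus
uncached pages, which this sketch does not measure.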
>
> Good luck.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette