linux-kernel - Re: [lkp-robot] [x86/mm] c4c3c3c2d0: will-it-scale.per_process


Date:   Tue, 17 Oct 2017 14:04:40 +0800
From:   Ye Xiaolong <xiaolong.ye@...el.com>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Borislav Petkov <bp@...en8.de>, X86 ML <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Markus Trippelsdorf <markus@...ppelsdorf.de>,
        Adam Borowski <kilobyte@...band.pl>,
        Brian Gerst <brgerst@...il.com>,
        Johannes Hirte <johannes.hirte@...enkhaos.de>, LKP <lkp@...org>
Subject: Re: [lkp-robot] [x86/mm] c4c3c3c2d0: will-it-scale.per_process_ops
 -61.0% regression

Hi, Andy

On 10/16, Andy Lutomirski wrote:
>On Mon, Oct 16, 2017 at 3:15 AM, Borislav Petkov <bp@...en8.de> wrote:
>> On Mon, Oct 16, 2017 at 10:39:17AM +0800, kernel test robot wrote:
>>>
>>> Greeting,
>>>
>>> FYI, we noticed a -61.0% regression of will-it-scale.per_process_ops due to commit:
>>>
>>>
>>> commit: c4c3c3c2d00826c88b5c02c20e80704664424b9b ("x86/mm: Flush more aggressively in lazy TLB mode")
>>> url: https://github.com/0day-ci/linux/commits/Borislav-Petkov/x86-mm-Flush-more-aggressively-in-lazy-TLB-mode/20171011-115901
>>>
>>>
>>> in testcase: will-it-scale
>>> on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
>>
>> Say what now?
>>
>> This is actually what got applied upstream:
>>
>> b956575bed91 ("x86/mm: Flush more aggressively in lazy TLB mode")
>>
>> and AFAICT, that machine is BDW and it should have PCID, right?
>>
>> Or wait, that's a guest so PCID is probably not even usable for guests.
>> Or should we disable it in VMs?
>
>PCID works on new versions of KVM, at least, depending on configuration.
>
>On a PCID machine, with this patch applied, we are still switching CR3
>when we go idle (which is presumably what we're hitting here) -- we're
>just not flushing anything.  The main cost seems to come from
>serialization.  On my laptop, I think I measured about 80 ns per
>non-flushing CR3 load if I do it in a loop.  The cost is larger when
>it's not in a loop because the pipeline is fuller, I assume.  We also
>take a bit of a hit because switch_mm is a bit complex.  I have a
>patch here to try to optimize it:
>
>https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/fixes&id=1caf24d080dac8b9f952600d1e91879aa782131c
>
>On a non-PCID machine, this patch will increase IPIs, which doesn't
>seem to be what we're seeing.
>
>The test in question is basically the same thing as a test I ran with
>very little in the way of visible regression.  I'm wondering if the
>real problem is some NUMA oddity.
>
>Xiaolong, can you send us /proc/cpuinfo on this kernel on the test
>machine that's seeing this problem?

The /proc/cpuinfo from the tested kernel on the test machine is attached.

Thanks,
Xiaolong
>
>--Andy

View attachment "cpuinfo-c4c3c3c2d" of type "text/plain" (102512 bytes)
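
For anyone checking the attached cpuinfo for the PCID question raised above, here is a minimal sketch in Python. It assumes the usual x86 /proc/cpuinfo layout ("processor" and "flags" lines, colon-separated). The helper name cpus_missing_flag is hypothetical and is not part of any kernel or LKP tooling.

```python
def cpus_missing_flag(cpuinfo_text, flag="pcid"):
    """Return the processor indices whose 'flags' line lacks `flag`.

    Assumes x86-style /proc/cpuinfo text: one 'processor : N' line
    followed later by a 'flags : ...' line for each logical CPU.
    """
    missing = []
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU blocks
        key = key.strip()
        if key == "processor":
            current = int(value.strip())
        elif key == "flags":
            # Compare whole tokens so "pcid" does not match "invpcid".
            if flag not in value.split():
                missing.append(current)
    return missing


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read a cpuinfo file; the path may also point at a saved attachment."""
    with open(path) as f:
        return f.read()
```

Calling cpus_missing_flag(read_cpuinfo("cpuinfo-c4c3c3c2d")) on a saved copy of the attachment would list the logical CPUs that do not advertise pcid. An empty list means every CPU reports the flag, which only shows that the flag is exposed, not that the kernel is actually using PCID.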