Message-ID: <878tvblbhq.fsf@e105922-lin.cambridge.arm.com>
Date:   Thu, 01 Sep 2016 19:29:37 +0100
From:   Punit Agrawal <punit.agrawal@....com>
To:     Will Deacon <will.deacon@....com>
Cc:     kvm@...r.kernel.org, Marc Zyngier <marc.zyngier@....com>,
        linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
        Ingo Molnar <mingo@...hat.com>, kvmarm@...ts.cs.columbia.edu,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions

Will Deacon <will.deacon@....com> writes:

> On Fri, Aug 26, 2016 at 10:37:08AM +0100, Punit Agrawal wrote:
>> > Will Deacon <will.deacon@....com> writes:
>> >> The easiest thing to do is just TLBI VMALLE1IS for all trapped operations,
>> >> but you might want to see how that performs.
>> >
>> > That sounds reasonable for correctness. But I suspect we'll have to do
>> > more to claw back some performance. Let me run a few tests and come back
>> > on this.
>> 
>> Assuming I've correctly switched in TCR and replacing the various TLB
>> operations in this patch with TLBI VMALLE1IS, there is a drop in kernel
>> build times of ~5% (384s vs 363s).
>
> What do you mean by "switched in TCR"? Why is that necessary if you just
> nuke the whole thing?

You're right, it's not necessary. I'd misunderstood how TCR affects
things and was switching it in for the above tests.

> Is the ~5% relative to no trapping at all, or
> trapping, but being selective about the operation?

The reported number was relative to trapping and being selective about
the operation. But I hadn't been careful in ensuring identical
conditions (page caches, etc.) when running the numbers.

So I've done a fresh set of measurements under identical conditions by
running "time make -j 7" in a VM booted with 7 vcpus, and I see the
following results:

1. no trapping ~ 365s
2. traps using selective tlb operations ~ 371s
3. traps that nuke all stage 1 (tlbi vmalle1is) ~ 393s

So based on these measurements, 2. and 3. are ~1.6% and ~7.7% slower,
respectively, than the base case of no trapping at all.
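
For anyone wanting to re-derive the relative slowdowns from the three
timings above, here is a minimal sketch in Python. The dictionary keys
and the helper name overhead_percent are made up for illustration; only
the three build times come from the measurements in this mail.

```python
# Build times in seconds, copied from the three measurements above.
# The labels are illustrative shorthand, not identifiers from the patch.
timings = {
    "no trapping": 365,
    "traps, selective tlb operations": 371,
    "traps, tlbi vmalle1is": 393,
}


def overhead_percent(baseline, measured):
    """Return how much slower `measured` is than `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


baseline = timings["no trapping"]
for name, seconds in timings.items():
    slowdown = overhead_percent(baseline, seconds)
    print(f"{name}: {seconds}s ({slowdown:+.1f}% vs no trapping)")
```

These are single runs, so differences of a percent or so may well be
within run-to-run noise.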

Thanks,
Punit

>
> Will
> _______________________________________________
> kvmarm mailing list
> kvmarm@...ts.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
