Date:   Fri, 31 Aug 2018 21:50:04 +1000
From:   Nicholas Piggin <npiggin@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Will Deacon <will.deacon@....com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Benjamin Herrenschmidt <benh@....ibm.com>,
        Catalin Marinas <catalin.marinas@....com>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
Subject: Re: [PATCH 00/12] Avoid synchronous TLB invalidation for
 intermediate page-table entries on arm64

On Fri, 31 Aug 2018 12:49:45 +0200
Peter Zijlstra <peterz@...radead.org> wrote:

> On Fri, Aug 31, 2018 at 08:32:34PM +1000, Nicholas Piggin wrote:
> > Oh gee, I suppose. powerpc hash is kind of interesting because it's
> > crazy; Aneesh knows that code a lot better than I do. Radix, modulo
> > some minor details of exact instructions, is fairly like x86.
> 
> The whole TLB broadcast vs explicit IPIs is a fairly big difference in
> my book.

That's true I guess. Maybe arm64 is closer.

> Anyway, have you guys tried the explicit IPI approach? Depending on how
> IPIs are routed vs broadcasts it might save a little bus traffic. No
> point in getting all CPUs to process the TLBI when there's only a
> handful that really need it.

It has been looked at now and again; there are a lot of variables to
weigh. Things are also sized and specced to cover various
hypervisors, OSes, hash and radix, etc. This is something we need to
evaluate a bit better on radix.

> 
> OTOH, I suppose the broadcast thing has been optimized to death on the
> hardware side, so who knows..

There are some advantages to doing it in hardware, though also some to
doing IPIs. The "problem" is actually that Linux is well optimised, so
it can be hard to notice much impact until you get to big systems. At
least I don't know of any problem workloads outside microbenchmarks or
stress tests.

Thanks,
Nick
