Date:	Tue, 14 Jul 2015 12:40:30 +0100
From:	Will Deacon <will.deacon@....com>
To:	Catalin Marinas <catalin.marinas@....com>
Cc:	David Daney <ddaney@...iumnetworks.com>,
	David Daney <david.daney@...ium.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Robert Richter <rrichter@...ium.com>,
	David Daney <ddaney.cavm@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH 3/3] arm64, mm: Use IPIs for TLB invalidation.

On Tue, Jul 14, 2015 at 12:13:42PM +0100, Catalin Marinas wrote:
> BTW, if we do the TLBI deferring to the ASID roll-over event, your
> flush_context() patch to use local TLBI would no longer work. It is
> called from __new_context() when allocating a new ASID, so it needs to
> be broadcast to all the CPUs.

What we can do instead is:

 - Keep track of the CPUs on which an mm has been active
 - Do a local TLBI if only the current CPU is in the list
 - Move to the same ASID allocation algorithm as arch/arm/
 - Change the ASID re-use policy so that we only mark an ASID as free
   if we succeeded in performing a local TLBI, postponing anything else
   until rollover

That should handle the fork() + exec() case nicely, I reckon. I tried
something similar in the past for arch/arm/, but it didn't make a difference
on any of the platforms I have access to (where TLBI traffic was cheap).
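
For illustration only, here is a minimal user-space sketch of the policy
listed above: track the CPUs an mm has run on, do a local TLBI when only the
current CPU is in that set, and free the ASID only in that case, leaving
everything else until rollover. This is a toy model and not kernel code:
every name in it (toy_mm, toy_release_asid, toy_rollover, the 64-CPU bitmask
and the 256-entry ASID table) is a hypothetical stand-in, the TLBI is just a
counter, and the move to the arch/arm/ allocation algorithm is not modelled.

```c
/* Toy model of the proposed ASID re-use policy. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS  64   /* assumption: CPU set fits in one 64-bit mask */
#define TOY_NR_ASIDS 256  /* assumption: 8-bit ASID space */

struct toy_mm {
	uint64_t active_cpus;  /* bit n set: mm has run on CPU n with this ASID */
	unsigned int asid;
};

static bool toy_asid_in_use[TOY_NR_ASIDS];
static unsigned int toy_local_tlbi_count;

/* Record that the mm has been active on the given CPU. */
static void toy_mm_mark_active(struct toy_mm *mm, unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	mm->active_cpus |= UINT64_C(1) << cpu;
}

/* Stand-in for a non-broadcast, per-ASID TLB invalidation. */
static void toy_local_tlbi_asid(unsigned int asid)
{
	(void)asid;
	toy_local_tlbi_count++;
}

/*
 * Try to release the mm's ASID from this_cpu.
 * Returns true if the ASID was freed after a local TLBI, false if the mm
 * has run on some other CPU and the ASID must stay reserved until rollover.
 * An mm that never ran anywhere also takes the local path here.
 */
static bool toy_release_asid(struct toy_mm *mm, unsigned int this_cpu)
{
	uint64_t self;

	assert(this_cpu < TOY_NR_CPUS);
	assert(mm->asid < TOY_NR_ASIDS);
	self = UINT64_C(1) << this_cpu;

	if ((mm->active_cpus & ~self) != 0)
		return false;  /* stale entries may exist on other CPUs */

	toy_local_tlbi_asid(mm->asid);
	toy_asid_in_use[mm->asid] = false;
	mm->active_cpus = 0;
	return true;
}

/* Rollover: stands in for a broadcast invalidate, after which all ASIDs are free. */
static void toy_rollover(void)
{
	for (unsigned int i = 0; i < TOY_NR_ASIDS; i++)
		toy_asid_in_use[i] = false;
}
```

In this model a short-lived single-CPU mm hands its ASID back with one local
invalidation, while an mm that has migrated keeps its ASID marked as in use
until toy_rollover() runs.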

It would *really* help if I had some Thunder-X hardware...

> That's the munmap case usually. In our tests, we haven't seen large
> ranges, mostly 1-2 4KB pages (especially with kernbench when median file
> size fits in 4KB). Maybe the new batching code for x86 could help ARM as
> well if we implement it. We would still issue TLBIs but it allows us to
> issue a single DSB at the end.

Again, I looked at this in the past but it turns out that the DSB ISHST
needed to publish PTEs tends to sync TLBIs on most cores (even though
it's not an architectural requirement), so postponing the full DSB to
the end didn't help on existing microarchitectures.

Finally, it might be worth dusting off the leaf-only TLBI stuff you
looked at in the past. It doesn't reduce the message traffic, but I can't
see it making things worse.

Will