Message-ID: <CALCETrW7Syc6bZTptj2umGugu9CZ56wZkGF4abEwhpBYQgAOqw@mail.gmail.com>
Date:	Tue, 28 Apr 2015 16:49:28 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Rik van Riel <riel@...hat.com>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Dave Hansen <dave.hansen@...el.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>, X86 ML <x86@...nel.org>
Subject: Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)

On Tue, Apr 28, 2015 at 4:38 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Tue, Apr 28, 2015 at 4:23 PM, Andy Lutomirski <luto@...capital.net> wrote:
>>
>> I think we can do it without that by keeping the mapping in reverse as
>> I sort of outlined -- for each cpu, store a mapping from mm to pcid.
>> When things fall out of the list, no big deal.
>
> So if you do it by just having a per-cpu array of, say, 64 entries,
> you end up having to search that array every time you do a task
> switch to find the asid for the mm. And even then you've limited
> yourself to just six bits, because doing the same for the full
> 12-bit asid space would not be practical.
>
> It's actually much simpler if you just do it the other way.

I'm unconvinced.  I doubt that trying to keep more than 4-8 PCIDs
alive in a cpu's TLB is ever a win.  After all, the TLB isn't that
big, and if we're only the 7th most recent mm to have been loaded on
a cpu, it's unlikely that any of our TLB entries are still there.

Given that, even if we need 16 bytes of generation counter and such
per entry in the per-cpu array, that's at most 128 bytes for 8
entries.  In practice, we really ought to be able to get it down to
closer to 8 bytes per entry with some care, or we could use only 4
PCIDs, at which point the whole per-cpu structure fits in a single
cache line.  We can search it with 4-8 branches and no additional L1
misses.
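
Concretely, something like this untested userspace sketch is what I
have in mind.  Every name in it (pcid_slot, choose_pcid,
NR_CACHED_PCIDS, the tlb_gen field) is made up for illustration, not
an actual kernel identifier:

#include <stdbool.h>
#include <stdint.h>

#define NR_CACHED_PCIDS 8	/* keep only 4-8 PCIDs live per cpu */

struct mm_struct;		/* opaque for this sketch */

/* 16 bytes per slot; 8 slots is 128 bytes (4 slots is one cache line). */
struct pcid_slot {
	struct mm_struct *mm;	/* owner of this PCID, NULL if free */
	uint64_t tlb_gen;	/* mm generation when this PCID was valid */
};

static struct pcid_slot this_cpu_pcids[NR_CACHED_PCIDS];
static unsigned int this_cpu_victim;

/*
 * On context switch: a linear search, at most NR_CACHED_PCIDS branches
 * and no extra L1 misses.  Returns the PCID to use for this mm and
 * sets *need_flush if its stale TLB entries must be blown away.
 */
static unsigned int choose_pcid(struct mm_struct *mm, uint64_t mm_gen,
				bool *need_flush)
{
	unsigned int i;

	for (i = 0; i < NR_CACHED_PCIDS; i++) {
		if (this_cpu_pcids[i].mm == mm) {
			/* A generation mismatch means someone flushed us. */
			*need_flush = (this_cpu_pcids[i].tlb_gen != mm_gen);
			this_cpu_pcids[i].tlb_gen = mm_gen;
			return i;
		}
	}

	/* Miss: recycle a slot round-robin and flush its old entries. */
	i = this_cpu_victim++ % NR_CACHED_PCIDS;
	this_cpu_pcids[i].mm = mm;
	this_cpu_pcids[i].tlb_gen = mm_gen;
	*need_flush = true;
	return i;
}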

Sure, searching 64 entries this way would be expensive, but I think 64
entries is excessive anyway.

Also, this approach keeps the cost of blowing away stale PCIDs down to
a single write when we need to invalidate a TLB entry on an inactive
PCID, as opposed to digging through the per-mm array to poke at the
state for each cpu the mm might be cached on.  But maybe I missed some
trick that avoids needing to do that.
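
The single write would just be a generation bump on the mm side,
continuing the same made-up sketch (flush_gen is hypothetical, not a
field in the current code):

#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-mm context; flush_gen pairs with tlb_gen above. */
struct mm_ctx {
	_Atomic uint64_t flush_gen;
};

/*
 * Invalidate this mm's stale PCIDs everywhere with one write: every
 * cpu still caching the mm sees the generation mismatch in its
 * choose_pcid() lookup the next time it switches the mm back in, and
 * flushes then.  No walk over a per-mm array of cpus is needed.
 */
static void invalidate_inactive_pcids(struct mm_ctx *ctx)
{
	atomic_fetch_add(&ctx->flush_gen, 1);
}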

--Andy