Message-ID: <53569ED3.2080206@redhat.com>
Date:	Tue, 22 Apr 2014 12:54:43 -0400
From:	Rik van Riel <riel@...hat.com>
To:	Dave Hansen <dave@...1.net>, x86@...nel.org
CC:	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	akpm@...ux-foundation.org, kirill.shutemov@...ux.intel.com,
	mgorman@...e.de, ak@...ux.intel.com, alex.shi@...aro.org,
	dave.hansen@...ux.intel.com
Subject: Re: [PATCH 2/6] x86: mm: rip out complicated, out-of-date, buggy
 TLB flushing

On 04/21/2014 02:24 PM, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@...ux.intel.com>
> 
> I think the flush_tlb_mm_range() code that tries to tune the
> flush sizes based on the CPU needs to get ripped out for
> several reasons:
> 
> 1. It is obviously buggy.  It uses mm->total_vm to judge the
>    task's footprint in the TLB.  It should certainly be using
>    some measure of RSS, *NOT* ->total_vm since only resident
>    memory can populate the TLB.
> 2. Haswell and several other CPUs are missing from the
>    intel_tlb_flushall_shift_set() function.  Thus, it has been
>    demonstrated to bitrot quickly in practice.
> 3. It is plain wrong in my VM:
> 	[    0.037444] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
> 	[    0.037444] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0
> 	[    0.037444] tlb_flushall_shift: 6
>    Which leads it to never use invlpg.
> 4. The assumptions about TLB refill costs are wrong:
> 	http://lkml.kernel.org/r/1337782555-8088-3-git-send-email-alex.shi@intel.com
>     (more on this in later patches)
> 5. I cannot reproduce the original data: https://lkml.org/lkml/2012/5/17/59
>    I believe the sample times were too short.  Running the
>    benchmark in a loop yields times that vary quite a bit.
> 
> Note that this leaves us with a static ceiling of 1 page.  This
> is a conservative, dumb setting, and will be revised in a later
> patch.
> 
> Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>

Acked-by: Rik van Riel <riel@...hat.com>
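
For readers following the quoted reasoning, here is a minimal userspace
sketch of why point 3 ends up never using invlpg, and of what a static
ceiling of 1 page means. It is a simplified model, not the kernel's code.
The function names, the parameters, and the exact form of the old
comparison (range size against min(total_vm, tlb_entries) shifted right
by tlb_flushall_shift) are assumptions made for illustration, as is the
4KB page size.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4KB base pages. */
#define MODEL_PAGE_SHIFT 12

/*
 * Hypothetical model of the CPU-tuned heuristic being removed.
 * Returns true when the model would flush the whole TLB instead of
 * invalidating the range one page at a time.
 */
static bool old_wants_full_flush(unsigned long total_vm_pages,
                                 unsigned long tlb_entries,
                                 unsigned int flushall_shift,
                                 unsigned long start, unsigned long end)
{
    unsigned long act_entries, nr_pages;

    assert(end >= start);

    /* The footprint estimate is capped by the reported TLB size. */
    act_entries = total_vm_pages > tlb_entries ? tlb_entries
                                               : total_vm_pages;
    nr_pages = (end - start) >> MODEL_PAGE_SHIFT;

    /* With tlb_entries == 0 the threshold is 0, so any range wins. */
    return nr_pages > (act_entries >> flushall_shift);
}

/*
 * Hypothetical model of a static ceiling: ranges larger than
 * ceiling_pages get a full flush, smaller ones go page by page.
 */
static bool ceiling_wants_full_flush(unsigned long ceiling_pages,
                                     unsigned long start,
                                     unsigned long end)
{
    assert(end >= start);
    return ((end - start) >> MODEL_PAGE_SHIFT) > ceiling_pages;
}
```

Under this model, a guest that reports 0 TLB entries with a shift of 6
takes the full-flush path even for a single page, whereas a ceiling of
1 page still lets a one-page range be flushed individually.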


-- 
All rights reversed