Message-ID: <4FA11CC7.5040302@intel.com>
Date:	Wed, 02 May 2012 19:38:47 +0800
From:	Alex Shi <alex.shi@...el.com>
To:	Borislav Petkov <bp@...64.org>
CC:	andi.kleen@...el.com, tim.c.chen@...ux.intel.com, jeremy@...p.org,
	chrisw@...s-sol.org, akataria@...are.com, tglx@...utronix.de,
	mingo@...hat.com, hpa@...or.com, rostedt@...dmis.org,
	fweisbec@...il.com, riel@...hat.com, luto@....edu, avi@...hat.com,
	len.brown@...el.com, paul.gortmaker@...driver.com,
	dhowells@...hat.com, fenghua.yu@...el.com, yinghai@...nel.org,
	cpw@....com, steiner@....com, linux-kernel@...r.kernel.org,
	yongjie.ren@...el.com
Subject: Re: [PATCH 2/3] x86/flush_tlb: try flush_tlb_single one by one in
 flush_tlb_range

On 05/02/2012 05:38 PM, Borislav Petkov wrote:

> On Wed, May 02, 2012 at 05:24:09PM +0800, Alex Shi wrote:
>> For some scenarios, the above equation can be modified as:
>> (512 - X) * 100ns(assumed TLB refill cost) = X * 140ns(assumed invlpg cost)
>>
>> When the thread count is less than the CPU count, the balance point
>> can rise to about 1/2 of the TLB entries.
>>
>> When the thread count equals the CPU count (with HT), the balance
>> point is 1/16 of the TLB entries on our SNB EP machine and 1/32 on
>> the NHM EP machine. So FLUSHALL_BAR needs to change to 32.
> 
> Are you saying you want to have this setting per family?


Setting it per CPU type would be more precise, but it looks ugly, and I
am not sure it is worth doing. Maybe a conservative choice is acceptable?

>

> Also, have you run your patches with other benchmarks beside your
> microbenchmark, say kernbench, SPEC<something>, i.e. some other
> multithreaded benchmark touching shared memory? Are you seeing any
> improvement there?


I tested OLTP reads and SPECjbb2005 with OpenJDK. They should not call
flush_tlb_range much, so there was no clear improvement.
Do you know of benchmarks that trigger enough flush_tlb_range calls?

> 
>> When the thread count is bigger than the CPU count, context switches
>> eat all the improvement; memory access latency is the same as on an
>> unpatched kernel.
> 
> Also, how do you know in the kernel that the thread number is the number
> of all threads touching this shared mmapped region - there could be
> unrelated threads doing something else.


I believe we don't need to know this; a much larger thread count just
weakens and hides the improvement. When the thread count goes down, the
performance gain appears again. So we don't need to care about this case.

Any more comments for this patchset?

> 
> Thanks.
> 


