lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5359772A.8070108@sr71.net>
Date:	Thu, 24 Apr 2014 13:42:18 -0700
From:	Dave Hansen <dave@...1.net>
To:	Mel Gorman <mgorman@...e.de>
CC:	x86@...nel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	akpm@...ux-foundation.org, kirill.shutemov@...ux.intel.com,
	ak@...ux.intel.com, riel@...hat.com, alex.shi@...aro.org,
	dave.hansen@...ux.intel.com
Subject: Re: [PATCH 4/6] x86: mm: trace tlb flushes

On 04/24/2014 03:14 AM, Mel Gorman wrote:
> On Mon, Apr 21, 2014 at 11:24:25AM -0700, Dave Hansen wrote:
>> @@ -105,9 +108,10 @@ static void flush_tlb_func(void *info)
>>  
>>  	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
>>  	if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK) {
>> -		if (f->flush_end == TLB_FLUSH_ALL)
>> +		if (f->flush_end == TLB_FLUSH_ALL) {
>>  			local_flush_tlb();
>> -		else if (!f->flush_end)
>> +			trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, TLB_FLUSH_ALL);
>> +		} else if (!f->flush_end)
>>  			__flush_tlb_single(f->flush_start);
>>  		else {
>>  			unsigned long addr;
> 
> Why is only the TLB_FLUSH_ALL case traced here and not the single flush
> or range of flushes? __native_flush_tlb_single() doesn't have a trace
> point so I worry we are missing visibility on this part in particular
> this part.
> 
>                         while (addr < f->flush_end) {
>                                 __flush_tlb_single(addr);
>                                 addr += PAGE_SIZE;
>                         }

You're right, I missed that bit.  I've corrected in a later version of
the patch.

>> @@ -152,7 +156,9 @@ void flush_tlb_current_task(void)
>>  	preempt_disable();
>>  
>>  	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
>> +	trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
>>  	local_flush_tlb();
>> +	trace_tlb_flush(TLB_LOCAL_SHOOTDOWN_DONE, TLB_FLUSH_ALL);
>>  	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
>>  		flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
>>  	preempt_enable();
> 
> Are the two tracepoints really useful? Are they fine enough to measure
> the cost of the TLB flush? It misses the refill obviously but not much
> we can do there.

It's fine enough, but I did realize over time that the cost of the
tracepoint is about 3x the cost of a 1-page tlb flush itself, so these
are unusable for detailed measurements.  I'll remove it for now.

>>  #endif /* _LINUX_MM_TYPES_H */
>> diff -puN /dev/null include/trace/events/tlb.h
>> --- /dev/null	2014-04-10 11:28:14.066815724 -0700
>> +++ b/include/trace/events/tlb.h	2014-04-21 11:10:35.529868198 -0700
>> @@ -0,0 +1,37 @@
>> +#undef TRACE_SYSTEM
>> +#define TRACE_SYSTEM tlb
>> +
>> +#if !defined(_TRACE_TLB_H) || defined(TRACE_HEADER_MULTI_READ)
>> +#define _TRACE_TLB_H
>> +
>> +#include <linux/mm_types.h>
>> +#include <linux/tracepoint.h>
>> +
>> +extern const char * const tlb_flush_reason_desc[];
>> +
>> +TRACE_EVENT(tlb_flush,
>> +
>> +	TP_PROTO(int reason, unsigned long pages),
>> +	TP_ARGS(reason, pages),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	  int, reason)
>> +		__field(unsigned long,  pages)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->reason = reason;
>> +		__entry->pages  = pages;
>> +	),
>> +
>> +	TP_printk("pages: %ld reason: %d (%s)",
>> +		__entry->pages,
>> +		__entry->reason,
>> +		tlb_flush_reason_desc[__entry->reason])
>> +);
>> +
> 
> I would also suggest you match the output formatting with writeback.h
> which would look like
> 
> pages:%lu reason:%s
> 
> The raw format should still have the integer while the string formatting
> would have something human readable.

I can do that.  The only bummer with the human-readable strings is
turning them back in to something that the filters can take.  I think
I'll just do:

+       TP_printk("pages:%ld reason:%s (%d)",
+               __entry->pages,
+               __print_symbolic(__entry->reason, TLB_FLUSH_REASON),
+               __entry->reason)
+);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ