Date:	Thu, 24 Mar 2011 21:00:10 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Jack Steiner <steiner@....com>
Cc:	Jan Beulich <JBeulich@...ell.com>, Borislav Petkov <bp@...64.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...nel.dk>,
	"x86@...nel.org" <x86@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>, tee@....com,
	Nikanth Karthikesan <knikanth@...e.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit_lock
 if possible


* Jack Steiner <steiner@....com> wrote:

> > 
> > This cacheline bouncing was actually observed and measured
> > on SGI UV systems, but I'm not certain we're permitted to publish
> > that data. I'm copying the two SGI guys who had reported that
> > issue (and the special case fix, which Nikanth simply generalized)
> > to us, for them to decide.
> 
> We frequently run into the cacheline bouncing issues. I don't have
> the data handy that you refer to, but feel free to publish it.

One good way to see cache bounces is to run a misses/accesses ratio profile:

   perf top -e cache-misses -e cache-references --count-filter 10

Note the two events: this runs a 'weighted' profile, which shows a function's 
(LLC) cache-misses relative to the cache-references it makes; in essence, a 
misses/references ratio.

The --count-filter option filters out rare entries, so that rarely sampled 
functions that happen to produce a large ratio do not clutter the output.

For example, during a scheduler-intense workload you'll get something like:

   PerfTop:   32652 irqs/sec  kernel:71.2%  exact:  0.0% [cache-misses/cache-references],  (all, 16 CPUs)
-------------------------------------------------------------------------------------------------------

   weight    samples  pcnt function                     DSO
   ______    _______ _____ ____________________________ ____________________

      1.9        606  3.2% irqtime_account_process_tick [kernel.kallsyms]   
      1.6        854  4.4% update_vsyscall              [kernel.kallsyms]   
      1.5        446  2.3% atomic_cmpxchg               [kernel.kallsyms]   
      1.5        758  3.9% tick_do_update_jiffies64     [kernel.kallsyms]   
      1.4        149  0.8% arch_local_irq_save          [kernel.kallsyms]   
      1.3       1524  7.9% do_timer                     [kernel.kallsyms]   
      1.2        215  1.1% clear_page_c                 [kernel.kallsyms]   
      1.2        128  0.7% dso__find_symbol             /home/mingo/bin/perf
      1.0        281  1.5% calc_global_load             [kernel.kallsyms]   
      0.9        560  2.9% profile_tick                 [kernel.kallsyms]   
      0.7        246  1.3% _raw_spin_lock               [kernel.kallsyms]   
      0.6       2523 13.1% current_kernel_time          [kernel.kallsyms]   

This output is very different from a plain cycles (or even cache-misses) 
measured profile and is very good at identifying 'bouncy' cache-miss sources. 
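
As a rough illustration of the weighting, here is a minimal Python sketch of 
how such a misses/references profile could be computed from per-function 
sample counts. It is not perf's actual code: the function names and counts 
are made up, and the formula (misses divided by references plus one, with the 
count filter applied to the miss samples) is an assumption chosen to be 
consistent with the weight and samples columns shown in this mail.

```python
# Hypothetical per-function sample counts: (misses, references).
# These numbers are invented for illustration; they are not measured data.
samples = {
    "hot_bouncy_fn": (600, 299),
    "busy_cached_fn": (2500, 4999),
    "rare_fn": (4, 0),
}


def weighted_profile(samples, count_filter=10):
    """Return (weight, misses, name) rows, highest miss ratio first.

    Assumption: weight = misses / (references + 1). The +1 only guards
    against division by zero when no reference was sampled.
    """
    rows = []
    for name, (misses, refs) in samples.items():
        if misses < count_filter:
            continue  # drop rare entries, in the spirit of --count-filter
        rows.append((misses / (refs + 1), misses, name))
    return sorted(rows, reverse=True)


for weight, misses, name in weighted_profile(samples):
    print(f"{weight:8.1f} {misses:10d}  {name}")
```

In this sketch the function with fewer samples but a high ratio sorts above 
the one with many samples and a low ratio, which is why such a view differs 
from a plain sample-count profile.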

Another good 'view' is store-misses against store-references (here 
L1-dcache-store-misses relative to L1-dcache-stores):

   PerfTop:   29530 irqs/sec  kernel:99.5%  exact:  0.0% [L1-dcache-store-misses/L1-dcache-stores],  (all, 16 CPUs)
-------------------------------------------------------------------------------------------------------

   weight    samples  pcnt function                 DSO
   ______    _______ _____ ________________________ __________________________________

   1271.3       3814  3.2% apic_timer_interrupt     [kernel.kallsyms]                 
    844.0        844  0.7% read_tsc                 [kernel.kallsyms]                 
    615.0        615  0.5% timekeeping_get_ns       [kernel.kallsyms]                 
    520.0        520  0.4% intel_pmu_disable_all    [kernel.kallsyms]                 
    390.0        390  0.3% tick_dev_program_event   [kernel.kallsyms]                 
    308.3       1850  1.5% update_vsyscall          [kernel.kallsyms]                 
    251.7        755  0.6% hrtimer_interrupt        [kernel.kallsyms]                 
    246.0        246  0.2% find_busiest_group       [kernel.kallsyms]                 
    222.7        668  0.6% native_apic_mem_write    [kernel.kallsyms]                 
    149.0        298  0.2% apic_write               [kernel.kallsyms]                 
    137.0        274  0.2% irq_enter                [kernel.kallsyms]                 
    105.0        105  0.1% arch_local_irq_save      [kernel.kallsyms]                 
    101.0        101  0.1% tick_program_event       [kernel.kallsyms]                 
     95.5        191  0.2% ack_APIC_irq             [kernel.kallsyms]           

You might want to experiment with the events to see which one expresses 
things best for you on the system in question.

Thanks,

	Ingo