Date:	Fri, 18 Feb 2011 17:26:19 +0100
From:	Jiri Olsa <jolsa@...hat.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	"H. Peter Anvin" <hpa@...or.com>, ananth@...ibm.com,
	davem@...emloft.net, linux-kernel@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: [PATCH] kprobes - do not allow optimized kprobes in entry code

On Thu, Feb 17, 2011 at 04:11:03PM +0100, Ingo Molnar wrote:
> 
> * Masami Hiramatsu <masami.hiramatsu.pt@...achi.com> wrote:
> 
> > (2011/02/16 2:05), Jiri Olsa wrote:
> > > You can crash the kernel using kprobe tracer by running:
> > > 
> > > echo "p system_call_after_swapgs" > ./kprobe_events
> > > echo 1 > ./events/kprobes/enable
> > > 
> > > The reason is that at the system_call_after_swapgs label, the kernel
> > > stack is not set up. If optimized kprobes are enabled, the user space
> > > stack is being used in this case (see optimized kprobe template) and
> > > this might result in a crash.
> > > 
> > > There are several places like this throughout the entry code (entry_$BIT).
> > > As there seems to be no reasonable/maintainable way to disable only those
> > > places where the stack is not ready, I switched off kprobe optimization
> > > for the whole entry code.
> > 
> > Agreed, and this could be the best way, because without such a dedicated
> > text section kprobes cannot know where the kernel stack is ready.
> 
> The only worry would be that moving the syscall entry code out of the regular 
> text section fragments the icache layout a tiny bit, possibly hurting performance.
> 
> It's probably not measurable, but we need to measure it:
> 
> Testing could be done on a syscall-heavy but also cache-intense workload, like 
> 'hackbench 10', via 'perf stat --repeat 30', with a very close look at 
> instruction cache eviction differences.
> 
> Perhaps also explicitly enable and measure some of these:
> 
>   L1-icache-loads                            [Hardware cache event]
>   L1-icache-load-misses                      [Hardware cache event]
>   L1-icache-prefetches                       [Hardware cache event]
>   L1-icache-prefetch-misses                  [Hardware cache event]
> 
>   iTLB-loads                                 [Hardware cache event]
>   iTLB-load-misses                           [Hardware cache event]
> 
> to see whether there's any statistically significant difference in icache/iTLB 
> evictions, with and without the patch.
> 
> If such stats are included in the changelog (even if just to show that any change 
> is within measurement accuracy), it would make it easier to apply this change.
> 
> Thanks,
> 
> 	Ingo
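
For anyone skimming the thread: the fix being discussed amounts to refusing
kprobe optimization for any probe address that falls inside the entry-code
range. A minimal user-space sketch of such a range check (the boundary
markers below are illustrative stand-ins, not the kernel's actual symbols
or code):

#include <stdio.h>

/* Stand-ins for linker-provided boundaries of a dedicated entry-code
 * section; purely illustrative. */
static char entry_text[4096];
#define ENTRY_TEXT_START ((unsigned long)entry_text)
#define ENTRY_TEXT_END   ((unsigned long)(entry_text + sizeof(entry_text)))

/* Refuse optimization inside the entry range, where the kernel stack may
 * not be set up yet; allow it elsewhere. */
static int can_optimize(unsigned long paddr)
{
	if (paddr >= ENTRY_TEXT_START && paddr < ENTRY_TEXT_END)
		return 0;
	return 1;
}

int main(void)
{
	printf("inside entry range:  %d\n", can_optimize(ENTRY_TEXT_START + 16));
	printf("outside entry range: %d\n", can_optimize(ENTRY_TEXT_END + 64));
	return 0;
}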


hi,

I have some results, but need help with interpretation.. ;)

I ran the following command (with --repeat 100 and 500):

perf stat --repeat 100 \
    -e L1-icache-loads -e L1-icache-load-misses \
    -e L1-icache-prefetches -e L1-icache-prefetch-misses \
    -e iTLB-loads -e iTLB-load-misses \
    ./hackbench/hackbench 10

I can tell just the obvious:
- the icache load count is higher for the patched kernel,
  but the icache miss count is lower
- the patched kernel also has a lower prefetch count,
  while the remaining counts are higher for the patched kernel

there's still some variability in the counter values each time I run perf

please let me know what you think, I can run other tests if needed
(a rough significance check on the elapsed times is sketched after the numbers below)

thanks,
jirka


--------------------------------------------------------------------------
the results for the current tip tree are:

 Performance counter stats for './hackbench/hackbench 10' (100 runs):
             
          815008015  L1-icache-loads            ( +-   0.316% )  (scaled from 81.00%)
           26267361  L1-icache-load-misses      ( +-   0.210% )  (scaled from 81.00%)
             204143  L1-icache-prefetches       ( +-   1.291% )  (scaled from 81.01%)
      <not counted>  L1-icache-prefetch-misses
          814902708  iTLB-loads                 ( +-   0.315% )  (scaled from 80.99%)
              82082  iTLB-load-misses           ( +-   0.931% )  (scaled from 80.98%)

        0.205850655  seconds time elapsed   ( +-   0.333% )


 Performance counter stats for './hackbench/hackbench 10' (500 runs):

          817646684  L1-icache-loads            ( +-   0.150% )  (scaled from 80.99%)
           26282174  L1-icache-load-misses      ( +-   0.099% )  (scaled from 81.00%)
             211864  L1-icache-prefetches       ( +-   0.616% )  (scaled from 80.99%)
      <not counted>  L1-icache-prefetch-misses
          817646737  iTLB-loads                 ( +-   0.151% )  (scaled from 80.98%)
              82368  iTLB-load-misses           ( +-   0.451% )  (scaled from 80.98%)

        0.206651959  seconds time elapsed   ( +-   0.152% )



--------------------------------------------------------------------------
the results for the tip tree with the patch applied are:


 Performance counter stats for './hackbench/hackbench 10' (100 runs):

          959206624  L1-icache-loads            ( +-   0.320% )  (scaled from 80.98%)
           24322357  L1-icache-load-misses      ( +-   0.334% )  (scaled from 80.93%)
             177970  L1-icache-prefetches       ( +-   1.240% )  (scaled from 80.97%)
      <not counted>  L1-icache-prefetch-misses
          959349089  iTLB-loads                 ( +-   0.320% )  (scaled from 80.93%)
              85535  iTLB-load-misses           ( +-   1.329% )  (scaled from 80.92%)
          
        0.209696972  seconds time elapsed   ( +-   0.352% )
             
      
 Performance counter stats for './hackbench/hackbench 10' (500 runs):

          960162049  L1-icache-loads            ( +-   0.114% )  (scaled from 80.95%)
           24237651  L1-icache-load-misses      ( +-   0.117% )  (scaled from 80.96%)
             179800  L1-icache-prefetches       ( +-   0.530% )  (scaled from 80.95%)
      <not counted>  L1-icache-prefetch-misses
          960352725  iTLB-loads                 ( +-   0.114% )  (scaled from 80.93%)
              84410  iTLB-load-misses           ( +-   0.491% )  (scaled from 80.92%)

        0.210509948  seconds time elapsed   ( +-   0.140% )
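
--------------------------------------------------------------------------
One way to read the elapsed-time numbers above: if perf's "+-" percentages are
taken as relative standard errors of the mean (an assumption about how perf
computes them), a two-sample z-score tells whether the base/patched difference
stands out from the run-to-run noise. A small stand-alone C sketch using the
500-run elapsed times (compile with e.g. 'cc z.c -lm'):

#include <math.h>
#include <stdio.h>

int main(void)
{
	/* 500-run elapsed times from the tables above; the "+-" column is
	 * assumed to be the relative standard error of the mean. */
	double base_mean  = 0.206651959, base_rse  = 0.00152;
	double patch_mean = 0.210509948, patch_rse = 0.00140;

	double se_base  = base_mean  * base_rse;
	double se_patch = patch_mean * patch_rse;
	double diff     = patch_mean - base_mean;
	double se_diff  = sqrt(se_base * se_base + se_patch * se_patch);

	/* |z| well above ~3 would suggest the slowdown is not measurement
	 * noise, under the standard-error assumption above. */
	printf("slowdown: %.2f%%\n", 100.0 * diff / base_mean);
	printf("z-score:  %.1f\n", diff / se_diff);
	return 0;
}

The same arithmetic applies to the icache/iTLB counters, keeping in mind that
those events were multiplexed (the "scaled from ~81%" column).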
