linux-kernel - Re: 2.6.37 kernel warning in perf

Message-ID: <20110211004232.GA22440@ghostprotocols.net>
Date:	Thu, 10 Feb 2011 22:42:32 -0200
From:	Arnaldo Carvalho de Melo <acme@...hat.com>
To:	Arun Sharma <arun@...rma-home.net>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-perf-users@...r.kernel.org
Subject: Re: 2.6.37 kernel warning in perf_events code

On Thu, Feb 10, 2011 at 12:46:07PM -0800, Arun Sharma wrote:
> On Thu, Feb 10, 2011 at 12:20 PM, Arnaldo Carvalho de Melo wrote:
> >> >> perf record -g -p <pid> cs -o csw.data -- sleep 3

> > Arun, are you sure the above line is right? I guess it should read:

> > perf record -g -p <pid> -e cs -o csw.data -- sleep 3

> > To specify the context switches soft event, right?
> 
> You caught a cut and paste error. I'm pretty sure I had the -e in
> there when the warning triggered. I tried this command a few times,
> just to verify and here's what I found:

> * Under low loads, everything works fine.
> * Under a heavy workload, I'm not able to reproduce the warning, but
> I'm hitting very similar symptoms:

> [ perf record: Captured and wrote 2.282 MB /tmp/junk.data (~99721 samples) ]
> [ perf record: Captured and wrote 1.734 MB /tmp/junk.data (~75740 samples) ]
> [ perf record: Captured and wrote 0.091 MB /tmp/junk.data (~3975 samples) ]  <--- bad run

> The bad run made my shell unresponsive and took around 30-40 seconds
> to complete (whereas the good runs completed in less than 5 secs).
> Could this be some kind of a feedback loop where the measurement
> machinery is perturbing what's being measured?

Is it possible for you to test this with 2.6.38-rc4? At least the user
level tools, just do:

[acme@...icio linux]$ make help | grep perf
  perf-tar-src-pkg    - Build perf-2.6.38-rc3.tar source tarball
  perf-targz-src-pkg  - Build perf-2.6.38-rc3.tar.gz source tarball
  perf-tarbz2-src-pkg - Build perf-2.6.38-rc3.tar.bz2 source tarball
[acme@...icio linux]$

Pick one of these targets on the source tree for 2.6.38-rc4, move the
tarball to the machine where you need to run the older kernel (.37,
right?) and try building and running it there.
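
As a rough sketch of those steps (not something I ran as written): the
host name "testbox", the /tmp paths and the rc4 tarball name are
assumptions, the last one by analogy with the rc3 names shown above. The
script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
VERSION="2.6.38-rc4"               # assumed: version of the checked-out tree
TARBALL="perf-${VERSION}.tar.gz"   # assumed name, by analogy with the rc3 targets
TARGET_HOST="testbox"              # hypothetical machine running the 2.6.37 kernel

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the build machine, at the top of the 2.6.38-rc4 source tree:
run make perf-targz-src-pkg
run scp "$TARBALL" "$TARGET_HOST:/tmp/"

# On the machine running the older kernel: unpack and build the tools.
run ssh "$TARGET_HOST" "cd /tmp && tar xzf $TARBALL && make -C perf-$VERSION/tools/perf"
```

The resulting perf binary should then be under tools/perf in the unpacked
directory, and the same "perf record -g -p <pid> -e cs ..." command can be
repeated with it.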

Either that, or just build the new tools and run them on the older kernel.

There were several changes to better inform about lost events due to
heavy load that may be at play in your case.

- Arnaldo