Date:	Fri, 03 Jul 2009 18:19:37 +0530
From:	Jaswinder Singh Rajput <jaswinder@...nel.org>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Arjan van de Ven <arjan@...radead.org>,
	Paul Mackerras <paulus@...ba.org>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Anton Blanchard <anton@...ba.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	x86 maintainers <x86@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>
Subject: Re: [PATCH 1/2 -tip] perf_counter: Add generalized hardware
 vectored co-processor support for AMD and Intel Corei7/Nehalem

On Fri, 2009-07-03 at 17:25 +0530, Jaswinder Singh Rajput wrote:
> On Fri, 2009-07-03 at 12:29 +0200, Ingo Molnar wrote:
> > * Jaswinder Singh Rajput <jaswinder@...nel.org> wrote:
> > 
> > >  Performance counter stats for '/usr/bin/rhythmbox /home/jaswinder/Music/singhiskinng.mp3':
> > > 
> > >        17552264  vec-adds                  (scaled from 66.28%)
> > >        19715258  vec-muls                  (scaled from 66.63%)
> > >        15862733  vec-divs                  (scaled from 66.82%)
> > >     23735187095  vec-idle-cycles           (scaled from 66.89%)
> > >        11353159  vec-stall-cycles          (scaled from 66.90%)
> > >        36628571  vec-ops                   (scaled from 66.48%)
> > 
> > Is stall-cycles equivalent to busy-cycles? 
> 
> 
> hmm, normally we can use these terms interchangeably. But they can be
> different some times.
> 
> busy means it is already executing some instructions so it will not take
> another instruction.
> 
> stall can be busy(executing) or non-executing may be it is waiting for
> some operands due to cache miss.
> 
> 
> > I.e. do we have this 
> > general relationship to the cycle event:
> > 
> > 	cycles = vec-stall-cycles + vec-idle-cycles
> > 
> > ?

For example, on AMD:

    13390918485  vec-adds                  (scaled from 57.07%)
    22465091289  vec-muls                  (scaled from 57.22%)
     2643789384  vec-divs                  (scaled from 57.21%)
    17922784596  vec-idle-cycles           (scaled from 57.23%)
     6402888606  vec-stall-cycles          (scaled from 57.17%)
    55823491597  cycles                    (scaled from 57.05%)
    51035264218  vec-ops                   (scaled from 57.05%)

  187.494664172  seconds time elapsed

vec-idle-cycles + vec-stall-cycles = 24325673202

so cycles = 2.29 * (vec-idle-cycles + vec-stall-cycles), i.e. the
relationship cycles = vec-stall-cycles + vec-idle-cycles does not hold
for this run.
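
The arithmetic can be re-checked with a few lines of Python. This is only
a sketch: the three values are copied from the AMD run above, and the
variable names are mine, not anything from the perf tool. Keep in mind
that each count was scaled from roughly 57% of the run, so the numbers
are estimates.

```python
# Counter values copied from the AMD run above (already scaled by perf).
vec_idle_cycles = 17922784596
vec_stall_cycles = 6402888606
cycles = 55823491597

# Cycles covered by the two vector events taken together.
accounted = vec_idle_cycles + vec_stall_cycles

# How many times larger the total cycle count is than that sum.
ratio = cycles / accounted

# Cycles that neither vec-idle-cycles nor vec-stall-cycles covers.
unaccounted = cycles - accounted

print(f"idle + stall            = {accounted}")
print(f"cycles / (idle + stall) = {ratio:.2f}")
print(f"cycles not covered      = {unaccounted}")
```

If the two events partitioned all cycles, the ratio would be close to 1.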

On AMD I used EventSelect 0D7h, "Dispatch Stall for FPU Full", which the
AMD documentation describes as:

  The number of processor cycles the decoder is stalled because the
  scheduler for the Floating Point Unit is full. This condition can be
  caused by a lack of parallelism in FP-intensive code, or by cache misses
  on FP operand loads (which could also show up as EventSelect 0D8h
  instead, depending on the nature of the instruction sequences). May
  occur simultaneously with certain other stall conditions; see
  EventSelect 0D1h.

So the stall is due to a lack of parallelism and to cache misses.
If we keep increasing the size of the FP units and the cache, at some
point we may get vec-stall-cycles = zero.

Thanks,
--
JSR

http://userweb.kernel.org/~jaswinder/
